What is HTML validation?
The HTML validator performs several checks on your code. The main ones are:
- Syntax validation - checking for syntax errors. <foo bar="baz"> is syntactically correct even though <foo> is not a valid HTML tag, so syntax checking on its own is minimally useful for writing good HTML.
- Tag nesting check - tags must be closed in the reverse order of their opening. For example, this check catches an incorrectly closed <div>.
- DTD validation - verifying that your code conforms to the specified Document Type Definition. This includes checking tag names, attributes, and tag containment (which tags may appear inside which other tags).
- Check for extraneous elements - this check flags anything that is in the code but absent from the DTD, for example custom tags and attributes.
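As a rough illustration of the tag nesting check (the second item in the list above), here is a minimal sketch built on Python's standard html.parser module. The names NestingChecker and check_nesting are made up for this example, and the logic is deliberately strict: it does not model HTML's optional end tags, so it is not how any real validator works.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they never go on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: tracks open tags on a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags opened and not yet closed
        self.errors = []  # (closing tag seen, tag that was expected)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # <br/> style syntax opens and closes in one step: nothing to track.
        pass

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            expected = self.stack[-1] if self.stack else None
            self.errors.append((tag, expected))


def check_nesting(html):
    """Return (mismatched closing tags, tags left unclosed)."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors, checker.stack
```

Calling check_nesting("<div><span>text</div></span>") reports that </div> arrived while <span> was still open, and that <div> was never properly closed.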
Keep in mind that these are logical checks; how a particular validator implements them does not matter. If even one of the checks fails, the HTML is considered invalid. And therein lies the problem.
Arguments
The main argument for validating HTML is cross-browser compatibility. Each browser has its own parser, and feeding it only what all browsers are known to understand is the only way to be sure that your code works correctly everywhere. Because each browser has its own error-correction mechanism for HTML, you cannot rely on invalid code.
The main argument against validation is that it is too strict and does not correspond to the way browsers actually work. Yes, the HTML may not be valid, but all browsers handle some invalid code the same way. If I am willing to take responsibility for the invalid code I write, then I do not have to worry about validation. The only thing I have to care about is that it works.
My position
This is one of the few times I have publicly stated my position on something. I have always been among the opponents of validation, on the grounds that the validator is too strict to be practical in real-world applications. There are things that most browsers support (<noscript> in <head>, <script> after </html>) that are invalid but sometimes necessary to get the job done.
In general, my biggest problem with validation is check #4 (extraneous elements). I am in favor of using custom attributes on HTML tags to store additional metadata related to a specific element. For example, I add an attribute foo when I have data (bar) that I need to associate with a particular element. Sometimes people overload existing attributes for this purpose just to pass validation, even though the attribute is then being misused. That makes no sense to me.
The secret of browsers is that they never check whether the HTML conforms to the specified DTD. The doctype you specify in the document switches the browser's parser into a particular mode, but the browser does not load the DTD and check the code against it. The browser parser therefore handles HTML with some leniency toward invalid constructs, such as self-closing tags and block elements inside inline elements (I'm sure there are others).
In the case of custom attributes, all browsers parse syntactically correct attributes and treat them as valid. This makes it possible to access such attributes through the DOM using JavaScript. So why should I worry about validity? I will continue to use my attributes, and I am very glad that HTML5 formalizes them.
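The point that a non-validating parser simply passes custom attributes through can be sketched with Python's standard html.parser module. This is only a stand-in for a browser parser, not a model of one, and the names AttributeCollector and collect_attributes are invented for the example.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Hypothetical helper: records every start tag with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; no DTD is consulted,
        # so unknown names such as "foo" arrive like any other attribute.
        self.seen.append((tag, dict(attrs)))


def collect_attributes(html):
    collector = AttributeCollector()
    collector.feed(html)
    collector.close()
    return collector.seen
```

For the fragment <div id="item" foo="bar">x</div>, the collector reports both id and foo on the div, with no distinction between the standard attribute and the custom one.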
The best example of a technology that produces invalid HTML yet makes a big difference is ARIA. ARIA works by adding new attributes to HTML 4. These attributes give HTML elements additional semantic meaning, and the browser can convey these semantics to assistive devices to help people with disabilities. All major browsers now support ARIA markup. However, if you use these attributes, your HTML will be invalid.
As for custom tags, I see nothing wrong with adding new, syntactically correct tags to a page, but I do not see much practical value in it.
To clarify my position: I believe checks #1 and #2 are very important and should always be performed. I also consider check #3 important, but not as much as the first two. Check #4 is very questionable to me because it affects custom attributes. I believe that, at most, custom attributes should be reported as warnings (not errors) in the validation results, so that I can check whether I mistyped an attribute name. Marking custom tags as errors is probably a good idea, but it has its own problems, for example when embedding content in another markup language such as SVG or MathML.
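The policy described above (unknown attributes as warnings, unknown tags as errors) could look roughly like the following sketch. The KNOWN table is a hypothetical, heavily abbreviated stand-in for a DTD, and LenientValidator and validate are names invented for this example, not part of any real validator.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a DTD: tag name -> allowed attribute names.
KNOWN = {
    "div": {"id", "class", "title"},
    "p": {"id", "class", "title"},
    "a": {"id", "class", "title", "href"},
}


class LenientValidator(HTMLParser):
    """Unknown tags are errors; unknown attributes are only warnings."""

    def __init__(self):
        super().__init__()
        self.errors = []
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN:
            self.errors.append(f"unknown tag <{tag}>")
            return
        for name, _value in attrs:
            if name not in KNOWN[tag]:
                # Could be a deliberate custom attribute or a typo.
                self.warnings.append(f"unknown attribute '{name}' on <{tag}>")


def validate(html):
    """Return (errors, warnings) for an HTML fragment."""
    validator = LenientValidator()
    validator.feed(html)
    validator.close()
    return validator.errors, validator.warnings
```

With this split, a deliberate foo attribute and a mistyped hreff both show up as warnings to review, while a made-up <widget> tag is reported as an error.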
Validation for validation's sake?
I believe that validation for validation's sake is extremely silly. Valid HTML only means that all four checks passed without errors. There are several important things that valid HTML does not guarantee:
- valid HTML does not guarantee accessibility;
- valid HTML does not guarantee good UX (user experience);
- valid HTML does not guarantee a functioning website;
- valid HTML does not guarantee the correct display of the site.
Valid HTML can be a reason to be proud of yourself, but it is not in itself an indicator of skill. Your valid code does not necessarily do its job better than my invalid code.
HTML5 Validation
HTML5 validation fixes some of the problems that plagued HTML 4 validation. It explicitly allows custom attributes (they must start with data-). This will allow my code to validate as HTML5. Of course, there are some things in the HTML5 validator that I do not agree with, but I think it matches practical needs much better than the HTML 4 validator does.
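A minimal sketch of the data- convention, again using Python's standard html.parser module as a stand-in parser. The check in is_data_attribute is simplified to the prefix rule mentioned above and does not cover every naming constraint in the HTML5 specification; all function and class names here are invented for the example.

```python
from html.parser import HTMLParser


def is_data_attribute(name):
    # Simplified rule: the "data-" prefix followed by at least one character.
    return name.startswith("data-") and len(name) > len("data-")


class DataAttributeFinder(HTMLParser):
    """Hypothetical helper: collects data-* attributes from start tags."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if is_data_attribute(name):
                self.found.append((tag, name, value))


def find_data_attributes(html):
    """Return (tag, attribute name, value) for each data-* attribute."""
    finder = DataAttributeFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

Under this rule, the earlier foo="bar" example would be written as data-foo="bar" to pass HTML5 validation, and find_data_attributes would pick it up while ignoring the bare foo form.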
Conclusion
I believe that some parts of HTML validation are extremely important and useful, but I do not want to be held hostage by it just because I use custom attributes. I am proud that I use ARIA in my work, and I do not care that it is considered invalid code. Again, of the four validator checks, I have a problem with only one. And the HTML5 validator will relieve me of most of those problems.
I know this is a controversial topic for many, so please refrain from purely emotional statements in the comments.
About the author - Nicholas C. Zakas, an employee of Yahoo, a specialist in UI and JS, the author of the books Professional JavaScript for Web Developers and High Performance JavaScript.