I ran into the following problem: many programmers either do not know that accept-charset exists or simply ignore the attribute. After joining my current company I started developing a REST API service, and from time to time bugs like “XML response is broken for ...” landed on me. I had to dig into the GUI, where I discovered that my favorite attribute was missing. Why do we need yet another attribute, you ask?
What accept-charset is good for has long been described by the W3C at
this link (http://www.w3.org/TR/html401/interact/forms.html#adef-accept-charset)
Now imagine the situation:
- you have a website
- you have declared utf-8 as the encoding in the meta tag
- you have configured the server side to work with utf-8 (database, backend, etc.)
You test it: open the site, submit the form, and everything is fine. The problem is what many people forget:
1. In most cases the browser is set to auto-detect the encoding, so your site posts the data to the server side correctly.
2. Some people set the encoding in their browser manually.
3. Some people like to play around with your site.
4. Others: bots, testing software, etc.
What happens in this case when the attribute in question is missing from the FORM tag:
1. Open your website.
2. Change the encoding in the browser, say to ISO-8859-1.
3. Try to enter data in Russian or, for example, in German with umlauts; if you want to go further, try special characters.
4. Submit your form.
5. Open your record in the database and check which encoding your characters arrived in and how the server side processed them.
Answer: the text will arrive in ISO-8859-1, because the browser follows the standards and a defined order for determining the encoding. If ISO-8859-1 is set explicitly, the browser complies and uses ISO-8859-1 to submit the form data.
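To make the effect concrete, here is a minimal sketch of what a utf-8-only backend sees in both cases. It only imitates the browser with Python's standard urllib.parse; the helper names submit and server_decode and the sample value "Müller" are made up for illustration, and a real browser and server stack may behave differently in the details.

```python
from urllib.parse import quote, unquote


def submit(text, page_encoding):
    """Percent-encode a form value using the encoding the browser was told to use."""
    return quote(text, encoding=page_encoding)


def server_decode(raw):
    """Decode the value the way a utf-8-only backend would.

    Bytes that are not valid UTF-8 are replaced with U+FFFD.
    """
    return unquote(raw, encoding="utf-8", errors="replace")


name = "Müller"  # hypothetical user input with an umlaut

latin1_wire = submit(name, "iso-8859-1")  # one byte for the umlaut
utf8_wire = submit(name, "utf-8")         # two bytes for the umlaut

print(latin1_wire, "->", server_decode(latin1_wire))
print(utf8_wire, "->", server_decode(utf8_wire))
```

The ISO-8859-1 submission carries a single byte for the umlaut that is not valid UTF-8, so the backend ends up with a replacement character instead of the letter, while the utf-8 submission round-trips cleanly.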
How to deal with it?
Look at the title of this post: yes, it is accept-charset="utf-8" in the FORM tag that saves you from this situation. The attribute gives the browser the necessary “knowledge” that the form data must be sent in utf-8 and in no other encoding.
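If you want to find the forms in an existing GUI that lack the attribute, a small check along these lines may help. This is only a sketch built on Python's standard html.parser; the class FormCharsetAudit, the function forms_without_utf8 and the sample page with its /comments and /search actions are hypothetical and not taken from any real project.

```python
from html.parser import HTMLParser


class FormCharsetAudit(HTMLParser):
    """Collect the action of every <form> that does not declare accept-charset="utf-8"."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        attr_map = dict(attrs)  # attribute names arrive lowercased
        charset = (attr_map.get("accept-charset") or "").strip().lower()
        if charset != "utf-8":
            self.missing.append(attr_map.get("action") or "(no action)")


def forms_without_utf8(html):
    """Return the actions of forms that would not force utf-8 on submit."""
    audit = FormCharsetAudit()
    audit.feed(html)
    audit.close()
    return audit.missing


# Hypothetical page: the first form carries the attribute, the second does not.
PAGE = """
<form action="/comments" method="post" accept-charset="utf-8">
  <input name="text">
</form>
<form action="/search" method="get">
  <input name="q">
</form>
"""

print(forms_without_utf8(PAGE))
```

The first form in the sample shows how the attribute is written in the FORM tag; the check reports only the second one.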
Conclusion: everything ingenious is simple, but in our time whoever owns the information owns the world.
PS: Itube still remains a mystery to me: they deliberately do not use accept-charset, and instead use some kind of functionality that does the same thing (it looks like JavaScript).