📜 ⬆️ ⬇️

Using encodings in MySQL> = 4.1

When I started to learn InnoDB and transactions in MySQL (it was necessary to update the version from 3.23 to 4.1) I encountered the problem of incorrect data exchange between PHP and MySQL, which was manifested in the fact that the server instead of Cyrillic characters in the requests generated by the php script inserted into the cells DB tables question marks. In the process of “smoking out” documentation, reading forums and studying articles, an understanding of the problem came and a way to solve it was found.

The root cause of the problem turned out that before version 4.1 the encoding could only be set for the entire server (by defining the value of the parameter - --default-charset ), and starting from version 4.1 the developers added the ability to define encoding at different levels of the DBMS hierarchy (for the entire server, database, tables, columns).

A bit of terminology

CHARACTER SET is a set of characters called encoding. Different CHARACTER SET include different character sets. Different CHARACTER SET can include approximately the same character sets but in a different order (see for example koi8ru and cp1251). MySQL needs to know which CHARACTER SET will be used for the data in the table in order to properly sort and index the data.
')
COLLATION is the way to organize and compare data in the database.
For the same CHARACTER SET there are usually several COLLATION. For example: cp1251_general_ci is a case-insensitive comparison, cp1251_bin is case-sensitive.

In order to avoid problems with encodings, you need to have an idea of ​​how they can be specified for different levels of the DBMS hierarchy.

Ways to set encodings

1) For the entire server at compilation, defining the parameters --with-charset and --with-collation :
  ./configure --with-charset = cp1251 --with-collation = cp1251_general_ci 

2) For the entire server when you start mysqld, defining the --character-set-server and --collation-server parameters:
  mysqld --character-set-server = cp1251 --collation-server = cp1251_bin 

3) When creating a database:
  CREATE DATABASE dbname DEFAULT CHARACTER SET cp1251 COLLATE cp1251_bin; 

4) When creating tables:
  CREATE TABLE tblname (col INT) DEFAULT CHARACTER SET cp1251 COLLATE cp1251_bin; 

5) In the column definitions:
  CREATE TABLE tblname (<br> column1 varchar (255), <br> column2 varchar (255) CHARACTER SET cp1251 COLLATE cp1251_general_ci <br>) DEFAULT CHARACTER SET cp1251 COLLATE cp1251_bin; 

In order to see to which encoding the data structure is attached, you can use the SHOW CREATE operator:
  mysql> SHOW CREATE TABLE tree_nodes;
 |  tree_nodes |  CREATE TABLE `tree_nodes` (<br> ... <br>) ENGINE = InnoDB <b> DEFAULT CHARSET = cp1251 COLLATE = cp1251_bin </ b> | 

Work with client programs

Any mysql client, when connected to a server, can set several variables:
For the client to work correctly with the server, at least character_set_client, character_set_connection, character_set_results should be set using the SET operator:
  mysql_query ("SET character_set_client = 'cp1251'"); <br> mysql_query ("SET character_set_connection = 'cp1251'"); <br> mysql_query ("SET character_set_results = 'cp1251'"); 
If the query and the data in the database are in the same encoding, and no transcoding of the result is required, then instead of setting character_set_client, character_set_connection, character_set_results, it is enough to execute:
  mysql_query ("SET NAMES 'cp1251'"); 
To view the default values ​​of these variables, you can use the SHOW VARIABLES operator:
  SHOW VARIABLES LIKE 'character_set%'; 

Source: https://habr.com/ru/post/10983/


All Articles