• 浅谈MySQL字符集


     
    Preface
     
        MySQL use character set & collation to organize the different charater.It provides a flexible way in setting individual character set on a database,a table even on a single column of table.Each character set has a series of collations with one default collation.We can generally see the character set in MySQL as the combination of code page & character encoding.
        In the early version of MySQL(eg. version 5.5) ,latin1 is the default character set which does not support Chinese characters.There're some other commonly used character set such as GBK,UTF-8.MySQL 5.7 chooses UTF-8 as default character set nowadays in order to support more characters of different languages.
        Messy code is a general issue about character set.It always occur in our MySQL databases if we do not use character rules appropriately.Worse,it leads to data loss in some cases what is really a big trouble we should avoid.
     
    Introduce
     
        MySQL provides a lot of parameters to specify character set in various dimentionality.We should know clearly about the principle and function of each parameter to avert underlying messy code issue.Let's see details about it.
     
    Procedure
     
    Check the character set supported by MySQL(version 5.7).
     1 (root@localhost mysql3306.sock)[(none)]>show character set;
     2 +----------+---------------------------------+---------------------+--------+
     3 | Charset  | Description                     | Default collation   | Maxlen |
     4 +----------+---------------------------------+---------------------+--------+
     5 | big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
     6 | dec8     | DEC West European               | dec8_swedish_ci     |      1 |
     7 | cp850    | DOS West European               | cp850_general_ci    |      1 |
     8 | hp8      | HP West European                | hp8_english_ci      |      1 |
     9 | koi8r    | KOI8-R Relcom Russian           | koi8r_general_ci    |      1 |
    10 | latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
    11 | latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
    12 | swe7     | 7bit Swedish                    | swe7_swedish_ci     |      1 |
    13 | ascii    | US ASCII                        | ascii_general_ci    |      1 |
    14 | ujis     | EUC-JP Japanese                 | ujis_japanese_ci    |      3 |
    15 | sjis     | Shift-JIS Japanese              | sjis_japanese_ci    |      2 |
    16 | hebrew   | ISO 8859-8 Hebrew               | hebrew_general_ci   |      1 |
    17 | tis620   | TIS620 Thai                     | tis620_thai_ci      |      1 |
    18 | euckr    | EUC-KR Korean                   | euckr_korean_ci     |      2 |
    19 | koi8u    | KOI8-U Ukrainian                | koi8u_general_ci    |      1 |
    20 | gb2312   | GB2312 Simplified Chinese       | gb2312_chinese_ci   |      2 |
    21 | greek    | ISO 8859-7 Greek                | greek_general_ci    |      1 |
    22 | cp1250   | Windows Central European        | cp1250_general_ci   |      1 |
    23 | gbk      | GBK Simplified Chinese          | gbk_chinese_ci      |      2 |
    24 | latin5   | ISO 8859-9 Turkish              | latin5_turkish_ci   |      1 |
    25 | armscii8 | ARMSCII-8 Armenian              | armscii8_general_ci |      1 |
    26 | utf8     | UTF-8 Unicode                   | utf8_general_ci     |      3 |
    27 | ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
    28 | cp866    | DOS Russian                     | cp866_general_ci    |      1 |
    29 | keybcs2  | DOS Kamenicky Czech-Slovak      | keybcs2_general_ci  |      1 |
    30 | macce    | Mac Central European            | macce_general_ci    |      1 |
    31 | macroman | Mac West European               | macroman_general_ci |      1 |
    32 | cp852    | DOS Central European            | cp852_general_ci    |      1 |
    33 | latin7   | ISO 8859-13 Baltic              | latin7_general_ci   |      1 |
    34 | utf8mb4  | UTF-8 Unicode                   | utf8mb4_general_ci  |      4 |
    35 | cp1251   | Windows Cyrillic                | cp1251_general_ci   |      1 |
    36 | utf16    | UTF-16 Unicode                  | utf16_general_ci    |      4 |
    37 | utf16le  | UTF-16LE Unicode                | utf16le_general_ci  |      4 |
    38 | cp1256   | Windows Arabic                  | cp1256_general_ci   |      1 |
    39 | cp1257   | Windows Baltic                  | cp1257_general_ci   |      1 |
    40 | utf32    | UTF-32 Unicode                  | utf32_general_ci    |      4 |
    41 | binary   | Binary pseudo charset           | binary              |      1 |
    42 | geostd8  | GEOSTD8 Georgian                | geostd8_general_ci  |      1 |
    43 | cp932    | SJIS for Windows Japanese       | cp932_japanese_ci   |      2 |
    44 | eucjpms  | UJIS for Windows Japanese       | eucjpms_japanese_ci |      3 |
    45 | gb18030  | China National Standard GB18030 | gb18030_chinese_ci  |      4 |
    46 +----------+---------------------------------+---------------------+--------+
    47 41 rows in set (0.00 sec)
    48 
    49 //There're 41 results of the supported character set.
    50 //Each character set has a default collation.
    51 //Maxlen is the max bytes of corresponding character set(eg. utf8mb4 supports 4 bytes).
    Check character set parameters of current MySQL server.
     1 (root@localhost mysql3306.sock)[(none)]>show variables like 'character%';
     2 +--------------------------+----------------------------------------------------------------+
     3 | Variable_name            | Value                                                          |
     4 +--------------------------+----------------------------------------------------------------+
     5 | character_set_client     | utf8                                                           |
     6 | character_set_connection | utf8                                                           |
     7 | character_set_database   | utf8                                                           |
     8 | character_set_filesystem | binary                                                         |
     9 | character_set_results    | utf8                                                           |
    10 | character_set_server     | utf8                                                           |
    11 | character_set_system     | utf8                                                           |
    12 | character_sets_dir       | /usr/local/mysql-5.7.21-linux-glibc2.12-x86_64/share/charsets/ |
    13 +--------------------------+----------------------------------------------------------------+
    14 8 rows in set (0.00 sec)
    15 
    16 //character_set_client:It's used by client when connect to servers for requesting data.
    17 //character_set_connection:It's used for those literals not have a character set introducer for conversion. 
    18 //character_set_database:It's used by default database.The value of "character_set_server" will be inherited if it is not specified.
    19 //character_set_filesystem:It's used to interpret string literals refer to file names.
    20 //character_set_results:It's used to return query results to the client.
    21 //character_set_server:It's the default character set of server.
    22 //character_set_system:It's used by server for storing identifiers,the value is utf8 forever.
    23 //character_sets_dir:It's the directory where contains the xml files of installed character set.
    The relationship of  above character set parameters shows below.
     Check collation parameters of current MySQL server.
     1 (root@localhost mysql3306.sock)[(none)]>show variables like 'collation%';
     2 +----------------------+-----------------+
     3 | Variable_name        | Value           |
     4 +----------------------+-----------------+
     5 | collation_connection | utf8_general_ci |
     6 | collation_database   | utf8_general_ci |
     7 | collation_server     | utf8_general_ci |
     8 +----------------------+-----------------+
     9 3 rows in set (0.01 sec)
    10 
    11 //collation_connection:The collation of connection character set.
    12 //collation_database:The collation of default database.It will inherite the value of "collation_server" if not specified.
    13 //collation_server:The default collation of server.
    14 //"ci" means Case Insensitive.
    Case of change character set from utf8 to latin1 with "set names ...;".
     1 (root@localhost mysql3306.sock)[(none)]>s
     2 --------------
     3 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
     4 
     5 Connection id:        7
     6 Current database:    
     7 Current user:        root@localhost
     8 SSL:            Not in use
     9 Current pager:        stdout
    10 Using outfile:        ''
    11 Using delimiter:    ;
    12 Server version:        5.7.21-log MySQL Community Server (GPL)
    13 Protocol version:    10
    14 Connection:        Localhost via UNIX socket
    15 Server characterset:    utf8
    16 Db     characterset:    utf8
    17 Client characterset:    utf8
    18 Conn.  characterset:    utf8
    19 UNIX socket:        /tmp/mysql3306.sock
    20 Uptime:            4 hours 33 min 11 sec
    21 
    22 Threads: 1  Questions: 52  Slow queries: 0  Opens: 110  Flush tables: 1  Open tables: 103  Queries per second avg: 0.003
    23 --------------
    24 
    25 (root@localhost mysql3306.sock)[(none)]>set names latin1;
    26 Query OK, 0 rows affected (0.00 sec)
    27 
    28 (root@localhost mysql3306.sock)[(none)]>s
    29 --------------
    30 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
    31 
    32 Connection id:        7
    33 Current database:    
    34 Current user:        root@localhost
    35 SSL:            Not in use
    36 Current pager:        stdout
    37 Using outfile:        ''
    38 Using delimiter:    ;
    39 Server version:        5.7.21-log MySQL Community Server (GPL)
    40 Protocol version:    10
    41 Connection:        Localhost via UNIX socket
    42 Server characterset:    utf8
    43 Db     characterset:    utf8
    44 Client characterset:    latin1
    45 Conn.  characterset:    latin1
    46 UNIX socket:        /tmp/mysql3306.sock
    47 Uptime:            4 hours 33 min 18 sec
    48 
    49 Threads: 1  Questions: 56  Slow queries: 0  Opens: 110  Flush tables: 1  Open tables: 103  Queries per second avg: 0.003
    50 --------------
    51 
    52 (root@localhost mysql3306.sock)[(none)]>select @@character_set_client;
    53 +------------------------+
    54 | @@character_set_client |
    55 +------------------------+
    56 | latin1                 |
    57 +------------------------+
    58 1 row in set (0.00 sec)
    59 
    60 (root@localhost mysql3306.sock)[(none)]>select @@character_set_connection;
    61 +----------------------------+
    62 | @@character_set_connection |
    63 +----------------------------+
    64 | latin1                     |
    65 +----------------------------+
    66 1 row in set (0.00 sec)
    67 
    68 (root@localhost mysql3306.sock)[(none)]>select @@character_set_results;
    69 +-------------------------+
    70 | @@character_set_results |
    71 +-------------------------+
    72 | latin1                  |
    73 +-------------------------+
    74 1 row in set (0.00 sec)
    75 
    76 (root@localhost mysql3306.sock)[(none)]>show variables like '%collation%';
    77 +----------------------+-------------------+
    78 | Variable_name        | Value             |
    79 +----------------------+-------------------+
    80 | collation_connection | latin1_swedish_ci |
    81 | collation_database   | utf8_general_ci   |
    82 | collation_server     | utf8_general_ci   |
    83 +----------------------+-------------------+
    84 3 rows in set (0.00 sec)
    85 
    86 //The influence of command "set names latin1" to character set is to change "character_set_client","character_set_connection","character_set_results" into latin1.
    87 //The influence of command "set names latin1" to collation is to change "collation_connection" into latin1.
    Case of change character set from utf8 to latin1 with "set character set ...;".
     1 (root@localhost mysql3306.sock)[(none)]>s
     2 --------------
     3 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
     4 
     5 Connection id:        7
     6 Current database:    
     7 Current user:        root@localhost
     8 SSL:            Not in use
     9 Current pager:        stdout
    10 Using outfile:        ''
    11 Using delimiter:    ;
    12 Server version:        5.7.21-log MySQL Community Server (GPL)
    13 Protocol version:    10
    14 Connection:        Localhost via UNIX socket
    15 Server characterset:    utf8
    16 Db     characterset:    utf8
    17 Client characterset:    latin1
    18 Conn.  characterset:    utf8
    19 UNIX socket:        /tmp/mysql3306.sock
    20 Uptime:            4 hours 44 min 8 sec
    21 
    22 Threads: 1  Questions: 72  Slow queries: 0  Opens: 111  Flush tables: 1  Open tables: 104  Queries per second avg: 0.004
    23 --------------
    24 
    25 (root@localhost mysql3306.sock)[(none)]>select @@character_set_client;
    26 +------------------------+
    27 | @@character_set_client |
    28 +------------------------+
    29 | latin1                 |
    30 +------------------------+
    31 1 row in set (0.00 sec)
    32 
    33 (root@localhost mysql3306.sock)[(none)]>select @@character_set_connection;
    34 +----------------------------+
    35 | @@character_set_connection |
    36 +----------------------------+
    37 | utf8                       |
    38 +----------------------------+
    39 1 row in set (0.00 sec)
    40 
    41 (root@localhost mysql3306.sock)[(none)]>select @@character_set_results;
    42 +-------------------------+
    43 | @@character_set_results |
    44 +-------------------------+
    45 | latin1                  |
    46 +-------------------------+
    47 1 row in set (0.00 sec)
    48 
    49 (root@localhost mysql3306.sock)[(none)]>show variables like '%collation%';
    50 +----------------------+-----------------+
    51 | Variable_name        | Value           |
    52 +----------------------+-----------------+
    53 | collation_connection | utf8_general_ci |
    54 | collation_database   | utf8_general_ci |
    55 | collation_server     | utf8_general_ci |
    56 +----------------------+-----------------+
    57 3 rows in set (0.00 sec)
    58 
    59 //The two variables about "connection" was not altered this time.They were still utf8 relevant.
    Case of change character set of server & database from utf8 to latin1.
     1 (root@localhost mysql3306.sock)[(none)]>s
     2 --------------
     3 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
     4 
     5 Connection id:        7
     6 Current database:    
     7 Current user:        root@localhost
     8 SSL:            Not in use
     9 Current pager:        stdout
    10 Using outfile:        ''
    11 Using delimiter:    ;
    12 Server version:        5.7.21-log MySQL Community Server (GPL)
    13 Protocol version:    10
    14 Connection:        Localhost via UNIX socket
    15 Server characterset:    utf8
    16 Db     characterset:    utf8
    17 Client characterset:    utf8
    18 Conn.  characterset:    utf8
    19 UNIX socket:        /tmp/mysql3306.sock
    20 Uptime:            4 hours 50 min 33 sec
    21 
    22 Threads: 1  Questions: 92  Slow queries: 0  Opens: 111  Flush tables: 1  Open tables: 104  Queries per second avg: 0.005
    23 --------------
    24 
    25 (root@localhost mysql3306.sock)[(none)]>set character_set_server=latin1;
    26 Query OK, 0 rows affected (0.00 sec)
    27 
    28 (root@localhost mysql3306.sock)[(none)]>set character_set_database=latin1;
    29 Query OK, 0 rows affected, 1 warning (0.00 sec)
    30 
    31 (root@localhost mysql3306.sock)[(none)]>show warnings;
    32 +---------+------+-------------------------------------------------------------------------------------------------+
    33 | Level   | Code | Message                                                                                         |
    34 +---------+------+-------------------------------------------------------------------------------------------------+
    35 | Warning | 1681 | Updating 'character_set_database' is deprecated. It will be made read-only in a future release. |
    36 +---------+------+-------------------------------------------------------------------------------------------------+
    37 1 row in set (0.00 sec)
    38 
    39 (root@localhost mysql3306.sock)[(none)]>s
    40 --------------
    41 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
    42 
    43 Connection id:        7
    44 Current database:    
    45 Current user:        root@localhost
    46 SSL:            Not in use
    47 Current pager:        stdout
    48 Using outfile:        ''
    49 Using delimiter:    ;
    50 Server version:        5.7.21-log MySQL Community Server (GPL)
    51 Protocol version:    10
    52 Connection:        Localhost via UNIX socket
    53 Server characterset:    latin1
    54 Db     characterset:    latin1
    55 Client characterset:    utf8
    56 Conn.  characterset:    utf8
    57 UNIX socket:        /tmp/mysql3306.sock
    58 Uptime:            4 hours 51 min 0 sec
    59 
    60 Threads: 1  Questions: 98  Slow queries: 0  Opens: 111  Flush tables: 1  Open tables: 104  Queries per second avg: 0.005
    61 --------------
    62 
    63 //It shows that change "character_set_server" online is not supported in future release because of safety concern.
    64 //Change character set of database may bring about risk of data loss if your client program using the supersetwhile database using subset.(eg. client->utf8mb4,database->utf8,will lost emoji data.) 
    Example of messy code.
      1 (root@localhost mysql3306.sock)[zlm]>create table test_charset(
      2     -> s1 char(10) character set latin1 not null,
      3     -> s2 char(10) char set gbk,
      4     -> s3 varchar(10) charset utf8,
      5     -> s4 varchar(10)) character set=utf8mb4 engine=innodb;
      6 Query OK, 0 rows affected (0.01 sec)
      7 
      8 (root@localhost mysql3306.sock)[zlm]>show create table test_charset;
      9 +--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
     10 | Table        | Create Table                                                                                                                                                                                                                                             |
     11 +--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
     12 | test_charset | CREATE TABLE `test_charset` (
     13   `s1` char(10) CHARACTER SET latin1 NOT NULL,
     14   `s2` char(10) CHARACTER SET gbk DEFAULT NULL,
     15   `s3` varchar(10) CHARACTER SET utf8 DEFAULT NULL,
     16   `s4` varchar(10) DEFAULT NULL
     17 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
     18 +--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
     19 1 row in set (0.00 sec)
     20 
     21 (root@localhost mysql3306.sock)[zlm]>insert into test_charset values('ASCII','国标','万国','表情');
     22 Query OK, 1 row affected (0.00 sec)
     23 
     24 (root@localhost mysql3306.sock)[zlm]>select * from test_charset;
     25 +-------+--------+--------+--------+
     26 | s1    | s2     | s3     | s4     |
     27 +-------+--------+--------+--------+
     28 | ASCII | 国标   | 万国   | 表情   |
     29 +-------+--------+--------+--------+
     30 1 row in set (0.00 sec)
     31 
     32 (root@localhost mysql3306.sock)[zlm]>s
     33 --------------
     34 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
     35 
     36 Connection id:        9
     37 Current database:    zlm
     38 Current user:        root@localhost
     39 SSL:            Not in use
     40 Current pager:        stdout
     41 Using outfile:        ''
     42 Using delimiter:    ;
     43 Server version:        5.7.21-log MySQL Community Server (GPL)
     44 Protocol version:    10
     45 Connection:        Localhost via UNIX socket
     46 Server characterset:    utf8
     47 Db     characterset:    utf8
     48 Client characterset:    utf8
     49 Conn.  characterset:    utf8
     50 UNIX socket:        /tmp/mysql3306.sock
     51 Uptime:            5 hours 26 min 54 sec
     52 
     53 Threads: 1  Questions: 123  Slow queries: 0  Opens: 118  Flush tables: 1  Open tables: 111  Queries per second avg: 0.006
     54 --------------
     55 
     56 (root@localhost mysql3306.sock)[zlm]>set names latin1;
     57 Query OK, 0 rows affected (0.00 sec)
     58 
     59 (root@localhost mysql3306.sock)[zlm]>s
     60 --------------
     61 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
     62 
     63 Connection id:        9
     64 Current database:    zlm
     65 Current user:        root@localhost
     66 SSL:            Not in use
     67 Current pager:        stdout
     68 Using outfile:        ''
     69 Using delimiter:    ;
     70 Server version:        5.7.21-log MySQL Community Server (GPL)
     71 Protocol version:    10
     72 Connection:        Localhost via UNIX socket
     73 Server characterset:    utf8
     74 Db     characterset:    utf8
     75 Client characterset:    latin1
     76 Conn.  characterset:    latin1
     77 UNIX socket:        /tmp/mysql3306.sock
     78 Uptime:            5 hours 18 min 0 sec
     79 
     80 Threads: 1  Questions: 114  Slow queries: 0  Opens: 118  Flush tables: 1  Open tables: 111  Queries per second avg: 0.008
     81 --------------
     82 
     83 (root@localhost mysql3306.sock)[zlm]>
     84     
     85 (root@localhost mysql3306.sock)[zlm]>select * from test_charset;
     86 +-------+------+------+------+
     87 | s1    | s2   | s3   | s4   |
     88 +-------+------+------+------+
     89 | ASCII | ??   | ??   | ??   |
     90 +-------+------+------+------+
     91 1 row in set (0.00 sec)
     92 
     93 (root@localhost mysql3306.sock)[zlm]>set names gbk;
     94 Query OK, 0 rows affected (0.00 sec)
     95 
     96 (root@localhost mysql3306.sock)[zlm]>s
     97 --------------
     98 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
     99 
    100 Connection id:        9
    101 Current database:    zlm
    102 Current user:        root@localhost
    103 SSL:            Not in use
    104 Current pager:        stdout
    105 Using outfile:        ''
    106 Using delimiter:    ;
    107 Server version:        5.7.21-log MySQL Community Server (GPL)
    108 Protocol version:    10
    109 Connection:        Localhost via UNIX socket
    110 Server characterset:    utf8
    111 Db     characterset:    utf8
    112 Client characterset:    gbk
    113 Conn.  characterset:    gbk
    114 UNIX socket:        /tmp/mysql3306.sock
    115 Uptime:            5 hours 29 min 22 sec
    116 
    117 Threads: 1  Questions: 129  Slow queries: 0  Opens: 118  Flush tables: 1  Open tables: 111  Queries per second avg: 0.006
    118 --------------
    119 
    120 (root@localhost mysql3306.sock)[zlm]>select * from test_charset;
    121 +-------+------+------+------+
    122 | s1    | s2   | s3   | s4   |
    123 +-------+------+------+------+
    124 | ASCII | ¹螠   | β¹� | ±�     |
    125 +-------+------+------+------+
    126 1 row in set (0.00 sec)
    127 
    128 (root@localhost mysql3306.sock)[zlm]>set names utf8mb4;
    129 Query OK, 0 rows affected (0.00 sec)
    130 
    131 (root@localhost mysql3306.sock)[zlm]>s
    132 --------------
    133 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
    134 
    135 Connection id:        9
    136 Current database:    zlm
    137 Current user:        root@localhost
    138 SSL:            Not in use
    139 Current pager:        stdout
    140 Using outfile:        ''
    141 Using delimiter:    ;
    142 Server version:        5.7.21-log MySQL Community Server (GPL)
    143 Protocol version:    10
    144 Connection:        Localhost via UNIX socket
    145 Server characterset:    utf8
    146 Db     characterset:    utf8
    147 Client characterset:    utf8mb4
    148 Conn.  characterset:    utf8mb4
    149 UNIX socket:        /tmp/mysql3306.sock
    150 Uptime:            5 hours 30 min 15 sec
    151 
    152 Threads: 1  Questions: 134  Slow queries: 0  Opens: 118  Flush tables: 1  Open tables: 111  Queries per second avg: 0.006
    153 --------------
    154 
    155 (root@localhost mysql3306.sock)[zlm]>select * from test_charset;
    156 +-------+--------+--------+--------+
    157 | s1    | s2     | s3     | s4     |
    158 +-------+--------+--------+--------+
    159 | ASCII | 国标   | 万国   | 表情   |
    160 +-------+--------+--------+--------+
    161 1 row in set (0.00 sec)
    162 
    163 //MySQL support define character set on database,table even on a single column.
    164 //Messy code will occur when "character_set_result" is subset of the value of character set which has been stored only if it turns back to the value equal or bigger than the stored value.
    Example of losting data.
      1 (root@localhost mysql3306.sock)[(none)]>s
      2 --------------
      3 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
      4 
      5 Connection id:        4
      6 Current database:    
      7 Current user:        root@localhost
      8 SSL:            Not in use
      9 Current pager:        stdout
     10 Using outfile:        ''
     11 Using delimiter:    ;
     12 Server version:        5.7.21-log MySQL Community Server (GPL)
     13 Protocol version:    10
     14 Connection:        Localhost via UNIX socket
     15 Server characterset:    utf8
     16 Db     characterset:    utf8
     17 Client characterset:    utf8
     18 Conn.  characterset:    utf8
     19 UNIX socket:        /tmp/mysql3306.sock
     20 Uptime:            1 min 45 sec
     21 
     22 Threads: 2  Questions: 23  Slow queries: 0  Opens: 108  Flush tables: 1  Open tables: 101  Queries per second avg: 0.219
     23 --------------
     24 
     25 (root@localhost mysql3306.sock)[(none)]>set @@character_set_server=latin1;
     26 Query OK, 0 rows affected (0.00 sec)
     27 
     28 (root@localhost mysql3306.sock)[(none)]>set @@character_set_database=latin1;
     29 Query OK, 0 rows affected, 1 warning (0.01 sec)
     30 
     31 (root@localhost mysql3306.sock)[(none)]>set @@character_set_connection=latin1;
     32 Query OK, 0 rows affected (0.00 sec)
     33 
     34 (root@localhost mysql3306.sock)[(none)]>s
     35 --------------
     36 mysql  Ver 14.14 Distrib 5.7.21, for linux-glibc2.12 (x86_64) using  EditLine wrapper
     37 
     38 Connection id:        4
     39 Current database:    
     40 Current user:        root@localhost
     41 SSL:            Not in use
     42 Current pager:        stdout
     43 Using outfile:        ''
     44 Using delimiter:    ;
     45 Server version:        5.7.21-log MySQL Community Server (GPL)
     46 Protocol version:    10
     47 Connection:        Localhost via UNIX socket
     48 Server characterset:    latin1
     49 Db     characterset:    latin1
     50 Client characterset:    utf8
     51 Conn.  characterset:    latin1
     52 UNIX socket:        /tmp/mysql3306.sock
     53 Uptime:            2 min 16 sec
     54 
     55 Threads: 2  Questions: 29  Slow queries: 0  Opens: 108  Flush tables: 1  Open tables: 101  Queries per second avg: 0.213
     56 --------------
     57 
     58 (root@localhost mysql3306.sock)[(none)]>insert into test_charset values('ASCII','国标','万国','表情');
     59 ERROR 1046 (3D000): No database selected
     60 (root@localhost mysql3306.sock)[(none)]>use zlm
     61 Reading table information for completion of table and column names
     62 You can turn off this feature to get a quicker startup with -A
     63 
     64 Database changed
     65 (root@localhost mysql3306.sock)[zlm]>insert into test_charset values('ASCII','国标','万国','表情');
     66 Query OK, 1 row affected, 3 warnings (0.01 sec)
     67 
     68 (root@localhost mysql3306.sock)[zlm]>show warnings;
     69 +---------+------+-----------------------------------------------------------+
     70 | Level   | Code | Message                                                   |
     71 +---------+------+-----------------------------------------------------------+
     72 | Warning | 1300 | Invalid utf8 character string: 'xE5x9BxBDxE6xA0x87' |
     73 | Warning | 1300 | Invalid utf8 character string: 'xE4xB8x87xE5x9BxBD' |
     74 | Warning | 1300 | Invalid utf8 character string: 'xE8xA1xA8xE6x83x85' |
     75 +---------+------+-----------------------------------------------------------+
     76 3 rows in set (0.00 sec)
     77 
     78 (root@localhost mysql3306.sock)[zlm]>select @@character_set_results;
     79 +-------------------------+
     80 | @@character_set_results |
     81 +-------------------------+
     82 | utf8                    |
     83 +-------------------------+
     84 1 row in set (0.00 sec)
     85 
     86 (root@localhost mysql3306.sock)[zlm]>select * from test_charset;
     87 +-------+--------+--------+--------+
     88 | s1    | s2     | s3     | s4     |
     89 +-------+--------+--------+--------+
     90 | ASCII | 国标   | 万国   | 表情   |
     91 | ASCII | ??     | ??     | ??     |
     92 +-------+--------+--------+--------+
     93 2 rows in set (0.00 sec)
     94 
     95 (root@localhost mysql3306.sock)[zlm]>set @@character_set_results=latin1;
     96 Query OK, 0 rows affected (0.00 sec)
     97 
     98 (root@localhost mysql3306.sock)[zlm]>select @@character_set_results;
     99 +-------------------------+
    100 | @@character_set_results |
    101 +-------------------------+
    102 | latin1                  |
    103 +-------------------------+
    104 1 row in set (0.00 sec)
    105 
    106 (root@localhost mysql3306.sock)[zlm]>select * from test_charset;
    107 +-------+------+------+------+
    108 | s1    | s2   | s3   | s4   |
    109 +-------+------+------+------+
    110 | ASCII | ??   | ??   | ??   |
    111 | ASCII | ??   | ??   | ??   |
    112 +-------+------+------+------+
    113 2 rows in set (0.00 sec)
    114 
    115 //The data of first row has been correctly resored in database.
    116 //The data of second row has lost the Chinese character data.
    117 //The value of "character_set_result" only influence the screen output.
    118 //There's a data loss risk while value of character set of client is superset of the one of database.
    Summary
    • MySQL character set is flexible and various,be more careful when modify data.
    • The parameter "default_character_set" only affect original mysql client not for the other client tools.
    • Make sure your character set of client is a subset but superset of the value of database when modifying data.Meanwhile it needs to be small than character set of connection to avoid data loss.
    • It's recommended to set character set of database to a big set such as utf8 even utf8mb4 to be compatible with most characters of various languages.
     
    版权声明:本文为博主原创文章,如需转载请保留此声明及博客链接,谢谢!
    博客地址: http://www.cnblogs.com/aaron8219 & http://blog.csdn.net/aaron8219
  • 相关阅读:
    【BZOJ3144】切糕(HNOI2013)-最小割
    【BZOJ1934】善意的投票(SHOI2007)-最小割
    【BZOJ2125】最短路-圆方树+倍增LCA
    【BZOJ4868】期末考试(六省联考2017)-三分
    【BZOJ1951】古代猪文(SDOI2010)-数论大集合
    【BZOJ2257】瓶子和燃料(JSOI2009)-裴蜀定理+排序
    【BZOJ1485】有趣的数列(HNOI2009)-卡特兰数+线性筛
    【51Nod1952】栈-单调栈+单调队列
    【BZOJ4517】排列计数(SDOI2016)-组合数学:错排
    【BZOJ2111】排列计数(ZJOI2010)-DP+Lucas定理
  • 原文地址:https://www.cnblogs.com/aaron8219/p/9256331.html
Copyright © 2020-2023  润新知