收藏文章 楼主

MySQL utf8和utf-8mb4编码表有哪些重要的查询区别?

版块:编程开发   类型:普通   作者:小羊羔links   查看:752   回复:0   获赞:0   时间:2022-02-11 20:47:08
MySQL utf8和utf-8mb4编码表有哪些重要的查询区别?MySQL在5.5.3之后开始引入utf8mb4(most bytes 4)编码,主要用来兼容4字节的unicode编码。而Utf8是最大长度是3字节,遇到4字节的unicode编码uft8mb4,比如某些表情或者少见的汉字等,会存在异常情况;需要使用uft8mb4编码,同时utf8也是uft8mb4的子集。在使用中,会经常存在两种编码的表共存的情况,在关联查询中,语法没有问题,都可以使用各种join操作,但通过测试案例发现,在性能上会存在较大差别。
案例说明,构建两种编码字符的表,进行left join操作,分别作为驱动表进行测试,观察执行计划。

一、准备
1、建表,字符字段加索引

mysql> show create table testc_2\G;

*************************** 1. row ***************************

       Table: testc_2

Create Table: CREATE TABLE `testc_2` (

  `c1` int NOT NULL,

  `c2` int DEFAULT NULL,

  `c3` varchar(100) DEFAULT NULL,

  `c4` char(100) DEFAULT NULL,

  `c5` datetime(6) DEFAULT NULL,

  PRIMARY KEY (`c1`),

  KEY `inc4` (`c4`),

  KEY `inc3` (`c3`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

1 row in set (0.00 sec)

mysql> show create table testc_1\G;

*************************** 1. row ***************************

       Table: testc_1

Create Table: CREATE TABLE `testc_1` (

  `c1` int NOT NULL,

  `c2` int DEFAULT NULL,

  `c3` varchar(100) DEFAULT NULL,

  `c4` char(100) DEFAULT NULL,

  `c5` datetime(6) DEFAULT NULL,

  PRIMARY KEY (`c1`),

  KEY `inc4` (`c4`),

  KEY `inc3` (`c3`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8

1 row in set (0.00 sec

2、导数

mysql> insert into testc_2 select * from testc_1;

Query OK, 400934 rows affected (4.83 sec)

Records: 400934  Duplicates: 0  Warnings

3、检查字段编码

mysql> show full columns from testc_2;

+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+

| Field | Type         | Collation          | Null | Key | Default | Extra | Privileges                      | Comment |

+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+

| c1    | int          | NULL               | NO   | PRI | NULL    |       | select,insert,update,references |         |

| c2    | int          | NULL               | YES  |     | NULL    |       | select,insert,update,references |         |

c3    | varchar(100) | utf8mb4_0900_ai_ci | YES  | MUL | NULL    |       | select,insert,update,references |         |

c4    | char(100)    | utf8mb4_0900_ai_ci | YES  | MUL | NULL    |       | select,insert,update,references |         |

| c5    | datetime(6)  | NULL               | YES  |     | NULL    |       | select,insert,update,references |         |

+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+

5 rows in set (0.00 sec)

mysql> show full columns from testc_1;

+-------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+

| Field | Type         | Collation       | Null | Key | Default | Extra | Privileges                      | Comment |

+-------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+

| c1    | int          | NULL            | NO   | PRI | NULL    |       | select,insert,update,references |         |

| c2    | int          | NULL            | YES  |     | NULL    |       | select,insert,update,references |         |

c3    | varchar(100) | utf8_general_ci | YES  | MUL | NULL    |       | select,insert,update,references |         |

c4    | char(100)    | utf8_general_ci | YES  | MUL | NULL    |       | select,insert,update,references |         |

| c5    | datetime(6)  | NULL            | YES  |     | NULL    |       | select,insert,update,references |         |

+-------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+

4、查询数据量

mysql> select count(*) from testc_1;

+----------+

| count(*) |

+----------+

|   400934 |

+----------

5、检查索引散列情况

mysql> analyze table testc_2;

+-------------+---------+----------+----------+

| Table       | Op      | Msg_type | Msg_text |

+-------------+---------+----------+----------+

| sms.testc_2 | analyze | status   | OK       |

+-------------+---------+----------+----------+

1 row in set (0.00 sec)


mysql> analyze table testc_1;

+-------------+---------+----------+----------+

| Table       | Op      | Msg_type | Msg_text |

+-------------+---------+----------+----------+

| sms.testc_1 | analyze | status   | OK       |

+-------------+---------+----------+----------+

1 row in set (0.00 sec)


mysql> show index from testc_1\G;

*************************** 1. row ***************************

        Table: testc_1

   Non_unique: 0

     Key_name: PRIMARY

 Seq_in_index: 1

  Column_name: c1

    Collation: A

  Cardinality: 361395

     Sub_part: NULL

       Packed: NULL

         Null:

   Index_type: BTREE

      Comment:

Index_comment:

      Visible: YES

   Expression: NULL

*************************** 2. row ***************************

        Table: testc_1

   Non_unique: 1

     Key_name: inc4

 Seq_in_index: 1

  Column_name: c4

    Collation: A

  Cardinality: 361395

     Sub_part: NULL

       Packed: NULL

         Null: YES

   Index_type: BTREE

      Comment:

Index_comment:

      Visible: YES

   Expression: NULL

*************************** 3. row ***************************

        Table: testc_1

   Non_unique: 1

     Key_name: inc3

 Seq_in_index: 1

  Column_name: c3

    Collation: A

  Cardinality: 361395

     Sub_part: NULL

       Packed: NULL

         Null: YES

   Index_type: BTREE

      Comment:

Index_comment:

      Visible: YES

   Expression: NULL

3 rows in set (0.00 sec)


ERROR:

No query specified


mysql> show index from testc_2\G;

*************************** 1. row ***************************

        Table: testc_2

   Non_unique: 0

     Key_name: PRIMARY

 Seq_in_index: 1

  Column_name: c1

    Collation: A

  Cardinality: 395232

     Sub_part: NULL

       Packed: NULL

         Null:

   Index_type: BTREE

      Comment:

Index_comment:

      Visible: YES

   Expression: NULL

*************************** 2. row ***************************

        Table: testc_2

   Non_unique: 1

     Key_name: inc4

 Seq_in_index: 1

  Column_name: c4

    Collation: A

  Cardinality: 395232

     Sub_part: NULL

       Packed: NULL

         Null: YES

   Index_type: BTREE

      Comment:

Index_comment:

      Visible: YES

   Expression: NULL

*************************** 3. row ***************************

        Table: testc_2

   Non_unique: 1

     Key_name: inc3

 Seq_in_index: 1

  Column_name: c3

    Collation: A

  Cardinality: 395232

     Sub_part: NULL

       Packed: NULL

         Null: YES

   Index_type: BTREE

      Comment:

Index_comment:

      Visible: YES

   Expression: NULL

3 rows in set (0.01 sec)


二、测试案例

建表时,testc_2是utf8mb4编码,testc_1是utf8编码

1、left join 驱动表utf8mb4

mysql> desc select * from testc_2 a left join testc_1 b on a.c4=b.c4 where a.c3 = 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb10009' \G;

*************************** 1. row ***************************

           id: 1

  select_type: SIMPLE

        table: a

   partitions: NULL

         type: ref

possible_keys: inc3

          key: inc3

      key_len: 403

          ref: const

         rows: 1

     filtered: 100.00

        Extra: NULL

*************************** 2. row ***************************

           id: 1

  select_type: SIMPLE

        table: b

   partitions: NULL

         type: ALL

possible_keys: NULL

          key: NULL

      key_len: NULL

          ref: NULL

         rows: 361395

     filtered: 100.00

        Extra: Using where; Using join buffer (hash join)


mysql> show warnings \G;

*************************** 1. row ***************************

  Level: Note

   Code: 1003

Message: /* select#1 */ select `sms`.`a`.`c1` AS `c1`,`sms`.`a`.`c2` AS `c2`,`sms`.`a`.`c3` AS `c3`,`sms`.`a`.`c4` AS `c4`,`sms`.`a`.`c5` AS `c5`,`sms`.`b`.`c1` AS `c1`,`sms`.`b`.`c2` AS `c2`,`sms`.`b`.`c3` AS `c3`,`sms`.`b`.`c4` AS `c4`,`sms`.`b`.`c5` AS `c5` from `sms`.`testc_2` `a` left join `sms`.`testc_1` `b` on((`sms`.`a`.`c4` = convert(`sms`.`b`.`c4` using utf8mb4))) where (`sms`.`a`.`c3` = 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb10009')

1 row in set (0.00 sec)

结论:通过执行计划发现,被驱动表(utf8)索引失效,发生全表扫描;原因发生了编码转换(sms`.`a`.`c4` = convert(`sms`.`b`.`c4` using utf8mb4)),遵循由小到大转换规则。

2、left join 驱动表utf8

mysql> desc select * from testc_1 a left join testc_2 b on a.c4=b.c4 where a.c3 = 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb10009' \G;

*************************** 1. row ***************************

           id: 1

  select_type: SIMPLE

        table: a

   partitions: NULL

         type: ref

possible_keys: inc3

          key: inc3

      key_len: 303

          ref: const

         rows: 1

     filtered: 100.00

        Extra: NULL

*************************** 2. row ***************************

           id: 1

  select_type: SIMPLE

        table: b

   partitions: NULL

         type: ref

possible_keys: inc4

          key: inc4

      key_len: 401

          ref: func

         rows: 1

     filtered: 100.00

        Extra: Using where

结论:两表全走索引,未发生编码转换。


三、结论

Mysql两种编码的表共存时,当发生关联查询时,驱动表若是超集(utf8mb4),则会存在被驱动表(utf8)索引失效全表扫描的性能问题;反之不会发生;在开发和使用中,尽量编码一致,避免入坑。

小羊羔锚文本外链网站长https://seo-links.cn 
回复列表
默认   热门   正序   倒序

回复:MySQL utf8和utf-8mb4编码表有哪些重要的查询区别?

Powered by 小羊羔外链网 8.3.11

©2015 - 2024 小羊羔外链网

免费发软文外链 鄂ICP备16014738号-6

您的IP:3.14.132.214,2024-04-20 00:13:12,Processed in 0.05653 second(s).

支持原创软件,抵制盗版,共创美好明天!
头像

用户名:

粉丝数:

签名:

资料 关注 好友 消息