MYSQL的CHAR和VARCHAR注意事项以及binary和varbinary存储方式是怎样的

126次阅读

共计 7037 个字符，预计需要花费 18 分钟才能阅读完成。

行业资讯
数据库
MySQL 数据库
MYSQL 的 CHAR 和 VARCHAR 注意事项以及 binary 和 varbinary 存储方式是怎样的

MYSQL 的 CHAR 和 VARCHAR 注意事项以及 binary 和 varbinary 存储方式是怎样的，针对这个问题，这篇文章详细介绍了相对应的分析和解答，希望可以帮助更多想解决这个问题的小伙伴找到更简单易行的方法。

版本使用 5.7 存储引擎 INNODB 行格式 Dynamic
从概念上讲他们用于存放字符型数据，其允许的范围：
1、char 0-255 bytes，固定长度，不够的长度用 ASCII 0X20 进行补足。
2、varchar 0-65535(2^8-1)bytes, 注意是整个表所有的 varchar 字段的长度，所谓可变长度，就是按需分配空间。

下面就几个问题进行讨论：

第一个问题：varchar 的可变长度
那么这里引出了第一个问题，可变长度，在 INNODB(Dynamic 行格式)，在行头使用，1- 2 个字节
来存储这个可变长度及：
variable field lengths (1-2 bytes* var)
(具体参考 http://blog.itpub.net/7728585/viewspace-2071787/)
2 个字节也刚好是 65535 的长度，这是 INNODB 对 MYSQL 的一个实现方法，同时如果使用 5.7 INNODB
online DDL 进行 modify varchar column 的长度，在 1 -255 和 256-65535 之间都可以迅速完成，但是
如果跨越比如改变一个 varchar 字段的长度从 250 到 300 注意是字节，就会出现需要使用
inpace 或者 copy 等方法，那就非常慢了，原因也在这里因为涉及到行头的扩张了，以前是一
个字节来存储长度，而改变后需要二个字节，当然也就需要重新组织表，而如果不跨越就不
会改变表的组织方式，也就值需要修改数据字典和 frm 文件而已，当然瞬间完成，下面来做
一个测试。对于 UTF8 字符集，它的这个点是 255/3=85。
注意使用版本 5.7 引擎为 innodb 行格式为 Dynamic，并且这一列不能有索引，如果有索引
索引会带入而外的操作，也是比较慢的
mysql select count(*) from testshared3;
+———-+
| count(*) |
+———-+
| 1048576 |
+———-+
1 row in set (0.35 sec)

mysql show create table testshared3;
+————-+———————————————————————————————————————————-+
| Table | Create Table |
+————-+———————————————————————————————————————————-+
| testshared3 | CREATE TABLE `testshared3` (
`id` int(11) DEFAULT NULL,
`name` varchar(13) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+————-+———————————————————————————————————————————-+
1 row in set (0.01 sec)

mysql alter table testshared3 change name name varchar(85) ;
Query OK, 0 rows affected (0.07 sec)
Records: 0 Duplicates: 0 Warnings: 0

可以看到虽然有 1048576 行的数据但是 modify 还是瞬间完成了。但是如果从 85 改到 86 如何呢？
mysql alter table testshared3 ALGORITHM=INPLACE, change name name varchar(86) ;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
mysql alter table testshared3 change name name varchar(86) ;
Query OK, 1048576 rows affected (15.68 sec)
Records: 1048576 Duplicates: 0 Warnings: 0
可以看到使用了 15 秒多，而且 ALGORITHM=COPY。

第二个问题：关于 char 和 varchar 左空格存储以及显示的不同

mysql create table testvc(name1 varchar(5),name2 char(5));
Query OK, 0 rows affected (0.08 sec)

mysql insert into testvc values(gao , gao
Query OK, 1 row affected (0.01 sec)
mysql select concat(name1,) ),concat(name2,) ) from testvc;
+——————-+——————-+
| concat(name1,) ) | concat(name2,) ) |
+——————-+——————-+
| gao ) | gao) |
+——————-+——————-+
1 row in set (0.06 sec)
可以看到 varchar 可以正常显示 gao 后面的空格，而 char 却不行，那么他们内部到底是如何存储的，我们需要
用二进制方式查看一下：
(下面是我解析好的，具体的方法参考 http://blog.itpub.net/7728585/viewspace-2071787/ 需要用到我自己
写的几个小工具)
04 –varchar 长度
00 –NULL 位图
00
0010
0024
00000089a25
40000002e0c1bd9
000001640110
67616f20 –varchar(5) gao
67616f2020 –char(5) gao

这里我们可以明显看到 varchar 的长度为 4，当存储 varchar 的 gao 的时候存储的是 0X67616f20 也就是 ASCII 的 gao , 当存储 char 类型的 gao
的时候为 0X67616f2020，可以看到他后面有两个 0X20，也就是 ASCII 的空格那么我们可以知道 char(5)会对不够的字节全部补上 ASCII 0X20，这也就是
为什么输出的时候空格不在了，因为了 char 字段中存储的时候尾部的 0X20 作为了补足的字符，而 VARCHAR 中却不是这样 0X20 作为了实际的字符，也就
是空格那么输出就有了。

第三个问题：比较和 varchar 以及 char 尾部的空格。
在 MYSQL 文档中描述：
This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces
“Comparison”in this context does not include the LIKEpattern-matching operator
For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has
an index that requires unique values, inserting into the column values that differ only in number of trailing
pad characters will result in a duplicate-key error.

也就是除了 LIKE 的比较操作，都会忽略尾部空格不管是 VARCHAR CHAR 还是 TEXT，并且如果字段是唯一键，唯一性判断的时候
也会忽略空格。

还是刚才的表我们在 varchar 的 name1 上加上唯一索引。
mysql alter table testvc add unique key(name1);
Query OK, 0 rows affected (0.07 sec)
Records: 0 Duplicates: 0 Warnings: 0

我们刚才插入的数据为
insert into testvc values(gao , gao

mysql select * from testvc where name1= gao
+——-+——-+
| name1 | name2 |
+——-+——-+
| gao | gao |
+——-+——-+
1 row in set (0.00 sec)

可以看到不管存储有没有 0X20 空格，也不管条件 = 后面是否有 0X20 空格
都会查询出来，我们来测试一下插入

mysql insert into testvc values(gao , gao10
ERROR 1062 (23000): Duplicate entry gao for key name1
mysql insert into testvc values(gao , gao10
ERROR 1062 (23000): Duplicate entry gao for key name1

不管我插入的是 gao 还是 gao 都是重复的值，证明的文档的说法，另外
这个问题在 ORACLE 中是不存在，MYSQL 也比较奇怪。很多 ORACLE 的概念在 MYSQL
中需要打一个问号。
ORACLE:
SQL create table testui1(name varchar2(20));
Table created
SQL create unique index testuiq_IDX on testui1(name);
Index created
SQL insert into testui1 values(gao
1 row inserted
SQL insert into testui1 values(gao
1 row inserted
SQL insert into testui1 values(gao
1 row inserted
SQL commit;
Commit complete

接下来看看 LIKE：
varchar：
mysql select * from testvc where name1 like gao %
+——-+——-+
| name1 | name2 |
+——-+——-+
| gao | gao |
+——-+——-+
1 row in set (0.00 sec)

mysql select * from testvc where name1= gao
+——-+——-+
| name1 | name2 |
+——-+——-+
| gao | gao |
+——-+——-+
1 row in set (0.00 sec)

char：
mysql select * from testvc where name2 like gao %
Empty set (0.00 sec)

mysql select * from testvc where name2= gao
+——-+——-+
| name1 | name2 |
+——-+——-+
| gao | gao |
+——-+——-+
1 row in set (0.00 sec)

这里 char name2 like gao % 没有出来数据，而 varchar name1 like gao %
出来了数据也正是证明了我们对存储格式的剖析，因为 char 对尾部的 0X20 空格在
存储的时候已经去掉了，但是 VARCHAR 没有，只要 LIKE 是严格匹配就会出来这样的
效果。

最后来看看 MYSQL 的 binary 和 varbinary 格式，这种格式就是说明其存储和比较都使用二进制格式，也就是按照一个
字节一个字节的比较 ASCII 值，官方描述如下：
they contain byte strings rather than character strings. This
means that they have no character set, and sorting and comparison are based on the numeric values of
the bytes in the values.
他们的意义和 char 和 varchar 相似，但是有一点，其比较方法和存储方法不太一样
binary 使用 0X00 也就是 \0 补足不够的字节，而其比较也是严格和存储中的格式进行
匹配不存在 char 和 varchar 那样对空格的处理

mysql insert into testbin3 values(a , a
Query OK, 1 row affected (0.03 sec)

mysql select * from testbin3 where name1= a
+——-+————+
| name1 | name2 |
+——-+————+
| a | a |
+——-+————+
1 row in set (0.00 sec)

mysql select * from testbin3 where name2= a
Empty set (0.00 sec)

mysql select * from testbin3 where name2= a \0\0\0\0\0\0\0\0
+——-+————+
| name1 | name2 |
+——-+————+
| a | a |
+——-+————+
1 row in set (0.00 sec)

可以看到 varbinary 使用 a 可以查询到记录但是 binary 使用 a 不能查到，为什么呢？
我们看看他的内部存储
00000089a25f
0000002e0c66bc
0000012a0110

6120 –binary a
612000000000000000 –varbinary a

可以看到 varbinary 使用 8 个 0X00 进行补足，既然他严格按照而进行进行匹配那么我们这样可以
查出数据：

mysql select * from testbin3 where name2= a \0\0\0\0\0\0\0\0
+——-+————+
| name1 | name2 |
+——-+————+
| a | a |
+——-+————+
1 row in set (0.00 sec)
当然 unique 也是严格按照而进行进行比较
增加一个 unique key 在 binary 上
mysql alter table testbin3 add unique key(name2);
mysql insert into testbin3 values(a , a \0\0\0\0\0\0\0\0
ERROR 1062 (23000): Duplicate entry a for key name2
可以看到重复的行

关于 MYSQL 的 CHAR 和 VARCHAR 注意事项以及 binary 和 varbinary 存储方式是怎样的问题的解答就分享到这里了，希望以上内容可以对大家有一定的帮助，如果你还有很多疑惑没有解开，可以关注丸趣 TV 行业资讯频道了解更多相关知识。

正文完