mediumtext 长度_InnoDB表行长度超限问题的处理和思考
InnoDB表行長度超限問題的處理和思考
問題描述:
今天在在做MySQL數據庫字符集升級測試,采用數據導出、修改表結構文件字符集為utf8mb4、再導入數據的方式,在導入表結構時遇到以下錯誤:
ERROR?1118?(42000)?at?line?38436:?Row?size?too?large.?The?maximum?row?size?for?the?used?table?type,?not?counting?BLOBs,?is?65535.?This?includes?storage?overhead,?check?the?manual.?You?have?to?change?some?columns?to?TEXT?or?BLOBs錯誤提示很清楚,行定義的長度超過了InnoDB限制的65535字節,建議把表中的某些列改為TEXT或BLOB。經檢查,這張業務表中定義了多個varchar2(4000)列,在utf8字符集時,長度在65535字節內,但是當我們修改表的字符集為utf8mb4時,一行總的數據長度超過了65535字節,所以報以上錯誤。
一、創建實驗環境
1.配置信息:
CPU:2核8C內存:16G
操作系統:RHEL7.4
磁盤:200G?SATA?XFS
MySQL:5.7.30
2.創建測試表t1,并插入測試數據
create?table?t1(????id?int?unsigned?NOT?NULL?AUTO_INCREMENT,
????col1?varchar(4000),
????col2?varchar(4000),
????col3?varchar(4000),
????col4?varchar(4000),
????col5?varchar(4000),
????PRIMARY?KEY?(id)
)ENGINE=InnoDB?DEFAULT?CHARSET=utf8;
insert?into?t1(col1,col2,col3,col4,col5)?select?...?from?...;
3.修改測試表字符集為utf8mb4
root@mysql3307.sock?23:18:58[wydb]>?ALTER?TABLE?t1?CONVERT?TO?CHARACTER?SET?utf8mb4;ERROR?1118?(42000):?Row?size?too?large.?The?maximum?row?size?for?the?used?table?type,?not?counting?BLOBs,?is?65535.?This?includes?storage?overhead,?check?the?manual.?You?have?to?change?some?columns?to?TEXT?or?BLOBs
這時錯誤重現了,根據提示,我們分別測試把字段修改為TEXT和BLOB的影響。
二、問題解決
1.修改字段類型為TEXT或BLOB
alter?table?t1?modify?col1?text;--?表結構如下:
Create?Table:?CREATE?TABLE?`t1`?(
??`id`?int(10)?unsigned?NOT?NULL?AUTO_INCREMENT,
??`col1`?text,
??`col2`?varchar(4000)?DEFAULT?NULL,
??`col3`?varchar(4000)?DEFAULT?NULL,
??`col4`?varchar(4000)?DEFAULT?NULL,
??`col5`?varchar(4000)?DEFAULT?NULL,
??PRIMARY?KEY?(`id`)
)?ENGINE=InnoDB?AUTO_INCREMENT=1024?DEFAULT?CHARSET=utf8
2.再次修改表字符集
root@mysql3307.sock?23:31:53[wydb]>?ALTER?TABLE?t1?CONVERT?TO?CHARACTER?SET?utf8mb4;Create?Table:?CREATE?TABLE?`t1`?(
??`id`?int(10)?unsigned?NOT?NULL?AUTO_INCREMENT,
??`col1`?mediumtext,
??`col2`?varchar(4000)?DEFAULT?NULL,
??`col3`?varchar(4000)?DEFAULT?NULL,
??`col4`?varchar(4000)?DEFAULT?NULL,
??`col5`?varchar(4000)?DEFAULT?NULL,
??PRIMARY?KEY?(`id`),
??KEY?`col1`?(`col1`(10))
)?ENGINE=InnoDB?AUTO_INCREMENT=1024?DEFAULT?CHARSET=utf8mb4
--?表字符集修改成功,檢查表中列的字符集也修改成功
root@mysql3307.sock?23:35:04[wydb]>?select?COLUMN_NAME,CHARACTER_SET_NAME?from?information_schema.COLUMNS?where?table_name='t1';
+-------------+--------------------+
|?COLUMN_NAME?|?CHARACTER_SET_NAME?|
+-------------+--------------------+
|?id??????????|?NULL???????????????|
|?col1????????|?utf8mb4????????????|
|?col2????????|?utf8mb4????????????|
|?col3????????|?utf8mb4????????????|
|?col4????????|?utf8mb4????????????|
|?col5????????|?utf8mb4????????????|
+-------------+--------------------+
6?rows?in?set?(0.01?sec)
三、修改字段類型對索引影響
通過修改字段類型,解決行長度報錯后,對列上已經創建的索引是否有影響,我們通過實驗給出答案。
1.根據列的選擇性,創建前綴索引
>?select?count(distinct?col1)/count(*)?from?t1;+-------------------------------+
|?count(distinct?col1)/count(*)?|
+-------------------------------+
|????????????????????????0.9298?|
+-------------------------------+
1?row?in?set?(0.01?sec)
>?select?count(distinct?left(col1,10))/count(*)?from?t1;
+----------------------------------------+
|?count(distinct?left(col1,10))/count(*)?|
+----------------------------------------+
|?????????????????????????????????0.9038?|
+----------------------------------------+
1?row?in?set?(0.00?sec)
--?當前綴索引為10個字符時,已經很接近0.9298
>?alter?table?t1?add?key?(col1(10));
2.修改字段類型前SQL執行計劃
root@mysql3307.sock?23:10:19[wydb]>?desc?select?*?from?t1?where?col1?like?'品牌%';+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|?id?|?select_type?|?table?|?partitions?|?type??|?possible_keys?|?key??|?key_len?|?ref??|?rows?|?filtered?|?Extra???????|
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|??1?|?SIMPLE??????|?t1????|?NULL???????|?range?|?col1??????????|?col1?|?33??????|?NULL?|????3?|???100.00?|?Using?where?|
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1?row?in?set,?1?warning?(0.00?sec)
從執行計劃上看到,本次查詢使用了我們創建的前綴索引,key_len的值也沒問題。key_len:10 * 3 + 2(變長) + 1(可為null) = 33
3.修改字段,類型:TEXT,字符集:utf8mb4
再次查看執行計劃
root@mysql3307.sock?23:36:10[wydb]>?desc?select?*?from?t1?where?col1?like?'品牌%';+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|?id?|?select_type?|?table?|?partitions?|?type??|?possible_keys?|?key??|?key_len?|?ref??|?rows?|?filtered?|?Extra???????|
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|??1?|?SIMPLE??????|?t1????|?NULL???????|?range?|?col1??????????|?col1?|?43??????|?NULL?|????3?|???100.00?|?Using?where?|
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1?row?in?set,?1?warning?(0.00?sec)
可見,索引使用正常,由于字符集改為utf8mb4,key_len變成了43
key_len:10 * 4 + 2(變長) + 1(可為null) = 43
4.修改字段,類型:BLOB
再次查看執行計劃
root@mysql3307.sock?23:39:53[wydb]>?alter?table?t1?modify?col1?blob;Query?OK,?655?rows?affected?(0.05?sec)
Records:?655??Duplicates:?0??Warnings:?0
root@mysql3307.sock?23:40:05[wydb]>?desc?select?*?from?t1?where?col1?like?'品牌%';
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|?id?|?select_type?|?table?|?partitions?|?type??|?possible_keys?|?key??|?key_len?|?ref??|?rows?|?filtered?|?Extra???????|
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|??1?|?SIMPLE??????|?t1????|?NULL???????|?range?|?col1??????????|?col1?|?13??????|?NULL?|????3?|???100.00?|?Using?where?|
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1?row?in?set,?1?warning?(0.00?sec)
這時,雖然索引仍能用到,但key_len變成了13
key_len:10 + 2(變長) + 1(可為null) = 13
因為BLOB存儲的是二進制數據,沒有字符集概念,索引col1(10)中10為字節數,如果存儲中文字符,注意對齊,否則可能亂碼。
四、修改字段類型對應用程序的影響
1.以python程序為例,進行驗證,編寫如下驗證程序
#?coding:utf-8import?pymysql
if?__name__?==?'__main__':
????conn?=?pymysql.connect('10.1.135.129',?'wy',?'orange',?'wydb',?3307)
????cur?=?conn.cursor()
????sql?=?"select?id,col1,col2,col3,col4,col5?from?t1?where?col1?like?'品牌%'"
????cur.execute(sql)
????rows?=?cur.fetchall()
????for?row?in?rows:
????????print(row)
????????print(row[1].decode('utf8'))
2.修改列類型前,python程序查詢結果
(208,?'品牌',?'',?'aima',?'',?None)(4,?'品牌型號',?'',?'不銹鋼',?'',?None)
(5,?'品牌型號',?'',?'HK-12345',?'',?None)
3.修改列類型為TEXT,python程序查詢結果
(208,?'品牌',?'',?'aima',?'',?None)(4,?'品牌型號',?'',?'不銹鋼',?'',?None)
(5,?'品牌型號',?'',?'HK-12345',?'',?None)
程序不用任何改動,返回結果正常。
3.修改列類型為BLOB,python程序查詢結果
(208,?b'\xe5\x93\x81\xe7\x89\x8c',?'',?'aima',?'',?None)(4,?b'\xe5\x93\x81\xe7\x89\x8c\xe5\x9e\x8b\xe5\x8f\xb7',?'',?'不銹鋼',?'',?None)
(5,?b'\xe5\x93\x81\xe7\x89\x8c\xe5\x9e\x8b\xe5\x8f\xb7',?'',?'HK-12345',?'',?None)
可見,中文字符返回的是bytes,應用程序需要轉碼,把bytes轉為utf8,python中使用decode('utf8')函數實現。
(208,?b'\xe5\x93\x81\xe7\x89\x8c',?'',?'aima',?'',?None)品牌
(4,?b'\xe5\x93\x81\xe7\x89\x8c\xe5\x9e\x8b\xe5\x8f\xb7',?'',?'不銹鋼',?'',?None)
品牌型號
(5,?b'\xe5\x93\x81\xe7\x89\x8c\xe5\x9e\x8b\xe5\x8f\xb7',?'',?'HK-12345',?'',?None)
品牌型號
五、注意事項
非嚴格模式時,如果存儲超過BLOB或TEXT字段最大長度的值,值被截斷,如果被截掉的是非空格字符,會產生一條警告;在嚴格模式下,直接報錯;
BLOB或TEXT字段的列被檢索時,不刪除尾部空格;
為BLOB或TEXT字段創建索引時,必須指定長度;
BLOB或TEXT字段不能設置默認值;
當排序時只使用該列的前max_sort_length個字節,max_sort_length的默認值是1024。
當你想要使超過max_sort_length的字節有意義,對含長值的BLOB或TEXT列使用GROUP BY或ORDER BY的另一種方式是將列值轉換為固定長度的對象。標準方法是使用SUBSTRING函數。
BLOB或TEXT對象的最大大小由其類型確定,但在客戶端和服務器之間實際可以傳遞的最大值由可用內存數量和通信緩存區大小確定。你可以通過更改max_allowed_packet變量的值更改消息緩存區的大小,但必須同時修改服務器和客戶端程序。
參考文檔:MySQL中TEXT與BLOB字段類型的區別
六、結論
??當我們遇到ERROR 1118 (42000)時,將varchar2修改為TEXT,代價最小,索引使用正常,應用程序無需修改。但是TEXT中仍然存儲的是字符,受字符集和校對規則影響;
??BLOB字段中存儲的是二進制數據,不受字符集影響,可存儲任意類型數據,但是應用程序存取數據時需要轉換編碼,創建索引時指定的長度是字節數;
??具體使用哪種類型,應根據實際業務作合理的選擇。
總結
以上是生活随笔為你收集整理的mediumtext 长度_InnoDB表行长度超限问题的处理和思考的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: python实现快速排序算法_基础算法:
- 下一篇: c++项目源码_C/C++学习日记:用C