當前位置：首頁 > 运维知识 > 数据库 >内容正文

数据库

MySQL调优（三）：索引基本实现原理及索引优化，哈希索引 / 组合索引 / 簇族索引等

發布時間：2024/2/28 数据库 37 豆豆

生活随笔收集整理的這篇文章主要介紹了 MySQL调优（三）：索引基本实现原理及索引优化，哈希索引 / 组合索引 / 簇族索引等小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

索引基本知識

索引匹配方式

哈希索引

當需要存儲大量的URL，并且根據URL進行搜索查找，如果使用B+樹，存儲的內容就會很大
select id from url where url=""
也可以利用將url使用CRC32做哈希，可以使用以下查詢方式：
select id fom url where url="" and url_crc=CRC32("")
此查詢性能較高原因是使用體積很小的索引來完成查找

組合索引

當包含多個列作為索引，需要注意的是正確的順序依賴于該索引的查詢，同時需要考慮如何更好的滿足排序和分組的需要
案例，建立組合索引a,b,c，不同SQL語句使用索引情況：
where a=3 and b>10 and c=7 只使用索引a，b，因為b是范圍查找，范圍查找的后面就無法做精確匹配了，所以無論后面的c是否加過索引，都不走索引了。
where a=3 and b=10 and c=7 ，就可以走索引a，b，c了。

建立索引的時候，盡量選擇長度較小的列，占用存儲空間較小。

簇族索引、非簇族索引

innodb是簇族索引
myisam是非簇族索引
聚簇索引，是為了減少對磁盤的IO。
為什么按照主鍵的順序插入是最快的方式：因為如果不按順序插入/刪除，會產生過多的頁分裂/合并，影響效率。（類似于大數據的region分割）

覆蓋索引

1、當發起一個被索引覆蓋的查詢時，在explain的extra列可以看到using index的信息，此時就使用了覆蓋索引

mysql> explain select store_id,film_id from inventory\G *************************** 1. row ***************************id: 1select_type: SIMPLEtable: inventorypartitions: NULLtype: index possible_keys: NULLkey: idx_store_id_film_idkey_len: 3ref: NULLrows: 4581filtered: 100.00Extra: Using index 1 row in set, 1 warning (0.01 sec)

2、在大多數存儲引擎中，覆蓋索引只能覆蓋那些只訪問索引中部分列的查詢。不過，可以進一步的進行優化，可以使用innodb的二級索引來覆蓋查詢。

例如：actor使用innodb存儲引擎，并在last_name字段又二級索引，雖然該索引的列不包括主鍵actor_id，但也能夠用于對actor_id做覆蓋查詢

mysql> explain select actor_id,last_name from actor where last_name='HOPPER'\G *************************** 1. row ***************************id: 1select_type: SIMPLEtable: actorpartitions: NULLtype: ref possible_keys: idx_actor_last_namekey: idx_actor_last_namekey_len: 137ref: constrows: 2filtered: 100.00Extra: Using index 1 row in set, 1 warning (0.00 sec)

示例：出現了覆蓋索引，顯示using index

優化小細節

避免where id+1=5這種查詢，應該直接使用where id=4

type列的好壞排序：
system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

前綴索引實例說明

? 有時候需要索引很長的字符串，這會讓索引變的大且慢，通常情況下可以使用某個列開始的部分字符串，這樣大大的節約索引空間，從而提高索引效率，但這會降低索引的選擇性。

索引的選擇性是指不重復的索引值和數據表記錄總數的比值，范圍從1/#T到1之間。索引的選擇性越高則查詢效率越高，因為選擇性更高的索引可以讓mysql在查找的時候過濾掉更多的行。

? 一般情況下某個列前綴的選擇性也是足夠高的，足以滿足查詢的性能，但是對應BLOB,TEXT,VARCHAR類型的列，必須要使用前綴索引，因為mysql不允許索引這些列的完整長度，使用該方法的訣竅在于要選擇足夠長的前綴以保證較高的選擇性，通過又不能太長。

案例演示：

--創建數據表 create table citydemo(city varchar(50) not null); insert into citydemo(city) select city from city;--重復執行5次下面的sql語句 insert into citydemo(city) select city from citydemo;--更新城市表的名稱 update citydemo set city=(select city from city order by rand() limit 1);--查找最常見的城市列表，發現每個值都出現45-65次， select count(*) as cnt,city from citydemo group by city order by cnt desc limit 10;--查找最頻繁出現的城市前綴，先從3個前綴字母開始，發現比原來出現的次數更多，可以分別截取多個字符查看城市出現的次數 select count(*) as cnt,left(city,3) as pref from citydemo group by pref order by cnt desc limit 10; select count(*) as cnt,left(city,7) as pref from citydemo group by pref order by cnt desc limit 10; --此時前綴的選擇性接近于完整列的選擇性--還可以通過另外一種方式來計算完整列的選擇性，可以看到當前綴長度到達7之后，再增加前綴長度，選擇性提升的幅度已經很小了 select count(distinct left(city,3))/count(*) as sel3, count(distinct left(city,4))/count(*) as sel4, count(distinct left(city,5))/count(*) as sel5, count(distinct left(city,6))/count(*) as sel6, count(distinct left(city,7))/count(*) as sel7, count(distinct left(city,8))/count(*) as sel8 from citydemo;--計算完成之后可以創建前綴索引，只取前7個字節創建索引，節省索引的存儲空間 alter table citydemo add key(city(7));--注意：前綴索引是一種能使索引更小更快的有效方法，但是也包含缺點：mysql無法使用前綴索引做order by 和 group by。

OLTP/OLAP
基數cardinality統計
Hyperloglog算法
基數越小，關聯的時候效率越高

使用索引掃描來做排序

? mysql有兩種方式可以生成有序的結果：通過排序操作或者按索引順序掃描，如果explain出來的type列的值為index,則說明mysql使用了索引掃描來做排序

? 掃描索引本身是很快的，因為只需要從一條索引記錄移動到緊接著的下一條記錄。但如果索引不能覆蓋查詢所需的全部列，那么就不得不每掃描一條索引記錄就得回表查詢一次對應的行，這基本都是隨機IO，因此按索引順序讀取數據的速度通常要比順序地全表掃描慢

? mysql可以使用同一個索引即滿足排序，又用于查找行，如果可能的話，設計索引時應該盡可能地同時滿足這兩種任務。

? 只有當索引的列順序和order by子句的順序完全一致，并且所有列的排序方式都一樣時，mysql才能夠使用索引來對結果進行排序，如果查詢需要關聯多張表，則只有當orderby子句引用的字段全部為第一張表時，才能使用索引做排序。order by子句和查找型查詢的限制是一樣的，需要滿足索引的最左前綴的要求，否則，mysql都需要執行順序操作，而無法利用索引排序

--sakila數據庫中rental表在rental_date,inventory_id,customer_id上有rental_date的索引 --使用rental_date索引為下面的查詢做排序 explain select rental_id,staff_id from rental where rental_date='2005-05-25' order by inventory_id,customer_id\G *************************** 1. row ***************************id: 1select_type: SIMPLEtable: rentalpartitions: NULLtype: ref possible_keys: rental_datekey: rental_datekey_len: 5ref: constrows: 1filtered: 100.00Extra: Using index condition 1 row in set, 1 warning (0.00 sec) --order by子句不滿足索引的最左前綴的要求，也可以用于查詢排序，這是因為所以你的第一列被指定為一個常數--該查詢為索引的第一列提供了常量條件，而使用第二列進行排序，將兩個列組合在一起，就形成了索引的最左前綴 explain select rental_id,staff_id from rental where rental_date='2005-05-25' order by inventory_id desc\G *************************** 1. row ***************************id: 1select_type: SIMPLEtable: rentalpartitions: NULLtype: ref possible_keys: rental_datekey: rental_datekey_len: 5ref: constrows: 1filtered: 100.00Extra: Using where 1 row in set, 1 warning (0.00 sec)--下面的查詢不會利用索引 explain select rental_id,staff_id from rental where rental_date>'2005-05-25' order by rental_date,inventory_id\G *************************** 1. row ***************************id: 1select_type: SIMPLEtable: rentalpartitions: NULLtype: ALL possible_keys: rental_datekey: NULLkey_len: NULLref: NULLrows: 16005filtered: 50.00Extra: Using where; Using filesort--該查詢使用了兩中不同的排序方向，但是索引列都是正序排序的，在排序列能組成最左前綴匹配的情況下，可以全部正序，可以全部逆序，都能走索引，但是不能一個正序，一個逆序。 explain select rental_id,staff_id from rental where rental_date>'2005-05-25' order by inventory_id desc,customer_id asc\G *************************** 1. row ***************************id: 1select_type: SIMPLEtable: rentalpartitions: NULLtype: ALL possible_keys: rental_datekey: NULLkey_len: NULLref: NULLrows: 16005filtered: 50.00Extra: Using where; Using filesort 1 row in set, 1 warning (0.00 sec)--該查詢中引用了一個不再索引中的列 explain select rental_id,staff_id from rental where rental_date>'2005-05-25' order by inventory_id,staff_id\G *************************** 1. row ***************************id: 1select_type: SIMPLEtable: rentalpartitions: NULLtype: ALL possible_keys: rental_datekey: NULLkey_len: NULLref: NULLrows: 16005filtered: 50.00Extra: Using where; Using filesort 1 row in set, 1 warning (0.00 sec)

索引監控

索引優化分析案例

預先準備好數據

SET FOREIGN_KEY_CHECKS=0; DROP TABLE IF EXISTS `itdragon_order_list`; CREATE TABLE `itdragon_order_list` (`id` bigint(11) NOT NULL AUTO_INCREMENT COMMENT '主鍵id，默認自增長',`transaction_id` varchar(150) DEFAULT NULL COMMENT '交易號',`gross` double DEFAULT NULL COMMENT '毛收入(RMB)',`net` double DEFAULT NULL COMMENT '凈收入(RMB)',`stock_id` int(11) DEFAULT NULL COMMENT '發貨倉庫',`order_status` int(11) DEFAULT NULL COMMENT '訂單狀態',`descript` varchar(255) DEFAULT NULL COMMENT '客服備注',`finance_descript` varchar(255) DEFAULT NULL COMMENT '財務備注',`create_type` varchar(100) DEFAULT NULL COMMENT '創建類型',`order_level` int(11) DEFAULT NULL COMMENT '訂單級別',`input_user` varchar(20) DEFAULT NULL COMMENT '錄入人',`input_date` varchar(20) DEFAULT NULL COMMENT '錄入時間',PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=10003 DEFAULT CHARSET=utf8;INSERT INTO itdragon_order_list VALUES ('10000', '81X97310V32236260E', '6.6', '6.13', '1', '10', 'ok', 'ok', 'auto', '1', 'itdragon', '2017-08-28 17:01:49'); INSERT INTO itdragon_order_list VALUES ('10001', '61525478BB371361Q', '18.88', '18.79', '1', '10', 'ok', 'ok', 'auto', '1', 'itdragon', '2017-08-18 17:01:50'); INSERT INTO itdragon_order_list VALUES ('10002', '5RT64180WE555861V', '20.18', '20.17', '1', '10', 'ok', 'ok', 'auto', '1', 'itdragon', '2017-09-08 17:01:49');

逐步開始進行優化：

第一個案例：

select * from itdragon_order_list where transaction_id = "81X97310V32236260E"; --通過查看執行計劃發現type=all,需要進行全表掃描 explain select * from itdragon_order_list where transaction_id = "81X97310V32236260E";--優化一、為transaction_id創建唯一索引create unique index idx_order_transaID on itdragon_order_list (transaction_id); --當創建索引之后，唯一索引對應的type是const，通過索引一次就可以找到結果，普通索引對應的type是ref，表示非唯一性索引賽秒，找到值還要進行掃描，直到將索引文件掃描完為止，顯而易見，const的性能要高于refexplain select * from itdragon_order_list where transaction_id = "81X97310V32236260E";--優化二、使用覆蓋索引，查詢的結果變成 transaction_id,當extra出現using index,表示使用了覆蓋索引explain select transaction_id from itdragon_order_list where transaction_id = "81X97310V32236260E";

第二個案例

--創建復合索引 create index idx_order_levelDate on itdragon_order_list (order_level,input_date);--創建索引之后發現跟沒有創建索引一樣，都是全表掃描，都是文件排序 explain select * from itdragon_order_list order by order_level,input_date;--可以使用force index強制指定索引 explain select * from itdragon_order_list force index(idx_order_levelDate) order by order_level,input_date; --其實給訂單排序意義不大，給訂單級別添加索引意義也不大，因此可以先確定order_level的值，然后再給input_date排序 explain select * from itdragon_order_list where order_level=3 order by input_date; 超強干貨來襲云風專訪：近40年碼齡，通宵達旦的技術人生

總結

以上是生活随笔為你收集整理的MySQL调优（三）：索引基本实现原理及索引优化，哈希索引 / 组合索引 / 簇族索引等的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇： Redis实战（三）：Redis的Lis
下一篇： JVM从入门到精通（四）：内存屏障与JV