High Performance MySQL Notes, Chapter 5 "Indexing for High Performance", 005: Clustered Indexes
I. Introduction to clustered indexes
1. What is a clustered index?
InnoDB’s clustered indexes actually store a B-Tree index and the rows together in the same structure.
2. Why can a table have only one clustered index?
When a table has a clustered index, its rows are actually stored in the index’s leaf pages. The term “clustered” refers to the fact that rows with adjacent key values are stored close to each other. You can have only one clustered index per table, because you can’t store the rows in two places at once. (However, covering indexes let you emulate multiple clustered indexes; more on this later.)
3. Advantages of clustered indexes
• You can keep related data close together. For example, when implementing a mailbox, you can cluster by user_id, so you can retrieve all of a single user’s messages by fetching only a few pages from disk. If you didn’t use clustering, each message might require its own disk I/O.
• Data access is fast. A clustered index holds both the index and the data together in one B-Tree, so retrieving rows from a clustered index is normally faster than a comparable lookup in a nonclustered index.
• Queries that use covering indexes can use the primary key values contained at the leaf node.
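The mailbox example can be made concrete with a small toy model. This is only a sketch under stated assumptions: the page capacity (`ROWS_PER_PAGE`), the row counts, and the helper names `pages_touched` and `build_mailbox` are all hypothetical, and a "page" is simply a run of consecutive rows in storage order, not a real InnoDB page.

```python
import random

ROWS_PER_PAGE = 100  # assumed page capacity for this toy model


def pages_touched(row_keys, wanted_user):
    """Count the distinct pages that hold wanted_user's rows.

    row_keys lists (user_id, message_id) pairs in physical storage order;
    each consecutive run of ROWS_PER_PAGE rows shares one page.
    """
    return len({
        position // ROWS_PER_PAGE
        for position, (user_id, _) in enumerate(row_keys)
        if user_id == wanted_user
    })


def build_mailbox(users=50, messages_per_user=200, seed=1):
    """Return the same rows in arrival order and in clustered order."""
    rows = [(u, m) for u in range(users) for m in range(messages_per_user)]
    arrival = rows[:]
    random.Random(seed).shuffle(arrival)  # messages arrive interleaved
    clustered = sorted(rows)              # ordered by (user_id, message_id)
    return arrival, clustered


arrival, clustered = build_mailbox()
print("clustered by user_id:", pages_touched(clustered, 7), "pages")
print("arrival order:       ", pages_touched(arrival, 7), "pages")
```

With these assumed sizes, one user's 200 messages sit on 2 pages when the rows are clustered by user_id, while in arrival order they are scattered over many more pages, each of which could cost a separate disk read.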
4. Disadvantages of clustered indexes
• Clustering gives the largest improvement for I/O-bound workloads. If the data fits in memory, the order in which it’s accessed doesn’t really matter, so clustering doesn’t give much benefit.
• Insert speeds depend heavily on insertion order. Inserting rows in primary key order is the fastest way to load data into an InnoDB table. It might be a good idea to reorganize the table with OPTIMIZE TABLE after loading a lot of data if you didn’t load the rows in primary key order.
• Updating the clustered index columns is expensive, because it forces InnoDB to move each updated row to a new location.
• Tables built upon clustered indexes are subject to page splits when new rows are inserted, or when a row’s primary key is updated such that the row must be moved. A page split happens when a row’s key value dictates that the row must be placed into a page that is full of data. The storage engine must split the page into two to accommodate the row. Page splits can cause a table to use more space on disk.
• Clustered tables can be slower for full table scans, especially if rows are less densely packed or stored nonsequentially because of page splits.
• Secondary (nonclustered) indexes can be larger than you might expect, because their leaf nodes contain the primary key columns of the referenced rows.
• Secondary index accesses require two index lookups instead of one.
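The point about secondary index size can be illustrated with back-of-the-envelope arithmetic. The sketch below is a rough lower bound only: it ignores per-record and per-page overhead, and the row count and byte widths (a 4-byte integer key versus a 36-byte UUID string key) are assumed values chosen for illustration. The function name `secondary_index_bytes` is hypothetical.

```python
def secondary_index_bytes(rows, key_bytes, primary_key_bytes):
    """Rough lower bound on leaf-level size of a secondary index.

    Each leaf entry is modeled as the indexed column value plus a copy of
    the primary key; record headers and page overhead are ignored.
    """
    return rows * (key_bytes + primary_key_bytes)


ROWS = 1_000_000  # assumed table size

# Secondary index on a 4-byte integer column, with two primary key choices.
with_int_pk = secondary_index_bytes(ROWS, key_bytes=4, primary_key_bytes=4)
with_uuid_pk = secondary_index_bytes(ROWS, key_bytes=4, primary_key_bytes=36)

print(with_int_pk, with_uuid_pk, with_uuid_pk / with_int_pk)
```

Under these assumptions, the same secondary index is five times larger with the wide primary key, and the cost repeats for every secondary index on the table.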
II. Differences between clustered indexes (InnoDB) and nonclustered indexes (MyISAM)
Table structure
CREATE TABLE layout_test ( col1 int NOT NULL, col2 int NOT NULL, PRIMARY KEY(col1), KEY(col2) );
1. MyISAM layout
In fact, in MyISAM, there is no structural difference between a primary key and any other index. A primary key is simply a unique, nonnullable index named PRIMARY.
2. InnoDB layout
At first glance, that might not look very different from Figure 5-5. But look again, and notice that this illustration shows the whole table, not just the index. Because the clustered index “is” the table in InnoDB, there’s no separate row storage as there is for MyISAM.
Each leaf node in the clustered index contains the primary key value, the transaction ID, and rollback pointer InnoDB uses for transactional and MVCC purposes, and the rest of the columns (in this case, col2). If the primary key is on a column prefix, InnoDB includes the full column value with the rest of the columns.
Also in contrast to MyISAM, secondary indexes are very different from clustered indexes in InnoDB. Instead of storing “row pointers,” InnoDB’s secondary index leaf nodes contain the primary key values, which serve as the “pointers” to the rows. This strategy reduces the work needed to maintain secondary indexes when rows move or when there’s a data page split. Using the row’s primary key values as the pointer makes the index larger, but it means InnoDB can move a row without updating pointers to it.
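The two layouts can be contrasted with a toy model of the layout_test table. This is a sketch, not how either engine is implemented: Python dictionaries stand in for B-Trees, a list position stands in for a MyISAM row pointer, and the class names `HeapTable` and `ClusteredTable` are hypothetical.

```python
class HeapTable:
    """MyISAM-style toy: rows live in a separate list; every index,
    including the primary key, stores a row position (a "row pointer")."""

    def __init__(self):
        self.rows = []      # row storage, in insertion order
        self.primary = {}   # col1 -> row position
        self.by_col2 = {}   # col2 -> list of row positions

    def insert(self, col1, col2):
        position = len(self.rows)
        self.rows.append({"col1": col1, "col2": col2})
        self.primary[col1] = position
        self.by_col2.setdefault(col2, []).append(position)

    def find_by_col2(self, col2):
        # One index lookup, then a direct jump to each row position.
        return [self.rows[p] for p in self.by_col2.get(col2, [])]


class ClusteredTable:
    """InnoDB-style toy: the primary key index holds the rows themselves;
    the secondary index stores primary key values, not positions."""

    def __init__(self):
        self.clustered = {}  # col1 (primary key) -> full row
        self.by_col2 = {}    # col2 -> set of primary key values

    def insert(self, col1, col2):
        self.clustered[col1] = {"col1": col1, "col2": col2}
        self.by_col2.setdefault(col2, set()).add(col1)

    def find_by_col2(self, col2):
        # Lookup 1: the secondary index yields primary key values.
        primary_keys = self.by_col2.get(col2, set())
        # Lookup 2: each primary key is resolved in the clustered index.
        return [self.clustered[pk] for pk in sorted(primary_keys)]


for table in (HeapTable(), ClusteredTable()):
    table.insert(1, 10)
    table.insert(2, 10)
    table.insert(3, 20)
    print(type(table).__name__, table.find_by_col2(10))
```

In the clustered model nothing in `by_col2` refers to a physical location, which mirrors why InnoDB can relocate a row without touching its secondary indexes; the price is the second lookup in `find_by_col2`.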
III. How sequential versus nonsequential primary keys affect a clustered index
1.
Notice that not only does it take longer to insert the rows with the UUID primary key, but the resulting indexes are quite a bit bigger. Some of that is due to the larger primary key, but some of it is undoubtedly due to page splits and resultant fragmentation as well.
2. Why does it matter whether the primary key is sequential?
Inserting sequential primary keys
Inserting nonsequential primary keys
Drawbacks of inserting nonsequential primary keys:
• The destination page might have been flushed to disk and removed from the caches, or might not have ever been placed into the caches, in which case InnoDB will have to find it and read it from the disk before it can insert the new row. This causes a lot of random I/O.
• When insertions are done out of order, InnoDB has to split pages frequently to make room for new rows. This requires moving around a lot of data, and modifying at least three pages instead of one.
• Pages become sparsely and irregularly filled because of splitting, so the final data is fragmented.
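The effect of insertion order on page splits and page fill can be reproduced with a small simulation. It is a simplified sketch, not InnoDB's actual algorithm: the page capacity (`PAGE_CAPACITY`), the rule of opening a fresh page when a key is appended past the end of the last full page, the split-in-half policy, and the function name `load` are all assumptions made for illustration. Keys are assumed to be unique.

```python
import bisect
import random

PAGE_CAPACITY = 16  # rows per page; assumed, far smaller than a real page


def load(keys):
    """Insert unique keys one by one; return (page_splits, average_fill)."""
    pages = [[]]  # each page is a sorted list of keys, pages are in key order
    splits = 0
    for key in keys:
        # Locate the page whose key range should contain this key.
        first_keys = [page[0] for page in pages[1:]]
        index = bisect.bisect_right(first_keys, key)
        page = pages[index]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
            continue
        if index == len(pages) - 1 and key > page[-1]:
            # Appending past the end of the last page: open a fresh page.
            pages.append([key])
            continue
        # The page is full and the key belongs inside it: split it in half.
        splits += 1
        bisect.insort(page, key)
        middle = len(page) // 2
        pages[index:index + 1] = [page[:middle], page[middle:]]
    fill = sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)
    return splits, fill


sequential = list(range(2000))
shuffled = sequential[:]
random.Random(42).shuffle(shuffled)  # stands in for UUID-like random keys

print("sequential:", load(sequential))
print("random:    ", load(shuffled))
```

In this model, sequential keys cause no splits and leave every page full, while the shuffled keys trigger many splits and leave pages only partly filled, so the same rows occupy more pages.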