mysql物理优化_mysql物理优化器代价模型分析【原创】
1.引言
mysql的sql server在根據where condition檢索數據的時候,一般會有多種數據檢索的方法,其會根據各種數據檢索方法代價的大小,選擇代價最小的那個數據檢索方法。
比如說這個語句,where col1=x and col2=y and col3 >z ,同時存在inx_col1,inx_col2,inx_col3,inx_col1_col2_col3這四個索引,sql server要解決的問題有1)選擇哪個索引、2)是索引range掃描還是ref掃描、3)table scan的方式是否可行。
mysql會根據以下幾種數據檢索策略選擇代價最小的策略來從數據表中獲取數據,1)各個索引的range scan代價2)各個索引的ref scan代價3)table scan的代價。如何計算這些代價,是本文詳細說明的重點。
總代價cost = cpu cost + io cost。
2 .代價因子
mysql的代價因子在內存中有一份副本,由Server_cost_constants和SE_cost_constants兩個類組成。這兩個類的具體數據成員如下。
Mysql Server代價因子
Server_cost_constants {
m_row_evaluate_cost//行記錄條件謂詞評估代價
m_key_compare_cost //鍵值比較代價
m_memory_temptable_create_cost //內存臨時表創建代價
m_memory_temptable_row_cost //內存臨時表的行代價
m_disk_temptable_create_cost //磁盤臨時表創建代價
m_disk_temptable_row_cost
}
存儲引擎代價因子
SE_cost_constants{
m_memory_block_read_cost//從buffer pool中讀取一個頁面的代價
m_io_block_read_cost //從文件系統中讀取一個頁面的代價,buffer miss的場景
m_memory_block_read_cost_default
m_io_block_read_cost_default
}
mysql的代價因子在系統的持久化系統表中也有一份副本,對應mysql.server_cost和mysql.engine_cost兩個表,這兩個表中的字段與 內存中的類字段相同。DBA可以根據實際的硬件情況測試,測試出最適合的代價因子,然后update系統表中對應的字段。再然后執行flush OPTIMIZER_COSTS命令,將修改反應到內存中數據,這樣新連接上來的mysql session會讀取到內存中數據,然后以新的代價因子計算代價數。
代價因子如何根據實際的硬件環境與負載壓力自適應地調整,是一個重要的研究課題。
3 .統計信息
sql server需要的統計信息是由存儲引擎innodb提供的,調用innodb提供的api可以獲取這些統計信息,本文的后半部分會羅列這些api。innodb的統計信息根據需要可以持久化到系統表中。mysql.innodb_table_stats和mysql.innodb_index_stats存儲了表的統計信息和索引的統計信息。
mysql.innodb_table_stats表中字段說明
database_name 庫名
table_name 表名
n_rows 表中的數據行數
clustered_index_size 聚集索引的頁面數
sum_of_other_index_sizes 其他非主鍵索引的頁面數
last_update 最后更新這張表的時間
mysql.innodb_index_stats表中字段說明
database_name 庫名
table_name 表名
index_name 索引名
stat_name 統計項名稱
stat_value 統計項值
sample_size 采樣的頁面數
last_update 最后更新這張表的時間
其中stat_name 統計項名稱包括:
n_diff_pfxNN 為不同前綴列的cardinality,即不同前綴字段的 distinct value個數
n_leaf_page 索引葉子節點頁面數目
size 索引頁面數目
4.代價的計算公式
cpu代價計算
double row_evaluate_cost(doublerows)
{return rows * m_server_cost_constants->row_evaluate_cost();
}
table scan IO代價計算
Cost_estimate handler::table_scan_cost()
{double io_cost= scan_time() * table->cost_model()->page_read_cost(1.0);
}
ref and range scan IO代價計算
聚集索引掃描IO代價計算公式
Cost_estimate handler::read_cost(uint index, double ranges, doublerows)
{double io_cost= read_time(index, static_cast(ranges),
static_cast(rows)) *table->cost_model()->page_read_cost(1.0);
}
二級索引覆蓋掃描(不需要回表)IO代價計算公式
Cost_estimate handler::index_scan_cost(uint index, double ranges, doublerows)
{double io_cost= index_only_read_time(index, rows) *table->cost_model()->page_read_cost_index(index, 1.0);
}
二級索引非覆蓋掃描(需要回表)IO代價計算公式
min( table→cost_model()→page_read_cost(tmp_fanout), tab→worst_seeks )
估算讀取pages個聚集索引頁面所花費的代價,page數乘以代價因子
double Cost_model_table::page_read_cost(double pages)
估算讀取pages個指定index索引頁面所花費的代價數。
double Cost_model_table::page_read_cost_index(uint index, double pages)
5. innodb統計信息api
全表掃描聚集索引時,聚集索引(主鍵)占用的所有頁面數
double ha_innobase::scan_time()
估算在聚集索引上,掃描rows條記錄,需要讀取的頁面數
double ha_innobase::read_time(uint index, double ranges, double rows)
估算在指定keynr索引進行覆蓋掃描(不需要回表),掃描records條記錄,需要讀取的索引頁面數
double handler::index_only_read_time(uint keynr, double records)
估算指定keynr索引在范圍(min_key,max_key)中的記錄數量
ha_innobase::records_in_range(uint keynr, /*!< in: index number*/key_range*min_key, /*!< in: start key value of the
key_range *max_key) /*!< in: range end key val, may
)
估算聚集索引內存中頁面數占其所有頁面數的比率
double handler::table_in_memory_estimate()
估算二級索引內存中頁面數占其所有頁面數的比率
double handler::index_in_memory_estimate(uint keyno)
6.開啟優化器跟蹤
set session optimizer_trace="enabled=on";
explain your sqlselect * from information_schema.optimizer_trace;
7.優化器跟蹤示例
"rows_estimation": [
{"table": "`tab`","range_analysis": {"table_scan": {"rows": 5,"cost": 4.1},"potential_range_indexes": [
{"index": "PRIMARY","usable": false,"cause": "not_applicable"},
{"index": "inx_clo2","usable": true,"key_parts": ["clo2","clo1"]
},
{"index": "inx_clo3","usable": true,"key_parts": ["clo3","clo1"]
},
{"index": "inx_clo2_clo3","usable": true,"key_parts": ["clo2","clo3","clo1"]
}
],"best_covering_index_scan": {"index": "inx_clo2_clo3","cost": 2.0606,"chosen": true},"setup_range_conditions": [
],"group_index_range": {"chosen": false,"cause": "not_group_by_or_distinct"},"analyzing_range_alternatives": {"range_scan_alternatives": [
{"index": "inx_clo2","ranges": ["hu <= clo2 <= hu"],"index_dives_for_eq_ranges": true,"rowid_ordered": true,"using_mrr": false,"index_only": false,"rows": 2,"cost": 3.41,"chosen": false,"cause": "cost"},
{"index": "inx_clo3","ranges": ["huan <= clo3 <= huan"],"index_dives_for_eq_ranges": true,"rowid_ordered": true,"using_mrr": false,"index_only": false,"rows": 1,"cost": 2.21,"chosen": false,"cause": "cost"},
{"index": "inx_clo2_clo3","ranges": ["hu <= clo2 <= hu AND huan <= clo3 <= huan"],"index_dives_for_eq_ranges": true,"rowid_ordered": true,"using_mrr": false,"index_only": true,"rows": 1,"cost": 1.21,"chosen": true}
],"analyzing_roworder_intersect": {"intersecting_indexes": [
{"index": "inx_clo2_clo3","index_scan_cost": 1,"cumulated_index_scan_cost": 1,"disk_sweep_cost": 0,"cumulated_total_cost": 1,"usable": true,"matching_rows_now": 1,"isect_covering_with_this_index": true,"chosen": true}
],"clustered_pk": {"clustered_pk_added_to_intersect": false,"cause": "no_clustered_pk_index"},"chosen": false,"cause": "too_few_indexes_to_merge"}
},"chosen_range_access_summary": {"range_access_plan": {"type": "range_scan","index": "inx_clo2_clo3","rows": 1,"ranges": ["hu <= clo2 <= hu AND huan <= clo3 <= huan"]
},"rows_for_plan": 1,"cost_for_plan": 1.21,"chosen": true}
}
}
]
},
{"considered_execution_plans": [
{"plan_prefix": [
],"table": "`tab`","best_access_path": {"considered_access_paths": [
{"access_type": "ref","index": "inx_clo2","rows": 2,"cost": 2.4,"chosen": true},
{"access_type": "ref","index": "inx_clo3","rows": 1,"cost": 1.2,"chosen": true},
{"access_type": "ref","index": "inx_clo2_clo3","rows": 1,"cost": 1.2,"chosen": false},
{"rows_to_scan": 1,"access_type": "range","range_details": {"used_index": "inx_clo2_clo3"},"resulting_rows": 1,"cost": 1.41,"chosen": false}
]
},"condition_filtering_pct": 40,"rows_for_plan": 0.4,"cost_for_plan": 1.2,"chosen": true}
]
},
{"attaching_conditions_to_tables": {"original_condition": "((`tab`.`clo2` = 'hu') and (`tab`.`clo3` = 'huan'))","attached_conditions_computation": [
],"attached_conditions_summary": [
{"table": "`tab`","attached": "(`tab`.`clo2` = 'hu')"}
]
}
},
{"refine_plan": [
{"table": "`tab`"}
]
}
]
View Code
總結
以上是生活随笔為你收集整理的mysql物理优化_mysql物理优化器代价模型分析【原创】的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: mysql从入门到转行图片_数据小白转行
- 下一篇: scrapy 安装_安装scrapy时出