mysql online ddl和pt_online ddl与pt-osc详解
Ⅰ、背景
優化sql的過程中發現表上少一個索引,直接加一個?會不會hang住?不加?sql又跑不好,由此引出一個問題——ddl操作怎么做?
Ⅱ、閑扯三兩句
5.6版本之前的MySQL創建索引不支持online,會對表加一個讀鎖(S lock),只能select,insert會阻塞,5.6開始,MySQL原生支持了在線索引添加,在添加索引過程中,應用程序對表依然可讀可寫
online ddl的這段時間內,對表做的操作會先記錄到alter table的日志里,這個日志是內存的,如果內存大小太小記不下來就會報錯
show variables like 'innodb%max%';
innodb_online_alter_log_max_size | 134217728
如果線上更新操作比較多,調大這個值 set global innodb_online_alter_log_max_size = 128M,這是個全局變量,在my.cnf中也配上
Ⅲ、老式DDL
3.1 鎖模式與算法解析
1、兩個參數:lock和algorithm
鎖模式:
模式含義
default
根據事務最大并發判斷用什么模式
none
不加任何鎖,不阻塞
shared
共享模式,和5.1的fast index creation一樣,可讀,但不支持dml
exclusive
排他模式,任何操作都不支持
算法:
算法含義
default
根據old_alter_table決定用哪個算法,off為用新算法,即inplace
inplace
共享鎖,只支持增加和刪除索引兩種操作
copy
需要拷貝數據,效率低
不管用什么模式,online ddl開始之前都會有一個短時間的排他鎖,結束之前也一樣,所以說,操作之前需要確保沒有大事務執行,否則會出現嚴重阻塞
2、兩種算法添加索引步驟對比(5.5版本)
-copyinplace
1
新建帶索引的臨時表
創建索引數據字典(只能是二級索引,如果是主鍵指定inplace也會轉為copy)
2
鎖原表,禁止DML,允許查詢
加共享鎖,禁止DML,允許查詢
3
將原表數據拷貝到臨時表
讀取聚簇索引,構造新的索引項,排序并插入新索引
4
升級shared鎖為exclusive,禁止讀寫,做rename(修改數據字典,很快)
等待打開當前表的所有只讀事務提交
5
完成創建索引操作
創建索引結束
3、語法:
alter table tb_name ...
lock = xxx,algorithm = xxx
注意:多個ddl操作建議放到一條語句種執行,效率比分開執行高
tips:
以上分析是針對5.5及之前的情況,即那時候只有增加、刪除索引不需要拷貝原表,但也不能操作DML
Ⅳ、現代online ddl的分類與實現細節
4.1 分類
online ddl包含copy和inplace兩種
修改列類型和刪除主鍵用copy
inplace又分為rebuild和no-rebuild兩種
rebuild需要重建表,修改記錄格式,添加、刪除列、修改默認值都用rebuild
no-rebuild只需要修改元數據,添加、刪除索引、修改列名則用no-rebuild
rebuild方式比no-rebuild方式實質多了一個ddl執行階段
4.2 實現細節(三階段)
先檢測一些命名、長度等限制
-prepareddlcommit
1
server層創建臨時frm
降級exclusive-mdl鎖,允許讀寫(copy不可寫)
升級exclusive-mdl鎖,禁止讀寫
2
持有exclusive-mdl鎖,禁止讀寫
掃描原表的聚簇索引每條記錄
應用最后row_log種產生的日志
3
根據alter類型,確定執行方式(copy,inplace-rebuild,inplace-norebuild)
遍歷新表的聚簇索引和二級索引
更新innodb的數據字典
4
更新數據字典的內存對象
根據記錄構造對應的索引項
提交事務(刷新事務的redo日志)
5
分配row_log對象記錄增量
將索引項插入sort_buffer塊
修改統計信息
6
innodb層生成臨時ibd文件(rebuild情況下)
將sort_buffer塊插入新的索引
rename臨時idb、frm文件
7
數據字典上提交事務、釋放鎖
處理ddl執行過程種產生的增量(rebuild情況下)
變更完成
4.3 注意三個參數
參數-
old_alter_table
默認off即用inplace模式
tmpdir
創建索引時排序的內存不夠則在此目錄做
innodb_online_alter_log_max_size
存row_log
tips:
②online ddl中inplace是優選項,ALGORITHM=COPY定會拷貝表,只讀,但ALGORITHM=INPLACE也可能拷貝表,但可以并發DML(因為有row_log)
③5.6依然不支持online的ddl操作:修改列的數據類型,刪除主鍵,變更表字符集
④inplace對dml的支持比較好,但消耗卻比copy大
online ddl關鍵點小結
①數據完整性--->row_log
②online和數據一致性--->propare和commit時短暫mdl,幾乎全程online
③server和innodb一致--->prepare時server生成frm,innodb生成臨時ibd,ddl時原表拷貝到ibd,row_log應用到ibd,commit時innodb修改數據字典,提交,最后innodb和server重命名ibd和rfm
Ⅴ、pt-osc
5.1 為什么要用pt-osc
問題:
在線索引添加存在的一個問題——主從延時(MySQL邏輯復制,oracle物理復制不存在這個問題)
原因:
alter table是執行完之后才告訴從機要執行(事務),從庫再順序執行。
如果是copy的那種online ddl,執行到這個ddl,其他并行的dml語句則要等待這個ddl執行完畢后才能繼續(看上文原理),如下圖:
主從延遲的產生:
+------------------------+
| master | o_ddl_5min
+------------------------+
| |
|log| 同步的是二進制日志,要等事務執行完之后才提交過去,和物理日志不同
| |
+------------------------+
| slave | o_ddl_5min
+------------------------+
因此,即使5.7現在對越來越多的ddl操作讀寫不阻塞了,真正在線上也很少用alter table這種方式去執行ddl操作
目前我們常用的一個工具是pt-osc
這個工具做在線ddl,主從延遲非常小,它不是直接操作的,是通過觸發器的機制來慢慢做,還有專門控制延遲的參數
5.2 安裝與操作演示
yum install -y perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-Time-HiRes perl-DBI perl-DBD-MySQL
cd /usr/local/src
wget https://www.percona.com/downloads/percona-toolkit/3.0.4/binary/tarball/percona-toolkit-3.0.4_x86_64.tar.gz
tar zxvf percona-toolkit-3.0.4_x86_64.tar.gz
cd percona-toolkit-3.0.4
perl Makefile.PL
make
make install
pt-online-schema-change --alter "convert to character set utf8b4" D=test,t=a
顯示操作步驟,真正執行要加 --excute
pt-online-schema-change --alter "alter table add index index_a (a)" D=test,t=a --excute
整個過程拆成很多小的步驟,一個一個傳到從上去,所以延遲比較小,缺點是時間長
tips:
percona toolkit中最有用的就是pt-online-schema-change,其他工具官方工具包utlities里面都有了,盡量用官方的,另外官方也在做osc了
5.3 原理淺析
方案:
步驟操作
step1
sysbench導入測試數據到test庫sbtest1中
step2
開啟general_log,并輸出到mysql.general_log表
step3
osc給sbtest1表的c字段加一個索引(可以把execute換做--dry-run)
step4
分析glog
step1:略
step2:
(root@localhost) [(none)]> truncate mysql.general_log;
Query OK, 0 rows affected (1.65 sec)
(root@localhost) [(none)]> set global general_log = 1;
Query OK, 0 rows affected (0.00 sec)
(root@localhost) [(none)]> set global log_output = 'table';
Query OK, 0 rows affected (0.01 sec)
step 3:
pt-online-schema-change --alter "add index index_c (c)" --socket=/tmp/mysql.sock --user=root --password=123 D=test,t=sbtest1 --execute
No slaves found. See --recursion-method if host VM_221_162_centos has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `test`.`sbtest1`...
Creating new table...
Created new table test._sbtest1_new OK.
Altering new table...
Altered `test`.`_sbtest1_new` OK.
2017-11-30T18:28:19 Creating triggers...
2017-11-30T18:28:19 Created triggers OK.
2017-11-30T18:28:19 Copying approximately 493200 rows...
2017-11-30T18:28:41 Copied rows OK.
2017-11-30T18:28:41 Analyzing new table...
2017-11-30T18:28:41 Swapping tables...
2017-11-30T18:28:41 Swapped original and new tables OK.
2017-11-30T18:28:41 Dropping old table...
2017-11-30T18:28:41 Dropped old table `test`.`_sbtest1_old` OK.
2017-11-30T18:28:41 Dropping triggers...
2017-11-30T18:28:41 Dropped triggers OK.
Successfully altered `test`.`sbtest1`.
上面已經可以看出個大概過程了
step 4:
這一步詳細分5塊分析如下:
(root@localhost) [(none)]> set global log_output = 'file';
Query OK, 0 rows affected (0.00 sec)
(root@localhost) [(none)]> set global general_log = 0;
Query OK, 0 rows affected (0.01 sec)
(root@localhost) [mysql]> select argument from mysql.general_log;
root@localhost on test using Socket
set autocommit=1
SHOW VARIABLES LIKE 'innodb\_lock_wait_timeout'
SET SESSION innodb_lock_wait_timeout=1
SHOW VARIABLES LIKE 'lock\_wait_timeout'
SET SESSION lock_wait_timeout=60
SHOW VARIABLES LIKE 'wait\_timeout'
SET SESSION wait_timeout=10000
SELECT @@SQL_MODE
SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO,ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'*/
SELECT @@server_id /*!50038 , @@hostname*/
說明:
1、設置session級的變量
SET SESSION innodb_lock_wait_timeout=1
SET SESSION lock_wait_timeout=60
SET SESSION wait_timeout=10000
-----------------------------------------
SHOW VARIABLES LIKE 'version%'
SHOW ENGINES
SHOW VARIABLES LIKE 'innodb_version'
SHOW VARIABLES LIKE 'innodb_stats_persistent'
SELECT @@SERVER_ID
SHOW GRANTS FOR CURRENT_USER()
SHOW FULL PROCESSLIST
SHOW SLAVE HOSTS
SHOW GLOBAL STATUS LIKE 'Threads_running'
SHOW GLOBAL STATUS LIKE 'Threads_running'
SELECT CONCAT(@@hostname, @@port)
SHOW TABLES FROM `test` LIKE 'sbtest1'
SELECT VERSION()
SHOW TRIGGERS FROM `test` LIKE 'sbtest1'
/*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
USE `test`
SHOW CREATE TABLE `test`.`sbtest1`
/*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
EXPLAIN SELECT * FROM `test`.`sbtest1` WHERE 1=1
SELECT table_schema, table_name FROM information_schema.key_column_usage WHERE referenced_table_schema='test' AND referenced_table_name='sbtest1'
SHOW VARIABLES LIKE 'wsrep_on'
/*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
說明:
1、查看變量,當前用戶的權限,slave信息,版本信息等
2、檢查sbtest1是否存在觸發器
3、執行計劃
4、檢查sbtest1是否存在外鍵關聯
-----------------------------------------
USE `test`
SHOW CREATE TABLE `test`.`sbtest1`
/*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
CREATE TABLE `test`.`_sbtest1_new` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=500001 DEFAULT CHARSET=latin1
ALTER TABLE `test`.`_sbtest1_new` add index index_c (c)
/*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
USE `test`
SHOW CREATE TABLE `test`.`_sbtest1_new`
/*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, DEFINER, ACTION_STATEMENT, SQL_MODE, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, EVENT_MANIPULATION, ACTION_TIMING FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_MANIPULATION = 'DELETE' AND ACTION_TIMING = 'AFTER' AND TRIGGER_SCHEMA = 'test' AND EVENT_OBJECT_TABLE = 'sbtest1'
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, DEFINER, ACTION_STATEMENT, SQL_MODE, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, EVENT_MANIPULATION, ACTION_TIMING FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_MANIPULATION = 'UPDATE' AND ACTION_TIMING = 'AFTER' AND TRIGGER_SCHEMA = 'test' AND EVENT_OBJECT_TABLE = 'sbtest1'
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, DEFINER, ACTION_STATEMENT, SQL_MODE, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, EVENT_MANIPULATION, ACTION_TIMING FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_MANIPULATION = 'INSERT' AND ACTION_TIMING = 'AFTER' AND TRIGGER_SCHEMA = 'test' AND EVENT_OBJECT_TABLE = 'sbtest1'
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, DEFINER, ACTION_STATEMENT, SQL_MODE, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, EVENT_MANIPULATION, ACTION_TIMING FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_MANIPULATION = 'DELETE' AND ACTION_TIMING = 'BEFORE' AND TRIGGER_SCHEMA = 'test' AND EVENT_OBJECT_TABLE = 'sbtest1'
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, DEFINER, ACTION_STATEMENT, SQL_MODE, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, EVENT_MANIPULATION, ACTION_TIMING FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_MANIPULATION = 'UPDATE' AND ACTION_TIMING = 'BEFORE' AND TRIGGER_SCHEMA = 'test' AND EVENT_OBJECT_TABLE = 'sbtest1'
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, DEFINER, ACTION_STATEMENT, SQL_MODE, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, EVENT_MANIPULATION, ACTION_TIMING FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_MANIPULATION = 'INSERT' AND ACTION_TIMING = 'BEFORE' AND TRIGGER_SCHEMA = 'test' AND EVENT_OBJECT_TABLE = 'sbtest1'
CREATE TRIGGER `pt_osc_test_sbtest1_del` AFTER DELETE ON `test`.`sbtest1` FOR EACH ROW DELETE IGNORE FROM `test`.`_sbtest1_new` WHERE `test`.`_sbtest1_new`.`id` <=> OLD.`id`
CREATE TRIGGER `pt_osc_test_sbtest1_upd` AFTER UPDATE ON `test`.`sbtest1` FOR EACH ROW BEGIN DELETE IGNORE FROM `test`.`_sbtest1_new` WHERE !(OLD.`id` <=> NEW.`id`) AND `test`.`_sbtest1_new`.`id` <=> OLD.`id`;REPLACE INTO `test`.`_sbtest1_new` (`id`, `k`, `c`, `pad`) VALUES (NEW.`id`, NEW.`k`, NEW.`c`, NEW.`pad`);END
CREATE TRIGGER `pt_osc_test_sbtest1_ins` AFTER INSERT ON `test`.`sbtest1` FOR EACH ROW REPLACE INTO `test`.`_sbtest1_new` (`id`, `k`, `c`, `pad`) VALUES (NEW.`id`, NEW.`k`, NEW.`c`, NEW.`pad`)
說明:
1、根據原表的表結構結創建一張新表
2、對新表上的c字段加索引,這里依然用的是alter
3、檢查原表上觸發器情況,5.6開始同一張表上不能存在同一個動作的觸發器
4、針對新表創建三個觸發器,DELETE,UPDATE和INSERT(重點看下三個觸發器內容)
-----------------------------------------
EXPLAIN SELECT * FROM `test`.
總結
以上是生活随笔為你收集整理的mysql online ddl和pt_online ddl与pt-osc详解的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 基金定投有没有亏本的风险 投资时要注意
- 下一篇: 信用卡被盗刷了,该怎么办?