failover.mysql_mysqlfailover测试
mysqlfailover是mysql官方用python語言寫的一款工具,包含在mysql utilities工具集中。主要作用是保障mysql高可用。他會定時檢測節(jié)點(diǎn)狀態(tài),當(dāng)master節(jié)點(diǎn)不可用時,會自動轉(zhuǎn)移到從節(jié)點(diǎn)上,同時剩余的從節(jié)點(diǎn)都會指向轉(zhuǎn)以后的節(jié)點(diǎn)。如何保證數(shù)據(jù)一致,在下面分析中會有說明。
mysqlfailover使用前提:
1、必須開啟GTID模式,在GTID模式下,復(fù)制延遲已經(jīng)減小到最低。用壓測工具會有3秒左右的延遲。這取決于設(shè)置多少SQL線程。如果秒插1萬,可以設(shè)置為16。
2、配置文件中必須添加:
report-host=
report-port=
master-info-repository=TABLE
relay-log-info-repository=TABLE
用于從節(jié)點(diǎn)可以被檢測到。
3、權(quán)限:
必須要有with grant option權(quán)限。
安裝也非常簡單。
下載好mysql utilities工具集:https://downloads.mysql.com/archives/utilities/
unzip?mysql-utilities-1.6.5.zip
cd?mysql-utilities-1.6.5
python ./setup.py build
python ./setup.py install
到此為止安裝完成。
使用:
mysqlfailover --master=failover:123456@'192.168.0.106':3306 --discover-slaves-login=failover:123456?--daemon=start?--log=/data/failover.log
建立好主從服務(wù)。這里略。
檢測事物是否完整轉(zhuǎn)移:
這里使用sysbentch工具來進(jìn)行批量插入。
sysbench --test=oltp --mysql-db=test --mysql-user=root --mysql-password=123456 --oltp-table-size=1000000000 --oltp-num-tables=15 prepare 批量插入
sysbench 0.4.12.10: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Creating table 'sbtest1'...
Creating table 'sbtest5'...
Creating table 'sbtest4'...
Creating table 'sbtest8'...
Creating table 'sbtest9'...
Creating table 'sbtest6'...
Creating table 'sbtest2'...
Creating table 'sbtest'...
Creating table 'sbtest3'...
Creating table 'sbtest14'...
Creating table 'sbtest10'...
Creating table 'sbtest12'...
Creating table 'sbtest11'...
Creating table 'sbtest7'...
Creating table 'sbtest13'...
Creating 1000000000 records in table 'sbtest11'...
Creating 1000000000 records in table 'sbtest6'...
Creating 1000000000 records in table 'sbtest4'...
Creating 1000000000 records in table 'sbtest5'...
Creating 1000000000 records in table 'sbtest8'...
Creating 1000000000 records in table 'sbtest14'...
Creating 1000000000 records in table 'sbtest3'...
Creating 1000000000 records in table 'sbtest13'...
Creating 1000000000 records in table 'sbtest9'...
Creating 1000000000 records in table 'sbtest10'...
Creating 1000000000 records in table 'sbtest1'...
Creating 1000000000 records in table 'sbtest12'...
Creating 1000000000 records in table 'sbtest'...
Creating 1000000000 records in table 'sbtest7'...
Creating 1000000000 records in table 'sbtest2'...
等待幾分鐘后:
kill -9 17448
kill -9 18350
之后,該工具自動轉(zhuǎn)移輸出,可以看到已經(jīng)轉(zhuǎn)移到了叢機(jī)上:
Q-quit R-refresh H-health G-GTID Lists U-UUIDs
Failed to reconnect to the master after 3 attemps.
Failover starting in 'auto' mode...
# Checking eligibility of slave 192.168.0.109:3306 for candidate.
# GTID_MODE=ON ... Ok
# Replication user exists ... Ok
# Candidate slave 192.168.0.109:3306 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
WARNING: IP lookup by name failed for 44,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.109,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.112,reason: Unknown host
# Missing transactions found on 192.168.0.112:3306. SELECT gtid_subset() = 0
# LOCK STRING: FLUSH TABLES WITH READ LOCK
# Read only is ON for 192.168.0.112:3306.
# Connecting candidate to 192.168.0.112:3306 as a temporary slave to retrieve unprocessed GTIDs.
# Change master command for 192.168.0.109:3306
# CHANGE MASTER TO MASTER_HOST = '192.168.0.112', MASTER_USER = 'backup', MASTER_PASSWORD = '123456', MASTER_PORT = 3306, MASTER_AUTO_POSITION=1
# Read only is OFF for 192.168.0.112:3306.
# UNLOCK STRING: UNLOCK TABLES
# Waiting for candidate to catch up to slave 192.168.0.112:3306.
# Slave 192.168.0.109:3306:
# QUERY = SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('c142ca67-b898-11e8-86e8-000c29367e64:1', 300)
# Return Code = 3
# Slave 192.168.0.109:3306:
# QUERY = SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('c777e02f-b898-11e8-86a0-000c29c6f346:1-4', 300)
# Return Code = 0
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
WARNING: IP lookup by name failed for 44,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.109,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.112,reason: Unknown host
# Executing stop on slave 192.168.0.109:3306 WARN - slave is not configured with this master
# Executing stop on slave 192.168.0.109:3306 Ok
WARNING: IP lookup by address failed for 192.168.0.106,reason: Unknown host
# Executing stop on slave 192.168.0.112:3306 WARN - slave is not configured with this master
# Executing stop on slave 192.168.0.112:3306 Ok
WARNING: IP lookup by name failed for 44,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.109,reason: Unknown host
# Switching slaves to new master.
# Change master command for 192.168.0.112:3306
# CHANGE MASTER TO MASTER_HOST = '192.168.0.109', MASTER_USER = 'backup', MASTER_PASSWORD = '123456', MASTER_PORT = 3306, MASTER_AUTO_POSITION=1
# Disconnecting new master as slave.
# Execute on 192.168.0.109:3306: RESET SLAVE ALL
# Starting slaves.
# Performing START on all slaves.
# Executing start on slave 192.168.0.112:3306 Ok
# Checking slaves for errors.
# 192.168.0.112:3306 status: Ok
# Failover complete.
# Discovering slaves for master at 192.168.0.109:3306
Failover console will restart in 5 seconds.
# Attempting to contact 192.168.0.109 ... Success
# Attempting to contact 192.168.0.112 ... Success
MySQL Replication Failover Utility
Failover Mode = auto Next Interval = Sat Sep 15 14:15:30 2018
Master Information
------------------
Binary Log File Position Binlog_Do_DB Binlog_Ignore_DB
mysql-bin.000001 657
GTID Executed Set
b5c5054c-b898-11e8-8670-000c299e1daf:1 [...]
# Attempting to contact 192.168.0.109 ... Success
# Attempting to contact 192.168.0.112 ... Success
Replication Health Status
+----------------+-------+---------+--------+------------+---------+-------------+-------------------+-----------------+------------+-------------+--------------+------------------+---------------+-----------+----------------+------------+---------------+
| host | port | role | state | gtid_mode | health | version | master_log_file | master_log_pos | IO_Thread | SQL_Thread | Secs_Behind | Remaining_Delay | IO_Error_Num | IO_Error | SQL_Error_Num | SQL_Error | Trans_Behind |
+----------------+-------+---------+--------+------------+---------+-------------+-------------------+-----------------+------------+-------------+--------------+------------------+---------------+-----------+----------------+------------+---------------+
| 192.168.0.109 | 3306 | MASTER | UP | ON | OK | 5.7.22-log | mysql-bin.000001 | 657 | | | | | | | | | |
| 192.168.0.112 | 3306 | SLAVE | UP | ON | OK | 5.7.22-log | mysql-bin.000001 | 657 | Yes | Yes | 0 | No | 0 | | 0 | | 0 |
+----------------+-------+---------+--------+------------+---------+-------------+-------------------+-----------------+------------+-------------+--------------+------------------+---------------+-----------+----------------+------------+---------------+
分析:
當(dāng)程序檢測到master服務(wù)停止后:
1、檢查指定的候選服務(wù)器是否正常,檢查GTID模式是否開啟
2、鎖表,防止事物提交帶來的數(shù)據(jù)不一致問題。
3、如果開啟了read_only模式,則會自動將其關(guān)閉,并且先change master to到另一臺從機(jī)上以保證數(shù)據(jù)一致
4、解鎖表,保證候選服務(wù)器和另一臺從機(jī)的事物一致
5、檢測候選服務(wù)器的事物號,然后停止全部從機(jī):stop slave;
6、切換到新master,也就是候選服務(wù)器,將所有從機(jī)指向候選服務(wù)器。斷開與原master的連接,執(zhí)行reset slave語句
7、在從機(jī)開啟start slave,開始復(fù)制,這時從機(jī)都已經(jīng)指向了新master。故障轉(zhuǎn)移完成。
現(xiàn)在在主機(jī)上輸出二進(jìn)制日志,看最后一次插入是哪個事物:
mysqlbinlog --base64-output=decode-rows -v mysql-bin.000005 > ~/bin.log
vim ~/bin.log
截取最后一部分:
### INSERT INTO `test`.`sbtest8`
### SET
### @1=289999
### @2=0
### @3=''
### @4='qqqqqqqqqqwwwwwwwwwweeeeeeeeeerrrrrrrrrrtttttttttt'
### INSERT INTO `test`.`sbtest8`
### SET
### @1=290000
### @2=0
### @3=''
### @4='qqqqqqqqqqwwwwwwwwwweeeeeeeeeerrrrrrrrrrtttttttttt'
# at 265373582
#180901 15:41:10 server id 1 end_log_pos 265373613 CRC32 0xa53bca62 Xid = 7014
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
可以看到在主庫上最后一次插入的操作是在test庫下的sbtest8表,第一列值為290000,也就是id列。
現(xiàn)在切換到從庫上進(jìn)入sbtest8這張表,看看這條事物是否已經(jīng)復(fù)制到了從庫:
mysql> use test
Database changed
mysql> select * from sbtest8 where id = '290000';
+--------+---+---+----------------------------------------------------+
| id | k | c | pad |
+--------+---+---+----------------------------------------------------+
| 290000 | 0 | | qqqqqqqqqqwwwwwwwwwweeeeeeeeeerrrrrrrrrrtttttttttt |
+--------+---+---+----------------------------------------------------+
1 row in set (0.00 sec)
可以看到已經(jīng)有了數(shù)據(jù),看看是不是最后一條,從庫是否回滾了未提交的事物:
mysql> select * from sbtest8 where id = '290001';
Empty set (0.00 sec)
mysql> select * from sbtest8 order by id desc limit 1;
+--------+---+---+----------------------------------------------------+
| id | k | c | pad |
+--------+---+---+----------------------------------------------------+
| 290000 | 0 | | qqqqqqqqqqwwwwwwwwwweeeeeeeeeerrrrrrrrrrtttttttttt |
+--------+---+---+----------------------------------------------------+
1 row in set (0.00 sec)
mysql> \q
Bye
可以看出id=290000確實(shí)是最后一條事物,如果有未提交的事物可能已經(jīng)回滾,證明主庫與復(fù)制到從庫的事物不會丟失。
最后,可以使用mysqldiff工具來檢查主從之間的不一致:
[root@node2 data]# mysqldiff --server1=failover:123456@192.168.0.109:3306 --server2=failover:123456@192.168.0.112:3306 --difftype=sql test:test
# WARNING: Using a password on the command line interface can be insecure.
# server1 on 192.168.0.109: ... connected.
# server2 on 192.168.0.112: ... connected.
# Comparing `test` to `test` [PASS]
# Comparing `test`.`sbtest` to `test`.`sbtest` [PASS]
# Comparing `test`.`sbtest1` to `test`.`sbtest1` [PASS]
# Comparing `test`.`sbtest10` to `test`.`sbtest10` [PASS]
# Comparing `test`.`sbtest11` to `test`.`sbtest11` [PASS]
# Comparing `test`.`sbtest12` to `test`.`sbtest12` [PASS]
# Comparing `test`.`sbtest13` to `test`.`sbtest13` [PASS]
# Comparing `test`.`sbtest14` to `test`.`sbtest14` [PASS]
# Comparing `test`.`sbtest2` to `test`.`sbtest2` [PASS]
# Comparing `test`.`sbtest3` to `test`.`sbtest3` [PASS]
# Comparing `test`.`sbtest4` to `test`.`sbtest4` [PASS]
# Comparing `test`.`sbtest5` to `test`.`sbtest5` [PASS]
# Comparing `test`.`sbtest6` to `test`.`sbtest6` [PASS]
# Comparing `test`.`sbtest7` to `test`.`sbtest7` [PASS]
# Comparing `test`.`sbtest8` to `test`.`sbtest8` [PASS]
# Comparing `test`.`sbtest9` to `test`.`sbtest9` [PASS]
# Success. All objects are the same.
說明在延遲的情況下,事物并沒有丟失。
注意:
mysqlfailover程序適合于只做純粹的單點(diǎn)寫入復(fù)制架構(gòu)。
不適合于從機(jī)當(dāng)測試庫或從機(jī)做審計做其他服務(wù)器等操作。必須要嚴(yán)格保證所有從庫沒有任何的寫入。
在使用MySQLfailover時,最好在所有從庫開啟read_only參數(shù),以保證數(shù)據(jù)一致性。
在多從拓補(bǔ)中,如果master掛掉后,要將master再重新加入到原來的拓補(bǔ)中,并且還是將舊master設(shè)置為主。server1為舊master,server2為故障轉(zhuǎn)移后的master。
1、停止mysqlfailover故障轉(zhuǎn)移工具。并且啟動舊master實(shí)例。server1
2、將舊master服務(wù)器設(shè)置為現(xiàn)在的master的從服務(wù)器,用以檢查事物完整性和二進(jìn)制日志完整性:
mysqlreplicate --master=failover:123456@192.168.88.196:3307 --slave=failover:123456@192.168.88.194:3307 --rpl-user=backup:123456
3、用mysqlrpladmin 工具將舊master設(shè)置為整個拓補(bǔ)的新主:
mysqlrpladmin --master=failover:123456@192.168.88.196:3307 --new-master=failover:123456@192.168.88.194:3307 --discover-slaves-login=failover:123456 --demote-master switchover
4、恢復(fù)mysqlfailover工具啟動,這里要使用--force選項來啟動。
未經(jīng)允許,謝絕轉(zhuǎn)載
總結(jié)
以上是生活随笔為你收集整理的failover.mysql_mysqlfailover测试的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: sqoop2 java api实现_Sq
- 下一篇: java applet配置_配置Java