Redis: Sharded Clusters (Redis Cluster)

Preface
When a single Redis instance stores too much data — say 20 GB — the fork needed to generate an RDB snapshot can block the main thread for a noticeable time, for example 500 ms. While the main thread is blocked, the instance cannot serve requests at all. The root cause is that the dataset behind each snapshot is too large, so the fix is to shard the data. Before Redis 3.0, sharding had to be implemented in the application layer; Redis 3.0 introduced Redis Cluster, a built-in way to build a cluster around data sharding, and that is exactly the approach this article walks through. Let's get started!
1: Hands-on

1.1: Deploy six Redis instances
Download the 3.2.0 package for Linux (linked in the original post). We will deploy six Redis instances; here they all run on a single machine, in six directories, each with its own configuration file:
```
[root@localhost redis-cluster]# ll | grep redis0
drwxr-xr-x. 2 root root 250 Oct 20 16:22 redis01
drwxr-xr-x. 2 root root 253 Oct 20 16:22 redis02
drwxr-xr-x. 2 root root 253 Oct 20 16:22 redis03
drwxr-xr-x. 2 root root 250 Oct 20 16:22 redis04
drwxr-xr-x. 2 root root 250 Oct 20 16:22 redis05
drwxr-xr-x. 2 root root 250 Oct 20 16:22 redis06
```

Each instance's configuration is as follows:
```
[root@localhost redis-cluster]# cat redis01/redis.conf
daemonize yes
port 7001
cluster-enabled yes
cluster-config-file nodes_1.conf
cluster-node-timeout 15000
appendonly yes
[root@localhost redis-cluster]# cat redis02/redis.conf
daemonize yes
port 7002
cluster-enabled yes
cluster-config-file nodes_7002.conf
cluster-node-timeout 15000
appendonly yes
[root@localhost redis-cluster]# cat redis03/redis.conf
daemonize yes
port 7003
cluster-enabled yes
cluster-config-file nodes_7003.conf
cluster-node-timeout 15000
appendonly yes
[root@localhost redis-cluster]# cat redis04/redis.conf
daemonize yes
port 7004
cluster-enabled yes
cluster-config-file nodes_4.conf
cluster-node-timeout 15000
appendonly yes
[root@localhost redis-cluster]# cat redis05/redis.conf
daemonize yes
port 7005
cluster-enabled yes
cluster-config-file nodes_5.conf
cluster-node-timeout 15000
appendonly yes
[root@localhost redis-cluster]# cat redis06/redis.conf
daemonize yes
port 7006
cluster-enabled yes
cluster-config-file nodes_6.conf
cluster-node-timeout 15000
appendonly yes
```

Note that port and cluster-config-file differ per instance; the cluster-config-file name only has to be unique. You do not need to put any content into that file yourself — Redis generates and maintains it, and will later store the node's view of the cluster (assigned hash slots and so on) there.
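The six near-identical directories differ only in two settings, so they can be generated in a few lines. A convenience sketch — the base path and the uniform nodes_&lt;port&gt;.conf naming are my choices, not from the post:

```python
import os
import tempfile

def write_cluster_configs(base_dir, ports):
    """Create one redisNN directory with a redis.conf per port,
    mirroring the layout used in this post."""
    for i, port in enumerate(ports, start=1):
        inst_dir = os.path.join(base_dir, "redis%02d" % i)
        os.makedirs(inst_dir, exist_ok=True)
        lines = [
            "daemonize yes",
            "port %d" % port,
            "cluster-enabled yes",
            # the file name only has to be unique per instance;
            # Redis itself creates and maintains the file
            "cluster-config-file nodes_%d.conf" % port,
            "cluster-node-timeout 15000",
            "appendonly yes",
        ]
        with open(os.path.join(inst_dir, "redis.conf"), "w") as f:
            f.write("\n".join(lines) + "\n")

# demo against a throwaway directory; point base at your real
# cluster directory (e.g. /root/study/redis-cluster) when using it
base = tempfile.mkdtemp(prefix="redis-cluster-")
write_cluster_configs(base, range(7001, 7007))
```
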
1.2: Prepare the redis-trib tool

Copy redis-trib.rb from the src directory of the unpacked Redis tarball into the cluster directory (here /root/study/redis-cluster), then install a Ruby environment:
```
[root@localhost redis-cluster]# yum install ruby
[root@localhost redis-cluster]# yum install rubygems
```

Next install redis-3.2.2.gem, the Ruby Redis client that redis-trib.rb depends on (download it first, via the link in the original post):
```
[root@localhost redis-cluster]# gem install redis-3.2.2.gem
```

1.3: Prepare a startup script for the Redis instances

Starting six instances one by one is tedious, so we use a script:
start-all.sh:

```
#!/bin/bash
cd redis01
./redis-server redis.conf
cd ..
cd redis02
./redis-server redis.conf
cd ..
cd redis03
./redis-server redis.conf
cd ..
cd redis04
./redis-server redis.conf
cd ..
cd redis05
./redis-server redis.conf
cd ..
cd redis06
./redis-server redis.conf
cd ..
```

Adjust it to your own layout (it assumes the redis-server binary has been copied into each instance directory). Now use it to start the instances:
```
[root@localhost redis-cluster]# ./start-all.sh
[root@localhost redis-cluster]# ps -ef | grep redis-server
root 12560 1 0 16:21 ? 00:00:05 ./redis-server *:7001 [cluster]
root 12562 1 0 16:21 ? 00:00:05 ./redis-server *:7002 [cluster]
root 12566 1 0 16:21 ? 00:00:05 ./redis-server *:7003 [cluster]
root 12570 1 0 16:21 ? 00:00:05 ./redis-server *:7004 [cluster]
root 12574 1 0 16:21 ? 00:00:05 ./redis-server *:7005 [cluster]
root 12578 1 0 16:21 ? 00:00:05 ./redis-server *:7006 [cluster]
```

All six instances are up; now we can use redis-trib to create the sharded cluster.
1.4: Create the cluster with redis-trib.rb

```
./redis-trib.rb create --replicas 1 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006
```

--replicas 1 means each shard master gets one slave. With six nodes, that yields 3 masters, each with 1 slave. The run looks like this:
```
[root@localhost redis-cluster]# ./redis-trib.rb create --replicas 1 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
127.0.0.1:7001
127.0.0.1:7002
127.0.0.1:7003
Adding replica 127.0.0.1:7004 to 127.0.0.1:7001
Adding replica 127.0.0.1:7005 to 127.0.0.1:7002
Adding replica 127.0.0.1:7006 to 127.0.0.1:7003
M: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
   slots:0-5460 (5461 slots) master
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
S: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   replicates 72bc0b6333861695f8037e06b834e4efb341af40
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join....
>>> Performing Cluster Check (using node 127.0.0.1:7001)
M: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
   slots:0-5460 (5461 slots) master
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots: (0 slots) master
   replicates 72bc0b6333861695f8037e06b834e4efb341af40
M: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   slots: (0 slots) master
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
M: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   slots: (0 slots) master
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```

The final line [OK] All 16384 slots covered. means the sharded cluster was created successfully. Next, let's look at the cluster state:
All six nodes have successfully joined the sharded cluster.
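The master/replica layout redis-trib chose above can be sketched in a few lines. This is a rough reconstruction, not the tool's actual Ruby code — the real tool also tries to keep a master and its replica on different hosts, and the slot-splitting formula below is an approximation that happens to reproduce the 5461/5462/5461 counts from the create output:

```python
def plan_cluster(nodes, replicas):
    """Rough sketch of how `--replicas R` splits a node list: every
    (R+1) nodes yield one master; leftovers become replicas, paired
    with masters round-robin."""
    n_masters = len(nodes) // (replicas + 1)
    masters = nodes[:n_masters]
    assignment = {s: masters[i % n_masters]
                  for i, s in enumerate(nodes[n_masters:])}
    return masters, assignment

def split_slots(n_masters, total=16384):
    """Divide the 16384-slot range evenly among the masters."""
    bounds = [round(i * total / n_masters) for i in range(n_masters + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n_masters)]

nodes = ["127.0.0.1:%d" % p for p in range(7001, 7007)]
masters, assignment = plan_cluster(nodes, replicas=1)
ranges = split_slots(len(masters))
```

With the six addresses from this post, the sketch reproduces the pairing shown in the transcript (7004→7001, 7005→7002, 7006→7003) and the three slot ranges 0-5460, 5461-10922, 10923-16383.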
1.5: Testing data operations

- Writing data
When the output contains a line like Redirected to slot ..., the hash slot of the key being written does not live on the current instance; the client (redis-cli in cluster mode) is automatically redirected to the cluster node that owns that slot, where the write completes. If the output is just OK, the key's hash slot is on the very instance you are talking to.

- Reading data

Same as writing: reads are redirected to the owning node in the same way.
1.6: Testing a master failure

First, the current cluster state:
```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7001
>>> Performing Cluster Check (using node 127.0.0.1:7001)
M: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
S: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 72bc0b6333861695f8037e06b834e4efb341af40
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```

The masters are 127.0.0.1:7001, 127.0.0.1:7002, and 127.0.0.1:7003. Now kill 127.0.0.1:7001:
```
[root@localhost redis-cluster]# ps -ef | grep redis-server | grep 7001
root 12560 1 0 16:21 ? 00:00:06 ./redis-server *:7001 [cluster]
[root@localhost redis-cluster]# kill -9 12560
```

Now check the cluster state again:
```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7002
>>> Performing Cluster Check (using node 127.0.0.1:7002)
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```

7001's former slave, 127.0.0.1:7004, has been promoted to master, and 7001, being down, is no longer part of the cluster. When it comes back online it will rejoin as a slave of 7004. Start 7001 again and check:
```
[root@localhost redis-cluster]# cd redis01/
[root@localhost redis01]# ./redis-server redis.conf
```

The cluster state now shows 7001 back in the cluster, as a slave replicating 7004.
1.7: Testing adding a node

Adding a node means the hash slots have to be redistributed; let's walk through how to handle that.

1.7.1: Deploy the new node

- Copy one of the existing instance directories to redis07
- Edit redis.conf

Mainly change port (to 7007) and cluster-config-file.

- Start the new 7007 instance
The process list now shows 7007 [cluster]. The [cluster] tag appears because redis.conf sets cluster-enabled yes, i.e. the instance runs in Redis Cluster mode — it does not mean the node has joined our cluster, as this check confirms:

```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7001 | grep 7007 | wc -l
0
```

Next, add 7007 to the cluster.
1.7.2: Add 7007 to the cluster

Note that the new node must be empty — delete everything in db0 first, otherwise joining the cluster fails with an error like [ERR] Node ... is not empty. ... or contains some key in database 0.
```
[root@localhost redis-cluster]# ./redis01/redis-cli -p 7007
127.0.0.1:7007> flushdb
[root@localhost redis-cluster]# ./redis-trib.rb add-node 127.0.0.1:7007 127.0.0.1:7001
>>> Adding node 127.0.0.1:7007 to cluster 127.0.0.1:7001
>>> Performing Cluster Check (using node 127.0.0.1:7001)
S: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
   slots: (0 slots) slave
   replicates e5d87ee91b8a9cf440122354fd55521cf51c2548
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:7007 to make it join the cluster.
[OK] New node added correctly.
```

The line [OK] New node added correctly. confirms the join. Now check the cluster state:
```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7007
>>> Performing Cluster Check (using node 127.0.0.1:7007)
S: a2d36b809a11205a095ce4da3f1b84b42a38f0f2 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
   2 additional replica(s)
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
   slots: (0 slots) slave
   replicates e5d87ee91b8a9cf440122354fd55521cf51c2548
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```

The entry for 7007:
```
S: a2d36b809a11205a095ce4da3f1b84b42a38f0f2 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
```

So 7007 has joined the cluster and was assigned the slave role, but it is not responsible for any hash slots yet. Next, let's see how to reassign hash slots.
1.8: Reassigning hash slots

Suppose 7003 got a memory upgrade and we want it to own more hash slots. Use redis-trib.rb reshard <address of any cluster node> (the address merely identifies the cluster):
```
[root@localhost redis-cluster]# ./redis-trib.rb reshard 127.0.0.1:7003   # this address only identifies the cluster; it does not decide who receives the slots
>>> Performing Cluster Check (using node 127.0.0.1:7003)
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
   2 additional replica(s)
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
   slots: (0 slots) slave
   replicates e5d87ee91b8a9cf440122354fd55521cf51c2548
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
S: a2d36b809a11205a095ce4da3f1b84b42a38f0f2 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)?
```

The prompt asks how many hash slots to migrate. We want to give 7003 an extra 5096 slots on top of what it already owns, so enter:
```
How many slots do you want to move (from 1 to 16384)? 5096
What is the receiving node ID?
```

The next prompt asks for the ID of the instance that will receive the new slots — 7003 here, so enter 0c0157c84d66d9cbafe01324c5da352a5f15ab31:
```
How many slots do you want to move (from 1 to 16384)? 5096
What is the receiving node ID? 0c0157c84d66d9cbafe01324c5da352a5f15ab31
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:
```

Source node #1 asks which nodes the 5096 slots should come from. Since we are rebalancing across the whole cluster, enter all, meaning every other master serves as a source:
```
Ready to move 5096 slots.
  Source nodes:
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
  Destination node:
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:10923-16383 (5461 slots) master
   2 additional replica(s)
  Resharding plan:
    Moving slot 5461 from 64303a6eb5c75659f5f82ad774adba6bc1084dca
    Moving slot 5462 from 64303a6eb5c75659f5f82ad774adba6bc1084dca
    Moving slot 5463 from 64303a6eb5c75659f5f82ad774adba6bc1084dca
    ... (slot-move lines omitted) ...
Do you want to proceed with the proposed reshard plan (yes/no)?
```

Finally it asks whether to proceed with the proposed reshard plan; answer yes and the resharding runs. When it finishes, check the result:
```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7001
>>> Performing Cluster Check (using node 127.0.0.1:7001)
S: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
   slots: (0 slots) slave
   replicates e5d87ee91b8a9cf440122354fd55521cf51c2548
S: a2d36b809a11205a095ce4da3f1b84b42a38f0f2 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots:2547-5460 (2914 slots) master
   1 additional replica(s)
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 64303a6eb5c75659f5f82ad774adba6bc1084dca
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
   slots:0-2546,5461-8009,10923-16383 (10557 slots) master
   2 additional replica(s)
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
   slots:8010-10922 (2913 slots) master
   1 additional replica(s)
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
   slots: (0 slots) slave
   replicates 0c0157c84d66d9cbafe01324c5da352a5f15ab31
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```

7003's slot line now reads slots:0-2546,5461-8009,10923-16383 (10557 slots): its original 5461 slots plus the newly assigned 5096 total 10557.
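The plan redis-trib proposed took slots from each source master in proportion to how many it owned (7002 gave up 2549, 7004 gave up 2547, as the check above shows). A sketch of that proportional computation — an approximation of the tool's internal logic, not its actual Ruby code, though it reproduces the split seen here:

```python
import math

def compute_reshard_plan(sources, numslots):
    """Sketch of a proportional reshard plan: each source master
    contributes slots in proportion to how many it currently owns.
    `sources` maps node name -> current slot count; the largest
    contributor is rounded up, the rest down, so the total matches."""
    total = sum(sources.values())
    plan = {}
    for i, (node, owned) in enumerate(
            sorted(sources.items(), key=lambda kv: -kv[1])):
        share = float(numslots) / total * owned
        plan[node] = math.ceil(share) if i == 0 else math.floor(share)
    return plan

# the two source masters from the run above:
# 7002 owned 5462 slots, 7004 owned 5461
plan = compute_reshard_plan({"7002": 5462, "7004": 5461}, numslots=5096)
```
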
1.9: Removing nodes

Nodes are removed with redis-trib.rb del-node <address of any cluster node (used to identify the cluster)> <ID of the node to delete>. Removing a master and removing a slave differ, so let's look at each.

1.9.1: Removing a master

First, the current masters:
```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7001 | grep 'M:'
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
M: 0c0157c84d66d9cbafe01324c5da352a5f15ab31 127.0.0.1:7003
M: 64303a6eb5c75659f5f82ad774adba6bc1084dca 127.0.0.1:7002
```

Let's try deleting M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004:
```
[root@localhost redis-cluster]# ./redis-trib.rb del-node 127.0.0.1:7001 e5d87ee91b8a9cf440122354fd55521cf51c2548
>>> Removing node e5d87ee91b8a9cf440122354fd55521cf51c2548 from cluster 127.0.0.1:7001
[ERR] Node 127.0.0.1:7004 is not empty! Reshard data away and try again.
```

The error [ERR] Node 127.0.0.1:7004 is not empty! Reshard data away and try again. means the node still owns hash slots; we must reshard all of its data away first. So move 7004's slots:2547-5460 (2914 slots) over to 7003 (ID 0c0157c84d66d9cbafe01324c5da352a5f15ab31):
```
[root@localhost redis-cluster]# ./redis-trib.rb reshard 127.0.0.1:7003
>>> Performing Cluster Check (using node 127.0.0.1:7003)
...
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 2914    # number of slots to move
What is the receiving node ID? 0c0157c84d66d9cbafe01324c5da352a5f15ab31    # ID of the receiving instance
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:e5d87ee91b8a9cf440122354fd55521cf51c2548    # ID of the instance giving up its slots
Source node #2:done    # no more source IDs
...
    Moving slot 5458 from e5d87ee91b8a9cf440122354fd55521cf51c2548
    Moving slot 5459 from e5d87ee91b8a9cf440122354fd55521cf51c2548
    Moving slot 5460 from e5d87ee91b8a9cf440122354fd55521cf51c2548
Do you want to proceed with the proposed reshard plan (yes/no)? yes    # accept the plan
Moving slot 5367 from 127.0.0.1:7004 to 127.0.0.1:7003: ...
...
Moving slot 5369 from 127.0.0.1:7004 to 127.0.0.1:7003:
```

With that, all of 7004's hash slots have moved to 7003:
```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7002
>>> Performing Cluster Check (using node 127.0.0.1:7002)
...
M: e5d87ee91b8a9cf440122354fd55521cf51c2548 127.0.0.1:7004
   slots: (0 slots) master
   0 additional replica(s)
...
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```

Now we can delete node 7004:
```
[root@localhost redis-cluster]# ./redis-trib.rb del-node 127.0.0.1:7003 e5d87ee91b8a9cf440122354fd55521cf51c2548
>>> Removing node e5d87ee91b8a9cf440122354fd55521cf51c2548 from cluster 127.0.0.1:7003
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7003 | grep '7004' | wc -l
0
```

del-node not only removes the node from the cluster — it also shuts the instance down:
```
[root@localhost redis-cluster]# ps -ef | grep redis | grep 7004 | wc -l
0
```

1.9.2: Removing a slave

First, the current slaves:
```
[root@localhost redis-cluster]# ./redis-trib.rb check 127.0.0.1:7001 | grep 'S:'
S: 72bc0b6333861695f8037e06b834e4efb341af40 127.0.0.1:7001
S: a2d36b809a11205a095ce4da3f1b84b42a38f0f2 127.0.0.1:7007
S: 6f541da782a73c3846fb156de98870ced01364d9 127.0.0.1:7005
S: 09703f108c1794653e7f7f3eaa4c4baa2ded0d49 127.0.0.1:7006
```

Delete 127.0.0.1:7005:
```
[root@localhost redis-cluster]# ./redis-trib.rb del-node 127.0.0.1:7003 6f541da782a73c3846fb156de98870ced01364d9
>>> Removing node 6f541da782a73c3846fb156de98870ced01364d9 from cluster 127.0.0.1:7003
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
```

Deleted successfully, and the instance was shut down. Since a slave owns no hash slots, no resharding was needed.
2: Theory

Usually, a master-slave architecture with read/write splitting, combined with Sentinel, satisfies the business need — but only while the dataset stays modest. Once a Redis instance holds enough data, the pause caused by the fork that produces the RDB snapshot grows sharply, and during that pause the instance cannot serve requests. In a local test with the dataset approaching 15 GB, the fork pause grew to hundreds of milliseconds. The duration of the most recent fork can be read from INFO's latest_fork_usec field, which reports microseconds:
```
[root@localhost redis-cluster]# ./redis01/redis-cli -p 7003 -c info | grep latest
latest_fork_usec:812
```

(Here, on this small test cluster, the last fork took only 812 µs; on an instance holding many gigabytes the figure climbs into the hundreds of thousands of microseconds.) A pause approaching a second is clearly unacceptable, because users are directly affected for the entire window. The root cause is simply too much data in one instance, so the cure is to make each instance's share smaller — split the data into multiple shards. Compared with the single-instance architecture, the sharded one spreads the dataset over several master-slave shards, each owning a portion of it.
Sharding, however, raises problems of its own, chiefly:

1. How do we decide which node a piece of data should be stored on?
2. How do we find the node a stored piece of data lives on?
3. When a node is added, how do we re-partition the data and migrate it?
4. When a node is removed, how do we migrate the data it held?

All of these must be answered once you shard, and Redis 3.0's Redis Cluster answers them. It defines 16384 hash slots, numbered 0 through 16383, and distributes them across the cluster's nodes. When a key is stored, crc16(key) % 16384 yields its hash slot, and the node that slot was assigned to is where the key belongs.
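The key-to-slot mapping is concrete enough to compute by hand. The cluster specification's CRC16 is the XMODEM variant, and keys may carry a {hash tag} to force related keys into the same slot; a Python rendering of both:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant), the checksum Redis Cluster
    applies to keys."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def keyslot(key: str) -> int:
    """Map a key to one of the 16384 hash slots. If the key contains
    a non-empty {...} section, only that part is hashed, so related
    keys can be pinned to the same slot."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # only a non-empty tag counts
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

For example, keyslot("foo") gives the same value CLUSTER KEYSLOT foo returns, and the two keys {user1000}.following and {user1000}.followers land in the same slot because only user1000 is hashed.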
The client, for its part, maintains a slot-to-node map, so it knows which node owns the key it wants to operate on and sends the request straight there. When adding or removing nodes triggers a reshard and hash slots move around, parts of the client's slot-to-node map become stale. If the client then sends a request for a key to a node that no longer owns the key's slot, that node replies with MOVED <slot> <node address>; the client re-issues the request to the new address (the redirection mechanism) and updates its local slot-to-node map.
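A toy sketch of the client side of this redirection — pure bookkeeping, no networking; the class and method names are mine, not from any real client library:

```python
def parse_moved(error: str):
    """Parse a 'MOVED <slot> <host>:<port>' redirection error into
    (slot, host, port); return None for any other error string."""
    parts = error.split()
    if len(parts) == 3 and parts[0] == "MOVED":
        host, _, port = parts[2].rpartition(":")
        return int(parts[1]), host, int(port)
    return None

class SlotCache:
    """Client-side slot -> node map, refreshed whenever the server
    answers with a MOVED redirect."""
    def __init__(self, initial=None):
        self.slot_to_node = dict(initial or {})

    def node_for(self, slot):
        return self.slot_to_node.get(slot)

    def on_moved(self, error):
        parsed = parse_moved(error)
        if parsed:
            slot, host, port = parsed
            # overwrite the stale mapping with the node that now owns the slot
            self.slot_to_node[slot] = "%s:%d" % (host, port)
        return parsed
```

A client would call on_moved whenever a command fails with a MOVED error, then retry the command against the node the cache now reports for that slot.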
3: Is a bigger cluster always better?

No — the official guidance is to stay below roughly 1000 nodes, mainly because PING/PONG messages start to consume significant bandwidth, which directly hurts the servers' ability to process data. In a 1000-node cluster, a PING message — carrying the sender's own state, state about a sample of other instances, slot assignment information, and so on — is around 24 KB. Each second an instance sends PINGs to on the order of ten other instances, about 240 KB/s of outbound gossip traffic, and because PING/PONG spreads via the gossip protocol, every instance does the same. A few hundred KB/s per instance is tolerable on modern links of tens of megabits or more, but there is a second scenario to consider: what an instance does once a PONG is overdue.
When an instance has not received a PONG from some node within cluster-node-timeout/2 (with the 15000 ms timeout configured above, that is 7.5 seconds), that node may be in trouble, or even down. If it really is down, the cluster must detect this quickly and react — for example with a master-slave failover — to stay highly available. So for such nodes, the instance's cluster cron (which runs every 100 ms) sends PING messages immediately rather than waiting its turn. If a network hiccup makes many instances look overdue at once, the bandwidth consumed becomes hard to predict, as the PING/PONG rate spikes.

So in practice, cap the cluster's size, to keep PING/PONG messages from becoming so numerous and large that they congest the network and degrade server performance.
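The steady-state gossip figures above work out as follows. Both inputs — roughly 24 KB per message and roughly ten peers pinged per second — are the text's own rough estimates, not measurements:

```python
def gossip_outbound_kb_per_sec(n_nodes, msg_kb=24.0, pings_per_sec=10):
    """Back-of-the-envelope gossip cost: outbound PING traffic per
    instance, and the aggregate across the cluster (every instance
    gossips the same way). PONG replies roughly double the totals."""
    per_node = msg_kb * pings_per_sec
    cluster_total = per_node * n_nodes
    return per_node, cluster_total

per_node, total = gossip_outbound_kb_per_sec(1000)
# per instance: 240 KB/s outbound; across 1000 nodes: ~240 MB/s of
# gossip in aggregate, before counting PONG replies
```
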
寫在后面
參考文章列表:
Redis Cluster集群的搭建與實踐 。
redis之路(七):Redis單機多節點集群 。
分布式環境一致性協議gossip 。
分布式之gossip共識算法分析 。