A Brief Look at Docker Container Orchestration
Author | 天元浪子
Source | CSDN Blog
Overview
As a container engine, Docker provides an open standard for containerized applications, letting developers manage infrastructure the same way they manage applications and enabling fast delivery, testing, and deployment of code. As containers came into widespread use, a new problem appeared: how to coordinate, schedule, and manage all those containers. Docker's container orchestration emerged to answer it. Container orchestration can be loosely understood as cluster management.
There are many orchestration tools for Docker. The best known are Compose, Machine, and Swarm, collectively called the "three musketeers" of Docker. Compose and Machine ship as separate tools, while Swarm is Docker's own orchestration feature and is built directly into the Docker Engine.
Swarm is made up of three main parts:

swarm: cluster management
node: node management
service: service management
Cluster and Node Management
The docker swarm command creates or joins a cluster. Nodes in a Docker cluster come in two kinds, managers and workers. Both kinds can run containers, but only manager nodes have management capabilities.

A cluster can work normally even if it contains only manager nodes.
2.1 Creating a cluster
My test environment consists of two machines, with IP addresses 192.168.1.220 and 192.168.1.116. First, create the cluster on 192.168.1.220:
# docker swarm init
Swarm initialized: current node (ppmurem8j7mdbmgpdhssjh0h9) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-3e4l8crbt04xlqfxwyw6nuf9gtcwpw72zggtayzy8clyqmvb5h-7o6ww4ftwm38dz7ydbolsz3kd 192.168.1.220:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

After docker swarm init finishes, the cluster exists. The current machine automatically becomes the cluster's manager node, and the output includes the command other machines should run to join, namely:

docker swarm join --token SWMTKN-1-3e4l8crbt04xlqfxwyw6nuf9gtcwpw72zggtayzy8clyqmvb5h-7o6ww4ftwm38dz7ydbolsz3kd 192.168.1.220:2377
A node that joins with this token becomes a worker node. To add a new manager node instead, run docker swarm join-token manager; it prints a similar command that, when executed on the new machine, joins it as a manager. If you forget the join command, docker swarm join-token worker prints it again.
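Two related options are worth knowing about (a quick sketch; the address below is just this article's test machine, adapt it to your own environment). If the manager host has more than one network interface, docker swarm init needs to be told which address other nodes should use, and join tokens can be rotated if one ever leaks:

# initialize while pinning the address other nodes will connect to
docker swarm init --advertise-addr 192.168.1.220

# invalidate the old worker join token and print a new one
docker swarm join-token --rotate worker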
2.2 Joining the cluster
Now run the join command on 192.168.1.116:
# docker swarm join --token SWMTKN-1-12dlq70adr3z38mlkltc288rdzevtjn73xse7d0qndnjmx45zs-b1kwenzmrsqb4o5nvni5rafcr 192.168.1.220:2377
This node joined a swarm as a worker.

A small hiccup occurred here: the two machines I used to build the cluster had inconsistent time zones, which made the worker join fail with:

Error response from daemon: error while validating Root CA Certificate: x509: certificate has expired or is not yet valid

Even after correcting the time zone on the 220 machine, the node still could not join. In the end I deleted the cluster and recreated it, which solved the problem. I did not test whether docker swarm update alone would also have fixed it.
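If you hit the same error, the underlying cause is usually that the nodes' clocks disagree, so the TLS certificates Swarm issues look expired or not yet valid. A minimal sketch of what I would check first (assuming systemd hosts with timedatectl available; the time zone value is only an example):

# check that both machines agree on the current time and time zone
timedatectl

# set a consistent time zone and turn on NTP synchronization
timedatectl set-timezone Asia/Shanghai
timedatectl set-ntp true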
Once the node has joined, the cluster's nodes can be listed from the manager node:
# docker node ls
ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
9b4cmakc4hpc9ra4rruy5x5yo *   localhost.localdomain     Ready     Active         Leader           20.10.3
hz50cnwrbk4vxa7h0g23ccil9     zhangmh-virtual-machine   Ready     Active                          20.10.1
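As an aside, the MANAGER STATUS column already tells you which nodes are managers. If you prefer to query a single node's role directly, docker node inspect with a Go template works too (a small sketch, using the node ID from the listing above):

# print only the role of a given node (manager or worker)
docker node inspect --format '{{ .Spec.Role }}' hz50cnwrbk4vxa7h0g23ccil9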
2.3 Leaving the cluster

Run the following command on 192.168.1.116 to leave the cluster:
# docker swarm leave
Node left the swarm.

List the nodes again:

# docker node ls
ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
9b4cmakc4hpc9ra4rruy5x5yo *   localhost.localdomain     Ready     Active         Leader           20.10.3
hz50cnwrbk4vxa7h0g23ccil9     zhangmh-virtual-machine   Down      Active                          20.10.1

The node that just left is still listed; only its status has changed to Down. It has to be removed explicitly on the manager node:

# docker node rm hz50cnwrbk4vxa7h0g23ccil9
hz50cnwrbk4vxa7h0g23ccil9
# docker node ls
ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
9b4cmakc4hpc9ra4rruy5x5yo *   localhost.localdomain     Ready     Active         Leader           20.10.3
xby86ffkqw3axyfkwd4s7nubz     zhangmh-virtual-machine   Ready     Active                          20.10.1

Only then is the node truly removed.
If the node leaving the cluster is a manager node, it has to be forced out, that is: docker swarm leave -f.
2.4 Promoting a node to manager
A cluster with only one manager is fragile: if that manager crashes, the whole cluster is left without a leader. Docker recommends that a cluster have at least three manager nodes, and the cluster only operates normally while more than half of the managers remain reachable. With only two managers, losing one already leaves the whole cluster unusable.
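To put that rule in numbers (it follows from the Raft consensus protocol that Swarm managers use): with N managers, a quorum of ⌊N/2⌋ + 1 must stay reachable, so the cluster tolerates the loss of ⌊(N−1)/2⌋ managers: none out of two, one out of three, two out of five, and so on. That is why two managers are no more fault tolerant than one.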
For a test like ours that is of course unnecessary; two managers are enough to check whether leader failover works. The following command promotes a worker node to a manager:
# docker node promote xby86ffkqw3axyfkwd4s7nubz
Node xby86ffkqw3axyfkwd4s7nubz promoted to a manager in the swarm.
# docker node ls
ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
9b4cmakc4hpc9ra4rruy5x5yo *   localhost.localdomain     Ready     Active         Leader           20.10.3
xby86ffkqw3axyfkwd4s7nubz     zhangmh-virtual-machine   Ready     Active         Reachable        20.10.1

There are now two manager nodes: 192.168.1.220 has the status Leader, meaning it is the current leader, and 192.168.1.116 has the status Reachable. Next, stop the Docker service on 192.168.1.220:
# systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
  docker.socket

Stopping the service prints a warning: docker.service has been stopped, but it can still be woken up again by the docker.socket unit. Check the node status again:
# docker node ls
ID                            HOSTNAME                  STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
9b4cmakc4hpc9ra4rruy5x5yo *   localhost.localdomain     Ready     Active         Reachable        20.10.3
xby86ffkqw3axyfkwd4s7nubz     zhangmh-virtual-machine   Ready     Active         Leader           20.10.1

192.168.1.116 has taken over as Leader, and 192.168.1.220 has indeed been woken up again, so it now shows as Reachable.
This shows that a Docker cluster's failover behavior is quite solid.
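One note on that warning: if you want the Docker daemon to stay down during a failover test, stop the socket unit as well, otherwise any docker command on that host will reactivate the daemon through socket activation (a sketch, assuming a systemd-managed host like the ones above):

# stop both the socket and the service so socket activation cannot restart the daemon
systemctl stop docker.socket docker.service

# later, bring both back
systemctl start docker.socket docker.service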
Service Management
Once all the nodes in the cluster are configured, services can be created. A Docker service is essentially a container launch that additionally gains replicas and load balancing. Using the previously built ws:1.0 image as an example, create 5 replicas:
# docker service create --replicas 5 --name ws -p 80:8000 ws:1.0
image ws:1.0 could not be accessed on a registry to record its digest. Each node will access ws:1.0 independently, possibly leading to different nodes running different versions of the image.
1nj3o38slbo2zwt5p69l1qi5t
overall progress: 5 out of 5 tasks
1/5: running   [==================================================>]
2/5: running   [==================================================>]
3/5: running   [==================================================>]
4/5: running   [==================================================>]
5/5: running   [==================================================>]
verify: Service converged

The service is created and running; port 80 answers in a browser on both 192.168.1.220 and 192.168.1.116.
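About the warning in that output: since ws:1.0 exists only locally, every node must already have the image present. In a real deployment you would normally push the image to a registry that all nodes can reach, so they all pull the same version. A rough sketch (registry.example.com is a placeholder; --with-registry-auth is only needed for private registries):

# tag and push the image to a registry reachable by every node
docker tag ws:1.0 registry.example.com/ws:1.0
docker push registry.example.com/ws:1.0

# create the service from the registry image, forwarding registry credentials to the nodes
docker service create --replicas 5 --name ws -p 80:8000 --with-registry-auth registry.example.com/ws:1.0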
The docker service ls command lists the ws service:
# docker service ls
ID             NAME      MODE         REPLICAS   IMAGE     PORTS
1nj3o38slbo2   ws        replicated   5/5        ws:1.0    *:80->8000/tcp

The docker service ps ws command shows the ws service's tasks:
# docker service ps ws
ID             NAME      IMAGE     NODE                      DESIRED STATE   CURRENT STATE           ERROR     PORTS
jpckj0mn24ae   ws.1      ws:1.0    zhangmh-virtual-machine   Running         Running 6 minutes ago
yrrdn4ntb089   ws.2      ws:1.0    localhost.localdomain     Running         Running 6 minutes ago
mdjxadbmlmhs   ws.3      ws:1.0    zhangmh-virtual-machine   Running         Running 6 minutes ago
kqdwfrddbaxd   ws.4      ws:1.0    localhost.localdomain     Running         Running 6 minutes ago
is2iimz1v4eb   ws.5      ws:1.0    zhangmh-virtual-machine   Running         Running 6 minutes ago

Two tasks are running on 192.168.1.220 and three on 192.168.1.116. After hitting the service a few times from a browser, I checked the service log with docker service logs ws:
# docker service logs ws
ws.5.is2iimz1v4eb@zhangmh-virtual-machine    | [I 210219 01:57:23 web:2239] 200 GET / (10.0.0.2) 3.56ms
ws.5.is2iimz1v4eb@zhangmh-virtual-machine    | [W 210219 01:57:23 web:2239] 404 GET /favicon.ico (10.0.0.2) 0.97ms
ws.5.is2iimz1v4eb@zhangmh-virtual-machine    | [I 210219 01:57:28 web:2239] 200 GET / (10.0.0.4) 0.82ms
ws.5.is2iimz1v4eb@zhangmh-virtual-machine    | [W 210219 01:57:28 web:2239] 404 GET /favicon.ico (10.0.0.4) 0.79ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:01:45 web:2239] 304 GET / (10.0.0.2) 1.82ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:01:59 web:2239] 304 GET / (10.0.0.2) 0.49ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:02:01 web:2239] 304 GET / (10.0.0.2) 2.05ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:02:02 web:2239] 304 GET / (10.0.0.2) 0.89ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:02:02 web:2239] 304 GET / (10.0.0.2) 1.13ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:02:03 web:2239] 304 GET / (10.0.0.2) 0.92ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:02:03 web:2239] 304 GET / (10.0.0.2) 2.19ms
ws.1.jpckj0mn24ae@zhangmh-virtual-machine    | [I 210219 02:02:20 web:2239] 304 GET / (10.0.0.2) 1.00ms

Even though I accessed 192.168.1.220, the requests were actually served by tasks running on 192.168.1.116.
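That behavior comes from Swarm's ingress routing mesh: a published port is listened on by every node, and incoming requests are load-balanced across all of the service's tasks no matter which node receives them. You can confirm the publish mode on the service endpoint (a small sketch; the Go template just narrows the output):

# show how the ws service publishes its port; publish mode "ingress" means the routing mesh is in use
docker service inspect --format '{{ json .Endpoint.Ports }}' ws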
If 192.168.1.116 is shut down, the tasks running on it will be moved automatically to the 192.168.1.220 node. However, 192.168.1.116 is currently a manager node, and stopping it would leave the cluster unusable, so it first has to be demoted back to a worker:
# docker node demote xby86ffkqw3axyfkwd4s7nubz
Manager xby86ffkqw3axyfkwd4s7nubz demoted in the swarm.

Then shut down 192.168.1.116.
# docker service ps ws
ID             NAME      IMAGE     NODE                    DESIRED STATE   CURRENT STATE               ERROR     PORTS
jrj9ben9vr5c   ws.1      ws:1.0    localhost.localdomain   Running         Running 57 minutes ago
yrrdn4ntb089   ws.2      ws:1.0    localhost.localdomain   Running         Running about an hour ago
opig9zrmp261   ws.3      ws:1.0    localhost.localdomain   Running         Running 57 minutes ago
kqdwfrddbaxd   ws.4      ws:1.0    localhost.localdomain   Running         Running about an hour ago
hiz8730pl3je   ws.5      ws:1.0    localhost.localdomain   Running         Running 57 minutes ago

All five tasks have been moved over and are now running on 192.168.1.220.
# docker ps
CONTAINER ID   IMAGE     COMMAND                  CREATED       STATUS       PORTS     NAMES
bc4c457ce769   ws:1.0    "/bin/sh -c 'python …"   3 hours ago   Up 3 hours             ws.5.hiz8730pl3je7qvo2lv6k554b
c846ac1c4d91   ws:1.0    "/bin/sh -c 'python …"   3 hours ago   Up 3 hours             ws.3.opig9zrmp2619t4e1o3ntnj2w
214daa36c138   ws:1.0    "/bin/sh -c 'python …"   3 hours ago   Up 3 hours             ws.1.jrj9ben9vr5c3biuc90xtoffh
17842db9dc47   ws:1.0    "/bin/sh -c 'python …"   3 hours ago   Up 3 hours             ws.4.kqdwfrddbaxd5z78uo3zsy5sd
47185ba9a4fd   ws:1.0    "/bin/sh -c 'python …"   3 hours ago   Up 3 hours             ws.2.yrrdn4ntb089t6i66w8xvq8r9
# docker kill bc4c457ce769
bc4c457ce769

After killing the fifth task's container, wait a few seconds and list the containers again:
# docker ps
CONTAINER ID   IMAGE     COMMAND                  CREATED              STATUS              PORTS     NAMES
416b55e8d174   ws:1.0    "/bin/sh -c 'python …"   About a minute ago   Up About a minute             ws.5.fvpm334t2zqbj5l50tyx5glr6
c846ac1c4d91   ws:1.0    "/bin/sh -c 'python …"   3 hours ago          Up 3 hours                    ws.3.opig9zrmp2619t4e1o3ntnj2w
214daa36c138   ws:1.0    "/bin/sh -c 'python …"   3 hours ago          Up 3 hours                    ws.1.jrj9ben9vr5c3biuc90xtoffh
17842db9dc47   ws:1.0    "/bin/sh -c 'python …"   3 hours ago          Up 3 hours                    ws.4.kqdwfrddbaxd5z78uo3zsy5sd
47185ba9a4fd   ws:1.0    "/bin/sh -c 'python …"   3 hours ago          Up 3 hours                    ws.2.yrrdn4ntb089t6i66w8xvq8r9

The fifth task has been started again automatically.
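That automatic restart is governed by the service's restart policy, which by default restarts a task whenever its container exits. The policy can be tuned when creating or updating the service (a sketch; the values here are only examples):

# only restart tasks whose containers exit with a non-zero status, and give up after 3 attempts
docker service update --restart-condition on-failure --restart-max-attempts 3 ws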
The number of replicas in a Docker service can be adjusted on the fly. For example, if system load gets too high and another replica is needed, just run:
# docker service scale ws=6
ws scaled to 6
overall progress: 6 out of 6 tasks
1/6: running   [==================================================>]
2/6: running   [==================================================>]
3/6: running   [==================================================>]
4/6: running   [==================================================>]
5/6: running   [==================================================>]
6/6: running   [==================================================>]
verify: Service converged

This adds one more replica.
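Scaling down works the same way, and docker service update --replicas is an equivalent form that is handy when you are changing other service settings at the same time (a quick sketch):

# shrink back to 5 replicas
docker service scale ws=5

# the same change, expressed as a service update
docker service update --replicas 5 ws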
Once the service has been created, it can start together with Docker's system service; simply run:

systemctl enable docker

The cluster and services created above will then come back up at boot, so a machine restart will not leave the application in a broken state.
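This works because the swarm membership and service definitions are persisted on disk by the Docker daemon (under /var/lib/docker/swarm on a default installation), so they survive daemon restarts. A one-line variant that enables the unit and starts it immediately (a sketch, assuming systemd):

# enable Docker at boot and start it right away
systemctl enable --now docker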
Shared Data Volumes
First, create a data volume with the docker volume create command:
# docker volume create ws_volume
ws_volume

Once it is created, docker volume ls lists the existing volumes:
# docker volume ls
DRIVER    VOLUME NAME
local     ws_volume

The docker inspect command shows the volume's details:
# docker inspect ws_volume
[
    {
        "CreatedAt": "2021-02-19T14:09:58+08:00",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/ws_volume/_data",
        "Name": "ws_volume",
        "Options": {},
        "Scope": "local"
    }
]

When creating a service, the --mount parameter mounts the volume into the service:
# docker service create --replicas 2 --name ws -p 80:8000 --mount type=volume,src=ws_volume,dst=/volume ws:1.0
image ws:1.0 could not be accessed on a registry to record its digest. Each node will access ws:1.0 independently, possibly leading to different nodes running different versions of the image.
iiiit9slq9qqwcdwwi0w0mcz5
overall progress: 2 out of 2 tasks
1/2: running   [==================================================>]
2/2: running   [==================================================>]
verify: Service converged

--mount takes many sub-parameters, written as key=value pairs separated by commas. In the simplest case, only type, src, and dst are needed.
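Two other commonly used variants, as a sketch (the host path below is only an example): a bind mount of an existing host directory, and a read-only mount of the named volume. Note that with type=volume and the default local driver, the volume is created independently on each node a task lands on, so the data is not automatically shared across machines; truly shared storage needs a volume driver backed by network storage such as NFS.

# bind-mount a host directory into the service's containers
docker service create --name ws-bind -p 81:8000 --mount type=bind,src=/srv/ws-data,dst=/volume ws:1.0

# mount the named volume read-only
docker service create --name ws-ro -p 82:8000 --mount type=volume,src=ws_volume,dst=/volume,readonly ws:1.0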