Getting Started with Nomad

Published: 2025/3/21

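Introduction to Nomad

Nomad is a tool for managing a cluster of machines and running applications on them.

Nomad's features:

Docker support: a Nomad job can use the docker driver to deploy applications to the cluster.
Simple to operate: on Linux, Nomad is installed as a single binary and needs no other services for coordination. It combines the resource manager and the scheduler in one system.
Multi-datacenter: jobs can be scheduled across datacenters.
Distributed and highly available: jobs can run under several drivers (Docker, virtual machines, Java), and several operating systems are supported (Linux, Windows, BSD, OSX).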

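Installing Nomad

The usual approach is to install Vagrant first and have it drive a local VirtualBox to create a test environment. While I was learning, however, my local Windows 7 machine was missing some components, so Vagrant could not be installed or used.
I therefore worked directly in a Linux virtual machine. This environment uses Ubuntu 16.04 and Docker version 17.09.0-ce.

Download the Nomad binary, choosing the package that matches your system.

    # wget https://releases.hashicorp.com/nomad/0.7.0/nomad_0.7.0_linux_amd64.zip

Unpack the archive and place the nomad binary in /usr/local/bin.

    # unzip -o nomad_0.7.0_linux_amd64.zip -d /usr/local/bin/
    # cd /usr/local/bin
    # chmod +x nomad

Type nomad in a terminal. If the Nomad usage text appears, the installation succeeded.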
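The download link follows a regular pattern, so the download step can be scripted for other versions or platforms. The sketch below only builds the URL. nomad_zip_url is a hypothetical helper name, and the pattern is inferred from the 0.7.0 link used in this article, so confirm that the resulting file exists on releases.hashicorp.com before relying on it.

```shell
# Build the release URL for a given Nomad version, OS and architecture.
# The pattern is an assumption inferred from the 0.7.0 link in this article.
nomad_zip_url() {
    version="$1"
    os="${2:-linux}"      # defaults to linux
    arch="${3:-amd64}"    # defaults to amd64
    printf 'https://releases.hashicorp.com/nomad/%s/nomad_%s_%s_%s.zip\n' \
        "$version" "$version" "$os" "$arch"
}

# Usage (needs network access):
#   wget "$(nomad_zip_url 0.7.0)"
```

Called with only a version, the helper reproduces the link used in this article.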

Starting Nomad

To keep things simple, we run the Nomad agent in development mode. Dev mode quickly brings up a single agent that acts as both server and client, which is convenient for testing and learning Nomad.

    # nomad agent -dev
    ==> Starting Nomad agent...
    ==> Nomad agent configuration:

                     Client: true
                  Log Level: DEBUG
                     Region: global (DC: dc1)
                     Server: true

    ==> Nomad agent started! Log data will stream in below:

        [INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
        [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
        [INFO] client: using alloc directory /tmp/NomadClient599911093
        [INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
        [INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
        [WARN] fingerprint.network: Ethtool not found, checking /sys/net speed file
        [WARN] raft: Heartbeat timeout reached, starting election
        [INFO] raft: Node at 127.0.0.1:4647 [Candidate] entering Candidate state
        [DEBUG] raft: Votes needed: 1
        [DEBUG] raft: Vote granted. Tally: 1
        [INFO] raft: Election won. Tally: 1
        [INFO] raft: Node at 127.0.0.1:4647 [Leader] entering Leader state
        [INFO] raft: Disabling EnableSingleNode (bootstrap)
        [DEBUG] raft: Node 127.0.0.1:4647 updated peer set (2): [127.0.0.1:4647]
        [INFO] nomad: cluster leadership acquired
        [DEBUG] client: applied fingerprints [arch cpu host memory storage network]
        [DEBUG] client: available drivers [docker exec java]
        [DEBUG] client: node registration complete
        [DEBUG] client: updated allocations at index 1 (0 allocs)
        [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 0)
        [DEBUG] client: state updated to ready

The output shows both Client and Server as true, meaning the agent is running as a server and a client at the same time.

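Nomad cluster nodes

    # nomad node-status
    ID        DC   Name        Class   Drain  Status
    fb533fd8  dc1  yc-jumpbox  <none>  false  ready

The output shows our node's ID (a randomly generated UUID), its datacenter, node name, node class, drain mode and current status. We can see that our node is in the ready state.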
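When scripting against the cluster, it helps to pull out just the nodes that can accept work. This is a minimal sketch that assumes the usual node-status layout: a header row, the ID in the first column and the Status in the last. ready_nodes is a hypothetical helper, not a Nomad command.

```shell
# Print the IDs of nodes whose Status column is "ready".
# Reads `nomad node-status` output on stdin; skips the header row.
ready_nodes() {
    awk 'NR > 1 && $NF == "ready" { print $1 }'
}

# Usage (needs a running agent):
#   nomad node-status | ready_nodes
```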

    # nomad server-members
    Name               Address     Port  Status  Leader  Protocol  Build  Datacenter  Region
    yc-jumpbox.global  10.30.0.52  4648  alive   true    2         0.7.0  dc1         global

The output shows our own server, the address it is running on, its health, some version information, and its datacenter and region.
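A script can use the same output to check that the cluster currently has a leader. The sketch below assumes Leader is the fifth column of the server-members table; cluster_leader is a hypothetical helper name.

```shell
# Print the name of the server whose Leader column is "true".
# Reads `nomad server-members` output on stdin and returns non-zero
# when no leader is listed.
cluster_leader() {
    awk 'NR > 1 && $5 == "true" { print $1; found = 1 }
         END { exit !found }'
}

# Usage (needs a running agent):
#   nomad server-members | cluster_leader || echo "no leader elected yet"
```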

Stopping the Nomad agent

You can interrupt the agent with Ctrl-C. By default, every signal causes the agent to shut down forcefully.

Nomad Job

Jobs are the main thing we interact with when using Nomad.

An example job

Change to your working directory and run nomad init. It creates example.nomad in the current directory, which is a sample Nomad job configuration file.

    # cd /tmp
    # nomad init
    Example job file written to example.nomad

To run the job, use the nomad run command.

    # nomad run example.nomad
    ==> Monitoring evaluation "13ebb66d"
        Evaluation triggered by job "example"
        Allocation "883269bf" created: node "e42d6f19", group "cache"
        Evaluation within deployment: "b0a84e74"
        Evaluation status changed: "pending" -> "complete"
    ==> Evaluation "13ebb66d" finished with status "complete"

To check the job's status, use the nomad status command.

    # nomad status example
    ID            = example
    Name          = example
    Submit Date   = 12/05/17 10:58:40 UTC
    Type          = service
    Priority      = 50
    Datacenters   = dc1
    Status        = running
    Periodic      = false
    Parameterized = false

    Summary
    Task Group  Queued  Starting  Running  Failed  Complete  Lost
    cache       0       0         1        0       0         0

    Latest Deployment
    ID          = b0a84e74
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy
    cache       1        1       1        0

    Allocations
    ID        Node ID   Task Group  Version  Desired  Status   Created At
    883269bf  e42d6f19  cache       0        run      running  12/05/17 10:58:40 UTC

To inspect the job's allocation, use the nomad alloc-status command.

    # nomad alloc-status 883269bf
    ID                  = 883269bf
    Eval ID             = 13ebb66d
    Name                = example.cache[0]
    Node ID             = e42d6f19
    Job ID              = example
    Job Version         = 0
    Client Status       = running
    Client Description  = <none>
    Desired Status      = run
    Desired Description = <none>
    Created At          = 12/05/17 10:58:49 UTC
    Deployment ID       = b0a84e74
    Deployment Health   = healthy

    Task "redis" is "running"
    Task Resources
    CPU        Memory           Disk     IOPS  Addresses
    8/500 MHz  6.3 MiB/256 MiB  300 MiB  0     db: 127.0.0.1:22672

    Task Events:
    Started At     = 12/05/17 10:58:49 UTC
    Finished At    = N/A
    Total Restarts = 0
    Last Restart   = N/A

    Recent Events:
    Time                   Type        Description
    10/31/17 22:58:49 UTC  Started     Task started by client
    10/31/17 22:58:40 UTC  Driver      Downloading image redis:3.2
    10/31/17 22:58:40 UTC  Task Setup  Building Task Directory
    10/31/17 22:58:40 UTC  Received    Task received by client

To view the job's logs, use the nomad logs command. Its arguments are the allocation ID and the task name. The allocation ID can be taken from the output of nomad status example, and the task name is defined in the example.nomad configuration file.

    # nomad logs 883269bf redis
                    _._
               _.-``__ ''-._
          _.-``    `.  `_.  ''-._           Redis 3.2.1 (00000000/0) 64 bit
      .-`` .-```.  ```\/    _.,_ ''-._
     (    '      ,       .-`  | `,    )     Running in standalone mode
     |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
     |    `-._   `._    /     _.-'    |     PID: 1
      `-._    `-._  `-./  _.-'    _.-'
     |`-._`-._    `-.__.-'    _.-'_.-'|
     |    `-._`-._        _.-'_.-'    |           http://redis.io
      `-._    `-._`-.__.-'_.-'    _.-'
     |`-._`-._    `-.__.-'    _.-'_.-'|
     |    `-._`-._        _.-'_.-'    |
      `-._    `-._`-.__.-'_.-'    _.-'
          `-._    `-.__.-'    _.-'
              `-._        _.-'
                  `-.__.-'
    ...
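Since nomad logs needs an allocation ID, a small filter can extract the IDs from the status output instead of copying them by hand. This is a sketch that assumes the usual multi-line layout of nomad status: a line reading "Allocations", then a header row, then one row per allocation. alloc_ids is a hypothetical helper name.

```shell
# Print the allocation IDs listed under the "Allocations" heading.
# Reads `nomad status <job>` output on stdin.
alloc_ids() {
    awk '
        # Rows after the heading: row 1 is the column header, the rest are data.
        in_allocs && ++row > 1 && NF > 0 { print $1 }
        $1 == "Allocations"              { in_allocs = 1 }
    '
}

# Usage (needs a running job; "redis" is the task name from example.nomad):
#   for id in $(nomad status example | alloc_ids); do
#       nomad logs "$id" redis
#   done
```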

Modifying the job

Open the job file in an editor:

    # vim example.nomad

Find count = 1 in the file and change it to count = 3. After saving the change, run nomad plan example.nomad:

    # nomad plan example.nomad
    +/- Job: "example"
    +/- Task Group: "cache" (2 create, 1 in-place update)
      +/- Count: "1" => "3" (forces create)
          Task: "redis"

    Scheduler dry-run:
    - All tasks successfully allocated.

    Job Modify Index: 7
    To submit the job with version verification run:

    nomad run -check-index 7 example.nomad

    When running the job with the check-index flag, the job will only be run if the
    server side version matches the job modify index returned. If the index has
    changed, another user has modified the job and the plan's results are
    potentially invalid.

Use the command suggested in the plan output to update the job.

    # nomad run -check-index 7 example.nomad
    ==> Monitoring evaluation "93d16471"
        Evaluation triggered by job "example"
        Evaluation within deployment: "0d06e1b6"
        Allocation "3249e320" created: node "e42d6f19", group "cache"
        Allocation "453b210f" created: node "e42d6f19", group "cache"
        Allocation "883269bf" modified: node "e42d6f19", group "cache"
        Evaluation status changed: "pending" -> "complete"
    ==> Evaluation "93d16471" finished with status "complete"
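The plan-then-run sequence can be chained in a script by reading the modify index out of the plan output, so that the run is rejected if someone else changed the job in between. A minimal sketch, assuming the plan output contains a line of the form "Job Modify Index: N" as shown in this article; plan_modify_index is a hypothetical helper name.

```shell
# Print the number from the "Job Modify Index: N" line.
# Reads `nomad plan` output on stdin.
plan_modify_index() {
    awk '/Job Modify Index:/ { print $4; exit }'
}

# Usage (needs a running agent):
#   idx=$(nomad plan example.nomad | plan_modify_index)
#   nomad run -check-index "$idx" example.nomad
```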

To stop the job, use the nomad stop command. Afterwards, nomad status shows the job's status as dead (stopped).

    # nomad stop example
    ==> Monitoring evaluation "6d4cd6ca"
        Evaluation triggered by job "example"
        Evaluation within deployment: "f4047b3a"
        Evaluation status changed: "pending" -> "complete"
    ==> Evaluation "6d4cd6ca" finished with status "complete"

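Building a simple Nomad cluster

A Nomad cluster has two parts: servers and clients. Each region needs at least one server, and a cluster of 3 or 5 servers is recommended. The Nomad client is a very lightweight process that registers its host, sends heartbeats, and runs the tasks that the servers assign to it. An agent must run on every node in the cluster so that the servers can assign work to those machines.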
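The recommendation of 3 or 5 servers comes from how the servers agree on state. Nomad servers use Raft (visible in the agent logs in this article), which needs a majority of servers to be reachable. The arithmetic can be sketched as follows; quorum and fault_tolerance are hypothetical helper names used only for illustration.

```shell
# A majority (quorum) of n servers is floor(n/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a quorum is still reachable.
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 5; do
    echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A single server tolerates no failures, three servers tolerate one, and five tolerate two. An even server count adds no extra tolerance over the odd count below it, which is why odd sizes are the usual choice.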

Starting the server

The first step is to create the server's configuration file. Either download it from GitHub or paste the following into a file named server.hcl:

    # vim server.hcl

    # Increase log verbosity
    log_level = "DEBUG"

    # Set the datacenter
    datacenter = "dc1"

    # Setup data dir
    data_dir = "/tmp/server1"

    # Enable the server
    server {
      enabled = true

      # Self-elect, should be 3 or 5 for production
      bootstrap_expect = 1
    }

This is a fairly minimal server configuration file, but it is enough to start an agent in server-only mode and have it elected leader. The main change needed for production is to run more than one server and to set bootstrap_expect accordingly.
Once the file is created, start the agent in a new tab:

    $ sudo nomad agent -config server.hcl
    ==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
    ==> Starting Nomad agent...
    ==> Nomad agent configuration:

                     Client: false
                  Log Level: DEBUG
                     Region: global (DC: dc1)
                     Server: true
                    Version: 0.6.0

    ==> Nomad agent started! Log data will stream in below:

        [INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
        [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
        [INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
        [INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
        [WARN] raft: Heartbeat timeout reached, starting election
        [INFO] raft: Node at 127.0.0.1:4647 [Candidate] entering Candidate state
        [DEBUG] raft: Votes needed: 1
        [DEBUG] raft: Vote granted. Tally: 1
        [INFO] raft: Election won. Tally: 1
        [INFO] raft: Node at 127.0.0.1:4647 [Leader] entering Leader state
        [INFO] nomad: cluster leadership acquired
        [INFO] raft: Disabling EnableSingleNode (bootstrap)
        [DEBUG] raft: Node 127.0.0.1:4647 updated peer set (2): [127.0.0.1:4647]

We can see that client mode is disabled and that we are running only as a server. This means the server will manage state and make scheduling decisions but will not run any tasks. Now we need some agents to run tasks!

Starting the clients

As with the server, we first have to configure the clients. Download the client1 and client2 configurations from GitHub, or paste the following into client1.hcl:

    # Increase log verbosity
    log_level = "DEBUG"

    # Setup data dir
    data_dir = "/tmp/client1"

    # Enable the client
    client {
      enabled = true

      # For demo assume we are talking to server1. For production,
      # this should be like "nomad.service.consul:4647" and a system
      # like Consul used for service discovery.
      servers = ["127.0.0.1:4647"]
    }

    # Modify our port to avoid a collision with server1
    ports {
      http = 5656
    }

Copy this file to client2.hcl, change data_dir to "/tmp/client2" and change the http port to 5657. Once both client1.hcl and client2.hcl exist, open a tab for each and start the agents:
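The copy-and-edit step can also be done with sed, which avoids typos when more clients are added later. This is a sketch under stated assumptions: client1.hcl is in the current directory and contains the literal strings "/tmp/client1" and "http = 5656", and each further client simply takes the next port number. make_client_config is a hypothetical helper name.

```shell
# Derive client<N>.hcl from client1.hcl by rewriting data_dir and the HTTP port.
# Client 1 uses port 5656, so client N is given port 5655 + N.
make_client_config() {
    n="$1"
    port=$(( 5655 + n ))
    sed -e "s#/tmp/client1#/tmp/client${n}#" \
        -e "s/http = 5656/http = ${port}/" \
        client1.hcl > "client${n}.hcl"
}

# Usage:
#   make_client_config 2    # writes client2.hcl with /tmp/client2 and port 5657
```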

    $ sudo nomad agent -config client1.hcl
    ==> Starting Nomad agent...
    ==> Nomad agent configuration:

                     Client: true
                  Log Level: DEBUG
                     Region: global (DC: dc1)
                     Server: false
                    Version: 0.6.0

    ==> Nomad agent started! Log data will stream in below:

        [DEBUG] client: applied fingerprints [host memory storage arch cpu]
        [DEBUG] client: available drivers [docker exec]
        [DEBUG] client: node registration complete
        ...

In the output we can see that the agent is running in client mode only. It is available to run tasks but takes no part in managing the cluster or making scheduling decisions.
With the node-status command we should see both nodes in the ready state:

    # nomad node-status
    ID        Datacenter  Name   Class   Drain  Status
    fca62612  dc1         nomad  <none>  false  ready
    c887deef  dc1         nomad  <none>  false  ready

We now have a simple three-node cluster running. The only difference between this demo and a full production cluster is that we are running a single server instead of three or five.

Submitting a job

Now that we have a simple cluster, we can use it to schedule a job. We should still have the example.nomad job file from earlier, but confirm that count is still set to 3. If the file is gone, run nomad init again and change count to 3 first.
Then submit the job with the run command:

    # nomad run example.nomad
    ==> Monitoring evaluation "8e0a7cf9"
        Evaluation triggered by job "example"
        Evaluation within deployment: "0917b771"
        Allocation "501154ac" created: node "c887deef", group "cache"
        Allocation "7e2b3900" created: node "fca62612", group "cache"
        Allocation "9c66fcaf" created: node "c887deef", group "cache"
        Evaluation status changed: "pending" -> "complete"
    ==> Evaluation "8e0a7cf9" finished with status "complete"

We can see in the output that the scheduler placed two of the tasks on one client node and the remaining task on the second client.
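With more than a handful of allocations, counting placements by eye gets tedious. The sketch below tallies allocations per node from the run output. It assumes lines of the form Allocation "ID" created: node "NODE", as printed by nomad run in this article; allocs_per_node is a hypothetical helper name.

```shell
# Count created allocations per node.
# Reads `nomad run` output on stdin and prints "<node> <count>", sorted by node.
allocs_per_node() {
    # Splitting on double quotes puts the allocation ID in $2 and the node ID in $4.
    awk -F'"' '/Allocation ".*" created: node "/ { count[$4]++ }
               END { for (node in count) print node, count[node] }' | sort
}

# Usage (needs a running cluster):
#   nomad run example.nomad | allocs_per_node
```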
We can verify this with the status command again:

    # nomad status example
    ID            = example
    Name          = example
    Submit Date   = 07/26/17 16:34:58 UTC
    Type          = service
    Priority      = 50
    Datacenters   = dc1
    Status        = running
    Periodic      = false
    Parameterized = false

    Summary
    Task Group  Queued  Starting  Running  Failed  Complete  Lost
    cache       0       0         3        0       0         0

    Latest Deployment
    ID          = fc49bd6c
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy
    cache       3        3       0        0

    Allocations
    ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
    501154ac  8e0a7cf9  c887deef  cache       run      running  08/08/16 21:03:19 CDT
    7e2b3900  8e0a7cf9  fca62612  cache       run      running  08/08/16 21:03:19 CDT
    9c66fcaf  8e0a7cf9  c887deef  cache       run      running  08/08/16 21:03:19 CDT

We can see that all of our tasks have been allocated and are running. Once we are done with the job, we can tear it down with nomad stop.

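Using the Nomad UI

Nomad has bundled a UI since version 0.7. Before 0.7 there was no good official UI, so I found a UI written by a skilled developer on GitHub: https://github.com/jippi/hashi-ui.
Opinions will differ, but in my experience hashi-ui is quite good and shows a lot of detail, whereas the official UI does not yet offer as many features.

Official UI

Download the nomad project from GitHub to your machine. The UI lives at: https://github.com/hashicorp/nomad/tree/master/ui

Read the README carefully and install Node.js, Yarn, Ember CLI and PhantomJS in your local environment.

Installation

    # cd ui/
    # yarn

When installation finishes, run ember serve --proxy http://10.30.0.52:4646 (replace 10.30.0.52 with your external IP and 4646 with your custom port). You can then open the UI in a browser.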

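FAQ

The service listens on 127.0.0.1 and cannot be reached from outside?

Specify the address or network interface on the command line when starting nomad agent. For example:

    # nomad agent -config server.hcl -bind=0.0.0.0
    # nomad agent -config client1.hcl -network-interface=ens160

When a service runs under Docker, the container port is mapped to a random local port?
According to the official documentation, ports are assigned dynamically by default. To use a static port, define it in the job configuration file.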
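When a port is left dynamic, the address that was actually assigned appears in the Addresses column of nomad alloc-status (for example "db: 127.0.0.1:22672" in the output earlier in this article). A minimal sketch for extracting it, assuming the address is printed as "label: ip:port" after some whitespace; port_address is a hypothetical helper name.

```shell
# Print the "ip:port" assigned to a port label.
# Reads `nomad alloc-status` output on stdin; $1 is the port label
# from the job file (for example "db").
port_address() {
    label="$1"
    sed -n "s/.*[[:space:]]${label}: \([0-9.]*:[0-9]*\).*/\1/p"
}

# Usage (needs a running allocation):
#   nomad alloc-status 883269bf | port_address db
```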

Simple job configuration files

hello world

    # cat hello.nomad
    job "hello1" {
      datacenters = ["dc1"]                 # datacenters to run in

      group "hello2" {                      # group name
        task "hello3" {                     # task name, usually the service name
          driver = "docker"                 # use the docker driver

          config {
            image = "hashicorp/http-echo"   # service image
            args = [                        # arguments passed to the container
              "-listen", ":5678",
              "-text", "hello world",
            ]
          }

          resources {                       # resources for the service
            network {
              mbits = 10                    # 10 Mbit/s of network bandwidth
              port "http" {
                static = 5678               # use a static port
              }
            }
          }
        }
      }
    }

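Next, set up Redmine. I have not yet worked out how to make Nomad start dependent services together the way docker-compose does, so MySQL has to be started separately beforehand.

    # cat redmine-example.nomad
    job "redmine" {
      region      = "global"        # region
      datacenters = ["dc1"]         # datacenters
      type        = "service"       # scheduler type for long-running services; Consul registration comes from the service block below

      update {
        max_parallel     = 1        # number of allocations updated at the same time
        min_healthy_time = "10s"    # minimum time an allocation must be healthy before it is marked healthy
        healthy_deadline = "3m"     # deadline for becoming healthy; after it the allocation is marked unhealthy
        auto_revert      = false    # whether to revert to the last stable job version when a deployment fails
        canary           = 0        # number of canaries created alongside the old version on a job change; once healthy they are promoted and the old version is replaced
      }

      group "redmine" {
        count = 1                   # number of instances to run

        restart {
          attempts = 10             # restarts allowed within one interval
          interval = "5m"           # window over which attempts are counted; once they are used up, mode decides what happens
          delay    = "25s"          # time to wait before restarting a task
          mode     = "delay"        # when attempts are used up, wait for the next interval before restarting again
        }

        ephemeral_disk {            # ephemeral disk, in MB
          size = 300
        }

        task "redmine" {
          driver = "docker"

          env {                     # environment variables
            REDMINE_DB_MYSQL    = "10.30.0.52"   # MySQL host
            REDMINE_DB_PORT     = "3306"         # MySQL port
            REDMINE_DB_PASSWORD = "passwd"
            REDMINE_DB_USER     = "root"
            REDMINE_DB_NAME     = "redmine"
          }

          config {
            image = "redmine:yc"
            port_map {              # map the port label to the container port
              re = 3000
            }
          }

          logs {
            max_files     = 10      # maximum number of log files
            max_file_size = 15      # size of a single log file, in MB
          }

          resources {               # limit the service's CPU, memory and network
            cpu    = 500            # 500 MHz
            memory = 256            # 256 MB
            network {
              mbits = 10
              port "re" {}          # the port label mapped in port_map above
            }
          }

          service {
            name = "global-redmine-check"        # service name, with a health check
            tags = ["global", "redmine"]
            port = "re"
            check {
              name     = "alive"
              type     = "tcp"
              interval = "10s"
              timeout  = "2s"
            }
          }
        }
      }
    }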
