當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

K8s中Pod健康检查源代码分析

發布時間：2024/8/23 编程问答 31 豆豆

生活随笔收集整理的這篇文章主要介紹了 K8s中Pod健康检查源代码分析小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

了解k8s中的Liveness和Readiness

Liveness:?
表明是否容器正在運行。如果liveness探測為fail，則kubelet會kill掉容器，并且會觸發restart設置的策略。默認不設置的情況下，該狀態為success.
Readiness:?
表明容器是否可以接受服務請求。如果readiness探測失敗，則endpoints控制器會從endpoints中摘除該Pod IP。在初始化延遲探測時間之前，默認是Failure。如果沒有設置readiness探測，該狀態為success。

代碼分析

基于Kubernetes 1.11.0

1.啟動探測

在kubelet啟動是時候會啟動健康檢查的探測：
kubelet.go中Run方法

... kl.probeManager.Start() //啟動探測服務 ...

2.看一下probeManager都做了哪些事情

prober_manager.go中我們看一下這段代碼：

// Manager manages pod probing. It creates a probe "worker" for every container that specifies a // probe (AddPod). The worker periodically probes its assigned container and caches the results. The // manager use the cached probe results to set the appropriate Ready state in the PodStatus when // requested (UpdatePodStatus). Updating probe parameters is not currently supported. // TODO: Move liveness probing out of the runtime, to here. type Manager interface {// AddPod creates new probe workers for every container probe. This should be called for every// pod created.AddPod(pod *v1.Pod)// RemovePod handles cleaning up the removed pod state, including terminating probe workers and// deleting cached results.RemovePod(pod *v1.Pod)// CleanupPods handles cleaning up pods which should no longer be running.// It takes a list of "active pods" which should not be cleaned up.CleanupPods(activePods []*v1.Pod)// UpdatePodStatus modifies the given PodStatus with the appropriate Ready state for each// container based on container running status, cached probe results and worker states.UpdatePodStatus(types.UID, *v1.PodStatus)// Start starts the Manager sync loops.Start() }

這是一個Manager的接口聲明，該Manager負載pod的探測。當執行AddPod時，會為Pod中每一個容器創建一個執行探測任務的worker, 該worker會對所分配的容器進行周期性的探測，并把探測結果緩存。當UpdatePodStatus方法執行時，該manager會使用探測的緩存結果設置PodStatus為近似Ready的狀態：

3.一“探”究竟

先看一下探測的struct

type Probe struct {// The action taken to determine the health of a containerHandler `json:",inline" protobuf:"bytes,1,opt,name=handler"`// Number of seconds after the container has started before liveness probes are initiated.// More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes// +optionalInitialDelaySeconds int32 `json:"initialDelaySeconds,omitempty" protobuf:"varint,2,opt,name=initialDelaySeconds"`// Number of seconds after which the probe times out.// Defaults to 1 second. Minimum value is 1.// More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes// +optionalTimeoutSeconds int32 `json:"timeoutSeconds,omitempty" protobuf:"varint,3,opt,name=timeoutSeconds"`// How often (in seconds) to perform the probe.// Default to 10 seconds. Minimum value is 1.// +optionalPeriodSeconds int32 `json:"periodSeconds,omitempty" protobuf:"varint,4,opt,name=periodSeconds"`// Minimum consecutive successes for the probe to be considered successful after having failed.// Defaults to 1. Must be 1 for liveness. Minimum value is 1.// +optionalSuccessThreshold int32 `json:"successThreshold,omitempty" protobuf:"varint,5,opt,name=successThreshold"`// Minimum consecutive failures for the probe to be considered failed after having succeeded.// Defaults to 3. Minimum value is 1.// +optionalFailureThreshold int32 `json:"failureThreshold,omitempty" protobuf:"varint,6,opt,name=failureThreshold"` }

initialDelaySeconds: 表示容器啟動之后延遲多久進行liveness探測
timeoutSeconds：每次執行探測的超時時間
periodSeconds：探測的周期時間
successThreshold：最少連續幾次探測成功的次數，滿足該次數則認為success。
failureThreshold：最少連續幾次探測失敗的次數，滿足該次數則認為fail

Handler:
不論是liveness還是readiness都支持3種類型的探測方式：執行命令、http方式以及tcp方式。

// Handler defines a specific action that should be taken // TODO: pass structured data to these actions, and document that data here. type Handler struct {// One and only one of the following should be specified.// Exec specifies the action to take.// +optionalExec *ExecAction `json:"exec,omitempty" protobuf:"bytes,1,opt,name=exec"`// HTTPGet specifies the http request to perform.// +optionalHTTPGet *HTTPGetAction `json:"httpGet,omitempty" protobuf:"bytes,2,opt,name=httpGet"`// TCPSocket specifies an action involving a TCP port.// TCP hooks not yet supported// TODO: implement a realistic TCP lifecycle hook// +optionalTCPSocket *TCPSocketAction `json:"tcpSocket,omitempty" protobuf:"bytes,3,opt,name=tcpSocket"` }

接下來看一下prober.go中的runProbe方法。

func (pb *prober) runProbe(probeType probeType, p *v1.Probe, pod *v1.Pod, status v1.PodStatus, container v1.Container, containerID kubecontainer.ContainerID) (probe.Result, string, error) {timeout := time.Duration(p.TimeoutSeconds) * time.Secondif p.Exec != nil {glog.V(4).Infof("Exec-Probe Pod: %v, Container: %v, Command: %v", pod, container, p.Exec.Command)command := kubecontainer.ExpandContainerCommandOnlyStatic(p.Exec.Command, container.Env)return pb.exec.Probe(pb.newExecInContainer(container, containerID, command, timeout))}if p.HTTPGet != nil {scheme := strings.ToLower(string(p.HTTPGet.Scheme))host := p.HTTPGet.Hostif host == "" {host = status.PodIP}port, err := extractPort(p.HTTPGet.Port, container)if err != nil {return probe.Unknown, "", err}path := p.HTTPGet.Pathglog.V(4).Infof("HTTP-Probe Host: %v://%v, Port: %v, Path: %v", scheme, host, port, path)url := formatURL(scheme, host, port, path)headers := buildHeader(p.HTTPGet.HTTPHeaders)glog.V(4).Infof("HTTP-Probe Headers: %v", headers)if probeType == liveness {return pb.livenessHttp.Probe(url, headers, timeout)} else { // readinessreturn pb.readinessHttp.Probe(url, headers, timeout)}}if p.TCPSocket != nil {port, err := extractPort(p.TCPSocket.Port, container)if err != nil {return probe.Unknown, "", err}host := p.TCPSocket.Hostif host == "" {host = status.PodIP}glog.V(4).Infof("TCP-Probe Host: %v, Port: %v, Timeout: %v", host, port, timeout)return pb.tcp.Probe(host, port, timeout)}glog.Warningf("Failed to find probe builder for container: %v", container)return probe.Unknown, "", fmt.Errorf("Missing probe handler for %s:%s", format.Pod(pod), container.Name) }

1.執行命令方式
通過newExecInContainer方法調用CRI執行命令：

// ExecAction describes a "run in container" action. type ExecAction struct {// Command is the command line to execute inside the container, the working directory for the// command is root ('/') in the container's filesystem. The command is simply exec'd, it is// not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use// a shell, you need to explicitly call out to that shell.// Exit status of 0 is treated as live/healthy and non-zero is unhealthy.// +optionalCommand []string `json:"command,omitempty" protobuf:"bytes,1,rep,name=command"` }

2.http GET方式
通過http GET方式進行探測。
Port：表示訪問容器的端口
Host：表示訪問的主機，默認是Pod IP

// HTTPGetAction describes an action based on HTTP Get requests. type HTTPGetAction struct {// Path to access on the HTTP server.// +optionalPath string `json:"path,omitempty" protobuf:"bytes,1,opt,name=path"`// Name or number of the port to access on the container.// Number must be in the range 1 to 65535.// Name must be an IANA_SVC_NAME.Port intstr.IntOrString `json:"port" protobuf:"bytes,2,opt,name=port"`// Host name to connect to, defaults to the pod IP. You probably want to set// "Host" in httpHeaders instead.// +optionalHost string `json:"host,omitempty" protobuf:"bytes,3,opt,name=host"`// Scheme to use for connecting to the host.// Defaults to HTTP.// +optionalScheme URIScheme `json:"scheme,omitempty" protobuf:"bytes,4,opt,name=scheme,casttype=URIScheme"`// Custom headers to set in the request. HTTP allows repeated headers.// +optionalHTTPHeaders []HTTPHeader `json:"httpHeaders,omitempty" protobuf:"bytes,5,rep,name=httpHeaders"` }

3.tcp方式
通過設置主機和端口即可進行tcp方式訪問

// TCPSocketAction describes an action based on opening a socket type TCPSocketAction struct {// Number or name of the port to access on the container.// Number must be in the range 1 to 65535.// Name must be an IANA_SVC_NAME.Port intstr.IntOrString `json:"port" protobuf:"bytes,1,opt,name=port"`// Optional: Host name to connect to, defaults to the pod IP.// +optionalHost string `json:"host,omitempty" protobuf:"bytes,2,opt,name=host"` }

此處腦洞一下：如果三種探測方式都設置了，會如何執行處理？

思考

通過k8s部署生產環境應用時，建議設置上liveness和readiness, 這也是保障服務穩定性的最佳實踐。
另外由于Pod Ready不能保證實際的業務應用Ready可用，在最新的 1.14 版本中新增了一個Pod Readiness Gates?特性。通過這個特性，可以保證應用Ready后進而設置Pod Ready。

結尾

針對上面的腦洞：如果三種探測方式都設置了，會如何執行處理？
答：我們如果在Pod中設置多個探測方式，提交配置的時候會直接報錯：

此處繼續源代碼：在validation.go中validateHandler中進行了限制(也為上面Handler struct提到的"One and only one of the following should be specified."提供了事實依據)

func validateHandler(handler *core.Handler, fldPath *field.Path) field.ErrorList {numHandlers := 0allErrors := field.ErrorList{}if handler.Exec != nil {if numHandlers > 0 {allErrors = append(allErrors, field.Forbidden(fldPath.Child("exec"), "may not specify more than 1 handler type"))} else {numHandlers++allErrors = append(allErrors, validateExecAction(handler.Exec, fldPath.Child("exec"))...)}}if handler.HTTPGet != nil {if numHandlers > 0 {allErrors = append(allErrors, field.Forbidden(fldPath.Child("httpGet"), "may not specify more than 1 handler type"))} else {numHandlers++allErrors = append(allErrors, validateHTTPGetAction(handler.HTTPGet, fldPath.Child("httpGet"))...)}}if handler.TCPSocket != nil {if numHandlers > 0 {allErrors = append(allErrors, field.Forbidden(fldPath.Child("tcpSocket"), "may not specify more than 1 handler type"))} else {numHandlers++allErrors = append(allErrors, validateTCPSocketAction(handler.TCPSocket, fldPath.Child("tcpSocket"))...)}}if numHandlers == 0 {allErrors = append(allErrors, field.Required(fldPath, "must specify a handler type"))}return allErrors }

原文鏈接
本文為云棲社區原創內容，未經允許不得轉載。

創作挑戰賽新人創作獎勵來咯，堅持創作打卡瓜分現金大獎

總結

以上是生活随笔為你收集整理的K8s中Pod健康检查源代码分析的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇：阿里云ACM：云原生配置管理利器，让云上
下一篇：深入浅出网络编程与Swoole内核