K8S Troubleshooting Methods
1. Check which pods have problems. A STATUS of Running is normal; anything else needs a closer look:
    /opt/kubernetes/bin/kubectl get pods --all-namespaces -owide

    NAMESPACE   NAME                    READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE
    default     nginx-dbddb74b8-d78cd   1/1     Running   0          17m   172.17.90.3   192.168.18.148   <none>

2. View the details of the abnormal pod
    /opt/kubernetes/bin/kubectl describe pods nginx-dbddb74b8-2hthr

In my case the events showed the following:

    Warning  FailedScheduling  32m (x2 over 32m)  default-scheduler  0/2 nodes are available: 2 node(s) had taints that the pod didn't tolerate.

Solution: see https://github.com/kubernetes-sigs/kubespray/issues/2798
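When the cluster runs many pods, steps 1 and 2 can be combined by filtering the pod list first. The following is only a minimal sketch: the function name `non_running_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods --all-namespaces`, in which STATUS is the fourth column. Pods of finished jobs (STATUS `Completed`) would also be reported.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods --all-namespaces` output on stdin
# and prints "NAMESPACE NAME STATUS" for every pod whose STATUS is not Running.
# Assumes the default column order: NAMESPACE NAME READY STATUS ...
non_running_pods() {
    # NR > 1 skips the header row
    awk 'NR > 1 && $4 != "Running" { print $1, $2, $4 }'
}

# Usage sketch (needs cluster access, so it is left commented out):
# /opt/kubernetes/bin/kubectl get pods --all-namespaces | non_running_pods |
# while read -r ns name status; do
#     echo "== $ns/$name ($status) =="
#     /opt/kubernetes/bin/kubectl describe pod "$name" -n "$ns"
# done
```

The Events section at the end of each `describe` output is usually where messages such as FailedScheduling appear.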
3. View the details of the abnormal service

    /opt/kubernetes/bin/kubectl describe services nginx

4. Check the status of the cluster nodes
    /opt/kubernetes/bin/kubectl get nodes -o wide

    NAME             STATUS     ROLES    AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
    192.168.18.147   NotReady   <none>   62m   v1.12.1   192.168.18.147   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://18.9.5

In my case the node was NotReady. Investigation showed that kubelet and kube-proxy on node 192.168.18.147 had stopped; once both services were started again, the node recovered.
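The node check can be scripted in the same spirit. This is a sketch under assumptions: `not_ready_nodes` is a hypothetical name, and the `systemctl` lines assume kubelet and kube-proxy run as systemd units called `kubelet` and `kube-proxy`, which the article does not state. A node that has been cordoned (STATUS `Ready,SchedulingDisabled`) would also be listed.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get nodes` output on stdin and prints the
# NAME of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    # NR > 1 skips the header row; NAME is column 1, STATUS is column 2
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage sketch (needs cluster access, so it is left commented out):
# /opt/kubernetes/bin/kubectl get nodes | not_ready_nodes
#
# Then, on each node that was reported (unit names are an assumption):
# systemctl status kubelet kube-proxy
# systemctl restart kubelet kube-proxy
```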
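5. View node details

    /opt/kubernetes/bin/kubectl describe node 192.168.18.147

The concrete fix for the following scheduling error:

    Warning  FailedScheduling  32m (x2 over 32m)  default-scheduler  0/2 nodes are available: 2 node(s) had taints that the pod didn't tolerate.

In the node details, Taints showed the following:

    Taints: node.kubernetes.io/unreachable:NoSchedule

Running the following command removed the taint and resolved the problem:

    [root@master tmp]# /opt/kubernetes/bin/kubectl taint nodes --all node.kubernetes.io/unreachable-
    node/192.168.18.147 untainted
    node/192.168.18.148 untainted

The node controller adds this taint automatically while a node is unreachable, so it comes back unless the underlying cause (for example a stopped kubelet, as in step 4) is fixed.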
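Before removing a taint from every node with `--all`, it can help to see exactly which taints each node carries. The sketch below is illustrative only: `node_taints` is a hypothetical name, and it assumes the layout of `kubectl describe node` output, where the first taint follows the `Taints:` label and any further taints appear on indented continuation lines. A node without taints prints `<none>`.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl describe node <name>` output on stdin and
# prints the taints from the "Taints:" field, one per line.
node_taints() {
    awk '
        /^Taints:/            { in_taints = 1; print $2; next }  # first taint
        in_taints && /^[ \t]/ { print $1; next }                 # continuation lines
                              { in_taints = 0 }                  # any other field ends the block
    '
}

# Usage sketch (needs cluster access, so it is left commented out):
# /opt/kubernetes/bin/kubectl describe node 192.168.18.147 | node_taints
#
# To remove the taint from a single node instead of all nodes:
# /opt/kubernetes/bin/kubectl taint nodes 192.168.18.147 node.kubernetes.io/unreachable-
```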
6. Check the status of the cluster components

    /opt/kubernetes/bin/kubectl get cs

    NAME                 STATUS    MESSAGE              ERROR
    scheduler            Healthy   ok
    controller-manager   Healthy   ok
    etcd-1               Healthy   {"health": "true"}
    etcd-2               Healthy   {"health": "true"}
    etcd-0               Healthy   {"health": "true"}
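For use in a monitoring script, the component check can be reduced to an exit code. This is a sketch, with `unhealthy_components` as a hypothetical name; it assumes the column layout shown above, with STATUS in the second column. Note that `kubectl get cs` (componentstatuses) is deprecated from Kubernetes v1.19 onward, so this applies mainly to older clusters such as the v1.12.1 one used here.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get cs` output on stdin, prints the NAME of
# every component whose STATUS is not "Healthy", and exits non-zero if any exist.
unhealthy_components() {
    awk 'NR > 1 && $2 != "Healthy" { print $1; bad = 1 }
         END { exit bad }'
}

# Usage sketch (needs cluster access, so it is left commented out):
# if ! /opt/kubernetes/bin/kubectl get cs | unhealthy_components; then
#     echo "one or more control-plane components are unhealthy" >&2
# fi
```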
7. View each service's cluster IP, ports, and age

    /opt/kubernetes/bin/kubectl get svc

    NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
    kubernetes   ClusterIP   10.0.0.1     <none>        443/TCP        4h51m
    nginx        NodePort    10.0.0.215   <none>        88:40675/TCP   92m
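For a NodePort service, the PORT(S) column has the form servicePort:nodePort/protocol, so `88:40675/TCP` means the service listens on port 88 inside the cluster and on port 40675 of each node. A small sketch for pulling these values apart follows; `node_ports` is a hypothetical name and the default column layout of `kubectl get svc` is assumed.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get svc` output on stdin and prints
# "NAME SERVICE_PORT NODE_PORT PROTOCOL" for each port of each NodePort service.
node_ports() {
    awk 'NR > 1 && $2 == "NodePort" {
        n = split($5, ports, ",")            # several ports are comma-separated
        for (i = 1; i <= n; i++) {
            split(ports[i], p, "[:/]")       # p[1]=service port, p[2]=node port, p[3]=protocol
            print $1, p[1], p[2], p[3]
        }
    }'
}

# Usage sketch (needs cluster access, so it is left commented out):
# /opt/kubernetes/bin/kubectl get svc | node_ports
#
# A quick reachability test from outside the cluster, assuming the node address
# 192.168.18.148 is reachable from where the command runs:
# curl -I http://192.168.18.148:40675/
```

For the service list shown above, the function prints `nginx 88 40675 TCP`.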