Troubleshooting
Application Failure
Lab:
The things I got wrong most often:
The troubleshooting followed these steps:
Control Plane Failure
First, check whether the nodes are healthy:
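A minimal sketch of this check, assuming `kubectl` is already configured for the affected cluster. The helper name `check_nodes` is hypothetical and only there so the command can be reused in a shell session.

```shell
# Assumption: kubectl points at the cluster being debugged.
# check_nodes is a hypothetical helper name, not a kubectl feature.
check_nodes() {
  # STATUS should read Ready for every node; -o wide adds IPs and versions.
  kubectl get nodes -o wide
}
```

Any node that is not `Ready` is the first place to look.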
Then check the status of the pods:
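On a kubeadm-style cluster the control plane components run as pods in the `kube-system` namespace, so a sketch of this step could look like the following. It assumes that layout; `check_system_pods` is a hypothetical helper name.

```shell
# Assumption: control plane components run as pods in kube-system
# (the kubeadm layout). check_system_pods is a hypothetical name.
check_system_pods() {
  # Look for pods that are not Running or that have a high RESTARTS count.
  kubectl get pods -n kube-system -o wide
}
```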
If the control plane components are deployed as system services rather than pods, check their status directly on the host:
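A sketch for systemd-based hosts, to be run on the control plane node. The unit names below are the conventional ones and are an assumption; your installation may name them differently. `check_control_plane_services` is a hypothetical helper name.

```shell
# Assumption: components are installed as systemd units with these
# conventional names. Adjust the list to match your installation.
check_control_plane_services() {
  for svc in kube-apiserver kube-controller-manager kube-scheduler; do
    # --no-pager keeps the output non-interactive.
    systemctl status "$svc" --no-pager
  done
}
```

On worker nodes the equivalent units to check are usually `kubelet` and, if it is not run as a pod, `kube-proxy`.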
Check the service logs:
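Where the logs live depends on how the component is deployed. The sketch below covers both cases under stated assumptions: `service_logs` reads the journal of a systemd unit, and `pod_logs` reads a pod in `kube-system`. Both helper names are hypothetical, and the pod name must be taken from your own cluster.

```shell
# Component runs as a systemd unit: read its journal.
# Defaults to kube-apiserver; pass another unit name as the first argument.
service_logs() {
  journalctl -u "${1:-kube-apiserver}" --no-pager -n 50
}

# Component runs as a pod (kubeadm layout): read the pod logs.
# The pod name is cluster-specific, so it is a required argument.
pod_logs() {
  kubectl logs -n kube-system "${1:?pod name required}"
}
```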
Lab:
Worker Node Failure

When a worker node stops communicating with the master, its status is shown as Unknown:

Then check the status of the nodes in detail:
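A sketch of a closer look at a single node, assuming `kubectl` access. `describe_node` is a hypothetical helper name and the node name comes from `kubectl get nodes`.

```shell
# describe_node is a hypothetical helper; the node name is required.
describe_node() {
  # The Conditions section lists MemoryPressure, DiskPressure,
  # PIDPressure and Ready, each with a LastHeartbeatTime.
  kubectl describe node "${1:?node name required}"
}
```

The `LastHeartbeatTime` of the conditions gives a rough idea of when the node stopped reporting.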
Also check the kubelet status on the worker nodes:
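A sketch to run on the worker node itself, assuming the kubelet is managed by systemd (the usual case). `check_kubelet` is a hypothetical helper name.

```shell
# Assumption: kubelet runs as a systemd unit named "kubelet".
check_kubelet() {
  # Is the unit active, and when did it last start or fail?
  systemctl status kubelet --no-pager
  # The last 50 journal lines usually contain the reason for a failure.
  journalctl -u kubelet --no-pager -n 50
}
```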
Check Certificates:
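A sketch for inspecting a kubelet certificate with `openssl`. The certificate path varies by installation, so it is passed as an argument rather than assumed; `inspect_cert` is a hypothetical helper name.

```shell
# inspect_cert is a hypothetical helper. Pass the path to the certificate
# your kubelet actually uses; it differs between installations.
inspect_cert() {
  # Prints the subject, the issuing CA, and the notBefore/notAfter dates.
  openssl x509 -in "${1:?certificate path required}" -noout -subject -issuer -dates
}
```

Check that the certificate has not expired and that it was issued by the cluster CA.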
Lab:

Also, after editing /etc/kubernetes/kubelet.conf, make sure to restart the kubelet so the change takes effect:
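A sketch of the restart, assuming a systemd-managed kubelet and `sudo` rights on the node. `restart_kubelet` is a hypothetical helper name.

```shell
# Assumption: kubelet is a systemd unit and the user may use sudo.
restart_kubelet() {
  sudo systemctl restart kubelet
  # Prints "active" once the unit is running again.
  sudo systemctl is-active kubelet
}
```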
Network Troubleshooting
CoreDNS Troubleshooting Commands
Check if a network plugin is installed:
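One way to sketch this check is to look for CNI configuration files on the node. It assumes the default CNI configuration directory `/etc/cni/net.d`; `check_cni` is a hypothetical helper name.

```shell
# Assumption: the node uses the default CNI config directory.
# Pass a different directory as the first argument if yours differs.
check_cni() {
  conf_dir=${1:-/etc/cni/net.d}
  if [ -n "$(ls -A "$conf_dir" 2>/dev/null)" ]; then
    echo "CNI config found in $conf_dir:"
    ls "$conf_dir"
  else
    # An empty or missing directory suggests no network plugin is installed.
    echo "no CNI config in $conf_dir"
    return 1
  fi
}
```

CoreDNS pods that stay in `Pending` are a common symptom of a missing network plugin.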
Upgrade Docker (specific commands can vary based on your system):
Disable SELinux:
Modify CoreDNS to allow privilege escalation:
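A sketch of this change, following the approach described in the kubeadm troubleshooting documentation. It assumes the deployment is named `coredns` in `kube-system`; the wrapper name `allow_privilege_escalation` is hypothetical.

```shell
# Assumption: CoreDNS runs as the "coredns" deployment in kube-system.
allow_privilege_escalation() {
  # Dump the deployment, flip the securityContext flag, and re-apply it.
  kubectl -n kube-system get deployment coredns -o yaml \
    | sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' \
    | kubectl apply -f -
}
```

This loosens the pod's security context, so treat it as a workaround rather than a permanent setting.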
Edit the kubelet config file (usually at /var/lib/kubelet/config.yaml) and add:
Edit the Corefile to forward DNS queries directly to an upstream DNS server:
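A sketch of the Corefile change as a text filter, assuming the Corefile currently contains the common `forward . /etc/resolv.conf` line. `set_upstream_dns` is a hypothetical helper, and `8.8.8.8` is only an example resolver; substitute your own upstream DNS server.

```shell
# Reads a Corefile (or the coredns ConfigMap YAML) on stdin and replaces
# the resolv.conf forward target with an explicit upstream address.
# 8.8.8.8 is an example default, not a recommendation.
set_upstream_dns() {
  upstream=${1:-8.8.8.8}
  sed "s|forward \. /etc/resolv.conf|forward . $upstream|"
}
```

The same edit can be made by hand with `kubectl -n kube-system edit configmap coredns`.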
Check kube-dns service endpoints:
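A minimal sketch, assuming the cluster DNS service keeps its conventional name `kube-dns` in `kube-system`. `check_dns_endpoints` is a hypothetical helper name.

```shell
# Assumption: the DNS service is named kube-dns in kube-system.
check_dns_endpoints() {
  # ENDPOINTS should list the IPs of the running CoreDNS pods.
  kubectl -n kube-system get endpoints kube-dns
}
```

An empty `ENDPOINTS` column means the service selector matches no ready CoreDNS pod.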
Kube-Proxy Troubleshooting Commands
Check kube-proxy pod status:
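A sketch assuming kube-proxy runs as a DaemonSet whose pods carry the label `k8s-app=kube-proxy`, as on kubeadm clusters. `check_kube_proxy_pods` is a hypothetical helper name.

```shell
# Assumption: kube-proxy pods are labelled k8s-app=kube-proxy (kubeadm).
check_kube_proxy_pods() {
  # Expect one Running pod per node; -o wide shows which node each is on.
  kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide
}
```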
Check kube-proxy logs:
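A sketch for reading the logs of one kube-proxy pod. The pod name is cluster-specific, so it is a required argument; `kube_proxy_logs` is a hypothetical helper name.

```shell
# kube_proxy_logs is a hypothetical helper; take the pod name from
# "kubectl get pods -n kube-system".
kube_proxy_logs() {
  kubectl logs -n kube-system "${1:?kube-proxy pod name required}"
}
```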
Verify kube-proxy configmap:
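A sketch assuming the configuration lives in a ConfigMap named `kube-proxy` in `kube-system`, which is the kubeadm default. `show_kube_proxy_config` is a hypothetical helper name.

```shell
# Assumption: the ConfigMap is named kube-proxy (kubeadm default).
show_kube_proxy_config() {
  # Check the proxy mode, clusterCIDR, and the embedded kubeconfig.
  kubectl -n kube-system describe configmap kube-proxy
}
```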
Check for kube-proxy network bindings:
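A sketch to run on the node, assuming `netstat` (from net-tools) is installed and the user may use `sudo`. `kube_proxy_sockets` is a hypothetical helper name.

```shell
# Assumption: netstat is available on the node; sudo is needed so the
# owning process name is shown for every socket.
kube_proxy_sockets() {
  # Keep only the sockets owned by the kube-proxy process.
  sudo netstat -plan | grep kube-proxy
}
```

No output means kube-proxy is not listening on the node at all.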
Debug Service issues:
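A sketch of a first pass over a misbehaving Service, assuming `kubectl` access. `debug_service` is a hypothetical helper; the namespace defaults to `default`.

```shell
# debug_service is a hypothetical helper: service name, then namespace.
debug_service() {
  svc=${1:?service name required}
  ns=${2:-default}
  # -o wide adds the SELECTOR column, to compare against the pod labels.
  kubectl get service "$svc" -n "$ns" -o wide
  # No endpoints means the selector matches no ready pod.
  kubectl get endpoints "$svc" -n "$ns"
}
```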
DNS Troubleshooting:
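A sketch that resolves a name from inside the cluster using a throwaway pod. The pod name `dns-test` and the `busybox:1.28` image are assumptions chosen for illustration; `test_dns` is a hypothetical helper name.

```shell
# Runs nslookup in a temporary pod that is deleted when the command exits.
# dns-test and busybox:1.28 are illustrative choices, not requirements.
test_dns() {
  name=${1:-kubernetes.default}
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.28 \
    -- nslookup "$name"
}
```

If `kubernetes.default` resolves but an external name does not, the problem is more likely in the upstream forwarding than in CoreDNS itself.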
Lab:

Slides
