A different method to debug Kubernetes Pods

By Adam Stawski
15 August 2023


🤔 So what’s the problem?

Debugging applications running inside Kubernetes containers can be a challenging task, especially when the containers have limited capabilities and lack basic debugging tools. When an application is deployed on top of a scratch image, even accessing a shell becomes impossible.

A common way of addressing these issues is to modify the Dockerfile and include the essential binaries in the image, or to deploy a sidecar container alongside the main application. However, both approaches are often unsuitable for production workloads.

Containerisation leverages the power of Linux cgroups and namespaces to isolate resources, thus enabling efficient and secure application execution. In a Kubernetes cluster, those functionalities are managed by the Container Runtime Interface. Gaining access to the nodes provides valuable insight into the container environment, which enables effective troubleshooting. In this blog, I will demonstrate a step-by-step guide on how to access a running Kubernetes Pod by examining its namespaces.

🛠️ Let’s prepare a Kubernetes environment.

💡 I have placed all the required manifests and scripts in a public repository: https://github.com/cloudziu/debugging-scratch.

First, let’s prepare a cluster. I will be using an Azure-managed Kubernetes cluster:

$ ssh-keygen -t rsa -b 4096 -f $HOME/.ssh/lab_rsa;
$ az group create --name adam-playground \
		--location westeurope;
$ az aks create --resource-group adam-playground \
		--name testCluster \
		--node-count 1 \
		--enable-node-public-ip \
		--ssh-key-value ~/.ssh/lab_rsa.pub;

# Get the kubeconfig after the cluster is created
$ az aks get-credentials --name testCluster \
		--resource-group adam-playground

For the sake of this illustration, I will create a single node cluster on which I will deploy an example application.

💡 Additionally, to access the workers over SSH, the node network security group needs to be edited to allow incoming connections on port 22.

Having obtained the kubeconfig, the faulty application can be deployed. After a moment, the pod should be up and running:

### Terminal 1 - local ###

# Deploy faultyapp to the cluster
$ kubectl apply -f https://raw.githubusercontent.com/cloudziu/debugging-scratch/master/k8s-deployment.yaml
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS        AGE
faultyapp-7cc8dfc6d-ln4cl   1/1     Running   0               10s

Now we can jump into the debugging process!

💩 Those logs are cow . . . 

For the purposes of this post, I have created a simple Golang application that is intentionally poorly written. Our goal is to make it log the message It works! :). The Pod definition cannot be changed, and the Pod cannot be restarted. Let’s see if the Pod has something interesting to offer by viewing its logs.

### Terminal 1 - local ###

$ kubectl logs faultyapp-7cc8dfc6d-ln4cl 
Something went wrong...
Something went wrong...
Something went wrong...
Something went wrong...

The logs themselves are not very helpful. All that is known at this point is that “Something went wrong”. I will try some common debugging commands, starting with kubectl exec into the Pod.

### Terminal 1 - local ###

$ kubectl exec -it faultyapp-7cc8dfc6d-ln4cl -- bash
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "b788da9537d59c1afbf24d17eace71fd2b6b4e5ded4157f0390ee79f0988dea0": OCI runtime exec failed: exec failed: unable to start container process: exec: "bash": executable file not found in $PATH: unknown

$ kubectl exec -it faultyapp-7cc8dfc6d-ln4cl -- ls
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "b121694c51371a73be10d0f322d5b55768217401002445c34d9eaf11d45a004a": OCI runtime exec failed: exec failed: unable to start container process: exec: "ls": executable file not found in $PATH: unknown

As you can see, this method will not allow us to do anything. The app is built as a scratch image, and besides our application, it does not contain any additional binaries.

Now the fun stuff begins. Let’s connect to the node and see what we can learn from it!

🕵🏻 Wandering around the node

I will now open a second terminal and connect to the node on which the Pod is running.

### Terminal 2 - local ###

$ ssh azureuser@<node-public-ip> -i ~/.ssh/lab_rsa

In order to be able to investigate, I want to obtain the PID of the application. The easiest option would be to run top or ps aux and try to locate the process manually, but I want to show you a slightly different approach that may come in handy in the future.

The idea is to locate the control group subsystem that is responsible for managing the Faultyapp container. In this example, we are utilising an Azure AKS cluster that is using cgroup v2.

In the first terminal, I will get the resource UID. Kubernetes is using this to create the cgroups for the Pod’s. The only difference is that dashes - are changed to underscores _. In version 2 of control groups, the Kubernetes subsystem is located in /sys/fs/cgroup/kubepods.slice/. I will also need to check the Pod’s qosClass since Kubernetes splits cgroup management based on the QoS class.

### Terminal 1 - local ###

$ kubectl get pod faultyapp-7cc8dfc6d-ln4cl -o jsonpath='{.metadata.uid}' | sed 's/-/_/g'
d658de75_e8f1_4027_a8e0_c67f46d386d7% # Remove the % sign

$ kubectl get pod faultyapp-7cc8dfc6d-ln4cl -o jsonpath='{.status.qosClass}'
Burstable% # Remove the % sign

Knowing the Pod’s UID and QoS Class, I can use the information to get to the correct cgroup directory.
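To make the path construction concrete, here is a hedged sketch in Go of how the slice path is assembled. It assumes the systemd cgroup driver and the cgroup v2 layout described above; the cgroupPath helper is mine, not a Kubernetes API, and the exact layout can differ between distributions and kubelet configurations.

```go
package main

import (
	"fmt"
	"strings"
)

// cgroupPath builds the expected cgroup v2 slice path for a Pod, assuming
// the systemd cgroup driver. Dashes in the UID become underscores, and
// Burstable/BestEffort Pods get an extra per-QoS slice level.
func cgroupPath(uid, qosClass string) string {
	u := strings.ReplaceAll(uid, "-", "_")
	qos := strings.ToLower(qosClass)
	if qos == "guaranteed" {
		// Guaranteed Pods live directly under kubepods.slice.
		return fmt.Sprintf("/sys/fs/cgroup/kubepods.slice/kubepods-pod%s.slice", u)
	}
	return fmt.Sprintf(
		"/sys/fs/cgroup/kubepods.slice/kubepods-%s.slice/kubepods-%s-pod%s.slice",
		qos, qos, u)
}

func main() {
	// The UID and QoS class obtained from kubectl above.
	fmt.Println(cgroupPath("d658de75-e8f1-4027-a8e0-c67f46d386d7", "Burstable"))
}
```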

### Terminal 2 - Worker Node ###

$ ls -l /sys/fs/cgroup/kubepods.slice | grep kubepod
drwxr-xr-x  2 root root 0 Jul 30 15:01 kubepods-besteffort.slice
drwxr-xr-x 15 root root 0 Jul 30 15:07 kubepods-burstable.slice

# Pod's cgroup location (QoS class "burstable" in this example)
$ cd /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod${PODS_UID}.slice/

# List containers inside the Pod
$ ls | grep cri-containerd

We can see two containers within it. One of them is the pause container; you can read more about it here: https://blog.devgenius.io/k8s-pause-container-f7abd1e9b488. The actual application container can be identified from the Pod status field .status.containerStatuses[].containerID. I will go straight to the Faultyapp container. We can now get the process ID directly from the cgroup; it is listed in the cgroup.procs file.

### Terminal 2 - Worker Node ###

# Get the process id
$ cat cri-containerd-ceeeef06afe89c8223d33b11e8d9e0b207118ac4dac3af826687668ee1eed1fe.scope/cgroup.procs
16254

# Validate what is running under the process
$ ps aux | grep 16254
azureus+    16254 0.0  0.1 713972 10476 ?        Ssl  15:04   0:00 ./faultyapp
azureus+   94806  0.0  0.0   7004  2168 pts/0    S+   16:22   0:00 grep --color=auto 16254

Got it! With that, we can try to find out what is going on inside the app. Let’s run strace to get some more insight.

### Terminal 2 - Worker Node ###

$ sudo strace -p 16254 -f
# The app is trying to read a file port.txt
[pid 16269] openat(AT_FDCWD, "port.txt", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 16254] epoll_pwait(5,  <unfinished ...>
# The file does not exist
[pid 16269] <... openat resumed>)       = -1 ENOENT (No such file or directory)
[pid 16254] <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
[pid 16269] write(1, "Something went wrong...\n", 24 <unfinished ...>

After filtering the output, we can see the application is trying to read a text file called port.txt, and a few lines later, there is a message stating ENOENT (No such file or directory). Let’s create that file.

First, I will try to locate the path of the overlay filesystem that Faultyapp is using. To do so, I will inspect /proc/16254, whose ./root subdirectory links to the process-specific root path. Running df -h against it will reveal the mount path of the filesystem. We can access the Pod’s filesystem this way, and create the port.txt file there. Keep in mind that this file will not persist between container restarts! I will make a guess and put the common port number 80 into the file.

### Terminal 2 - Worker Node ###

# Locate the rootfs of the faultyapp process
$ df -h /proc/16254/root/
Filesystem      Size  Used Avail Use% Mounted on
overlay         124G   21G  104G  17% /run/containerd/io.containerd.runtime.v2.task/k8s.io/23132390768e5249c661af6bbe9f89883a2a4972ae3a4f71ad7d6c8a2febe369/rootfs

$ cd /run/containerd/io.containerd.runtime.v2.task/k8s.io/23132390768e5249c661af6bbe9f89883a2a4972ae3a4f71ad7d6c8a2febe369/

$ ls ./rootfs
bin  dev  etc  proc  sys  var

$ ls ./rootfs/bin
faultyapp # Gotcha

# Create the file, and change its ownership
$ echo -n "80" > ./rootfs/bin/port.txt
$ chown 1000:2000 ./rootfs/bin/port.txt 

Let’s see if something happens by checking the Faultyapp logs.

### Terminal 1 - local ###

$ kubectl logs faultyapp-7cc8dfc6d-ln4cl
Something went wrong...
Something went wrong...
Something went wrong...
Something went wrong...
It's not working :(
It's not working :(
It's not working :(

Something clearly changed, but it is not yet the desired state. Once again, I will check strace to gain some more insight.

### Terminal 2 - Worker Node ###

# Filtering out the strace noise leaves us with this output
[pid 16269] connect(4, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("")}, 16 <unfinished ...>
[pid 16269] getsockopt(4, SOL_SOCKET, SO_ERROR,  <unfinished ...>
[pid 16269] write(1, "It's not working :(\n", 20 <unfinished ...>

The application is apparently trying to connect to localhost on the port that was defined in port.txt, so it is worth inspecting its network namespace. I will show you how to access the Pod’s network context and validate that it is the correct one. First, let me grab the Pod’s IP address with kubectl, and then I will use nsenter to run ip addr show against its Linux network namespace.
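Before trusting the nsenter output, it helps to remember how namespace identity works: every process exposes its namespaces as links under /proc/<pid>/ns/, and two processes share a network namespace exactly when those links match. A minimal Go sketch (the netNS helper is hypothetical, not part of faultyapp):

```go
package main

import (
	"fmt"
	"os"
)

// netNS returns the network-namespace identifier of a process, formatted
// like "net:[4026531992]". Two processes are in the same network namespace
// if and only if these identifiers match.
func netNS(pid string) (string, error) {
	return os.Readlink("/proc/" + pid + "/ns/net")
}

func main() {
	self, err := netNS("self")
	if err != nil {
		panic(err)
	}
	// Comparing this with netNS("16254") on the node (as root) would
	// confirm whether we share a netns with the container.
	fmt.Println(self)
}
```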

### Terminal 2 - Worker Node ###

# Look up the PID's network namespace and compare its IP with the Pod's IP
$ kubectl get pods faultyapp-7cc8dfc6d-ln4cl -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
faultyapp-7cc8dfc6d-ln4cl   1/1     Running   0          36m   aks-nodepool1-42544644-vmss000000   <none>           <none>

$ sudo nsenter -t 16254 -n ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 3a:22:20:88:e8:44 brd ff:ff:ff:ff:ff:ff link-netnsid 0
		# IP of the Pod's interface
    inet brd scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3822:20ff:fe88:e844/64 scope link
       valid_lft forever preferred_lft forever

We can use various debugging tools with this method, such as tcpdump, dig, or ss. Using ss, I will verify whether anything is listening on a port inside the Pod’s namespace.

### Terminal 2 - Worker Node ###

# Removed one column for prettier output
$ sudo nsenter -t 16254 -n ss -tulpn
Netid    State    Recv-Q   Send-Q   Local Address:Port   Process
tcp      LISTEN   0        4096     *:8080               users:(("faultyapp",pid=16254,fd=3))

We can see our faultyapp listens on port 8080, not 80. It was a close guess. I will edit port.txt so that it matches the listening port.
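As an aside, if ss were not available on the node, the same information could be scraped from /proc/<pid>/net/tcp, where addresses are hex-encoded ADDR:PORT pairs and state 0A means LISTEN. A hedged Go sketch of the decoding (listeningPorts is my helper, and the sample line is synthetic):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// listeningPorts extracts the local ports of LISTEN (state 0A) sockets
// from /proc/net/tcp-formatted content.
func listeningPorts(tcp string) []int {
	var ports []int
	for _, line := range strings.Split(tcp, "\n")[1:] { // skip the header row
		fields := strings.Fields(line)
		if len(fields) < 4 || fields[3] != "0A" {
			continue
		}
		// fields[1] is the local address, e.g. "00000000:1F90".
		hexPort := strings.Split(fields[1], ":")[1]
		port, err := strconv.ParseInt(hexPort, 16, 32)
		if err != nil {
			continue
		}
		ports = append(ports, int(port))
	}
	return ports
}

func main() {
	sample := `  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12345 1 0000000000000000 100 0 0 10 0`
	fmt.Println(listeningPorts(sample)) // prints "[8080]" (0x1F90 == 8080)
}
```

Reading /proc/16254/net/tcp on the node shows the container's own network namespace, so no nsenter is needed for this particular check.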

### Terminal 2 - Worker Node ###

$ echo -n "8080" > ./rootfs/bin/port.txt
$ cat ./rootfs/bin/port.txt
8080

Let’s review the Pod logs again.

### Terminal 1 - local ###

$ kubectl logs faultyapp-7cc8dfc6d-ln4cl
It's not working :(
It's not working :(
It works! :)

Success! 🎉 We have managed to debug and fix our faulty application. Thank you for staying with me and reading to the end! Embrace the cloud-native, keep exploring, and let containerisation take your applications to new heights! 🐳
