8 posts tagged with "Kubernetes"

Debugging 'Too many open files' in Kubernetes: nofile vs inotify/fsnotify

· 5 min read
Kobbi Gal
I like to pick things apart and see how they work inside

When you see too many open files in a containerized app, it’s tempting to jump straight to ulimit -n. Sometimes that’s correct. But on Linux (especially with Go apps using fsnotify), the error can also be caused by inotify limits—even if your process has a huge file-descriptor limit.

This post is a practical, copy/paste-friendly checklist to debug the problem on a real Kubernetes cluster.

Step 0: Decide which “limit” you’re hitting

There are (at least) three common failure modes that can all look like “too many open files”:

  • Per-process file descriptor limit (classic EMFILE)
    • Think: ulimit -n, /proc/<pid>/limits, or systemd LimitNOFILE.
  • System-wide file table exhaustion (ENFILE)
    • Think: /proc/sys/fs/file-nr approaching /proc/sys/fs/file-max.
  • inotify instance/watch limits (common with fsnotify)
    • Think: fs.inotify.max_user_instances and fs.inotify.max_user_watches.

The rest of the tutorial helps you quickly identify which one applies.

Step 1: Check the real limits of the failing process (not your shell)

Inside the pod:

# Find your process (adjust the pattern for your app)
ps -eo pid,comm,args | grep -E 'myapp|server|gateway' | grep -v grep

# Replace <pid> with the real PID
cat /proc/<pid>/limits | sed -n '/Max open files/p'

# Count currently open FDs
ls /proc/<pid>/fd | wc -l

Why this matters:

  • ulimit -n shows the limit for your current shell, which may be totally different from a process started by an init system (systemd, supervisord, Kubernetes runtime, etc.).
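To see the gap in practice, you can compare your shell's view with the kernel's view for a specific process; here /proc/self stands in for the target PID (a sketch — substitute the real PID found above):

```shell
# What the current shell believes its soft limit is
ulimit -n

# What the kernel records for a specific process (here: this shell itself).
# The two numeric columns are the soft limit, then the hard limit.
awk '/Max open files/ {print "soft=" $4, "hard=" $5}' /proc/self/limits
```

For a daemon started by systemd or a container runtime, these two numbers can differ dramatically, which is exactly why Step 1 reads /proc/&lt;pid&gt;/limits instead of trusting the shell.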

Step 2: Check node-wide file table pressure (ENFILE)

On the Kubernetes node:

cat /proc/sys/fs/file-nr
cat /proc/sys/fs/file-max

file-nr is usually three numbers: allocated, unused, max. If allocated is near max, you have a node-level exhaustion problem that can break unrelated workloads.
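The two files above can be turned into a quick pressure check (a sketch; the 80% threshold is an arbitrary example, not a kernel constant):

```shell
# file-nr fields: allocated, unused, max (max mirrors fs.file-max)
read -r allocated unused max < /proc/sys/fs/file-nr
pct=$(( allocated * 100 / max ))
echo "allocated=$allocated max=$max usage=${pct}%"

# Illustrative threshold; tune it for your environment
if [ "$pct" -ge 80 ]; then
  echo "WARNING: node-wide FD exhaustion (ENFILE) risk"
fi
```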

Step 3: Check inotify limits (the usual fsnotify culprit)

On the node (or inside the container—these are node kernel settings):

cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/inotify/max_user_watches
cat /proc/sys/fs/inotify/max_queued_events

If your app uses Go file watching, fsnotify’s Linux notes are worth skimming: fsnotify README (Linux).

Key concept:

  • max_user_instances is per-UID. In Kubernetes, multiple containers/processes can share the same numeric UID (e.g., a non-root “app user”), which means they share the same inotify instance budget on that node.
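Because the budget is per-UID, it can be useful to total the instances held by everything running as a single UID (a sketch; it counts for your own UID — run it as the app's UID, or adapt the filter):

```shell
uid=$(id -u)
total=0
for pid in /proc/[0-9]*; do
  # The first number on the "Uid:" line is the real UID
  puid=$(awk '/^Uid:/{print $2}' "$pid/status" 2>/dev/null)
  [ "$puid" = "$uid" ] || continue
  for fd in "$pid"/fd/*; do
    [ "$(readlink "$fd" 2>/dev/null)" = "anon_inode:inotify" ] && total=$((total+1))
  done
done
limit=$(cat /proc/sys/fs/inotify/max_user_instances)
echo "uid=$uid inotify_instances_in_use=$total limit=$limit"
```

If `in_use` is close to `limit`, any process under that UID will start failing to create watchers, regardless of its nofile limit.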

Step 4: Count inotify “instances” currently in use (who’s consuming them?)

On Linux, inotify instances show up as file descriptors named anon_inode:inotify.

Run this on the node to see which processes (and UIDs) are holding inotify instances:

for pid in /proc/[0-9]*; do
  p=${pid#/proc/}
  [ -r "$pid/fd" ] || continue
  c=0
  for fd in "$pid"/fd/*; do
    [ "$(readlink "$fd" 2>/dev/null)" = "anon_inode:inotify" ] && c=$((c+1))
  done
  [ "$c" -gt 0 ] || continue
  uid=$(awk '/^Uid:/{print $2}' "$pid/status" 2>/dev/null)
  cmd=$(tr '\0' ' ' < "$pid/cmdline" 2>/dev/null)
  echo "uid=$uid pid=$p inotify_instances=$c $cmd"
done | sort -t= -k4 -nr | head -n 50   # sort by instance count (4th '='-separated field)

What to look for:

  • A single UID with inotify_instances totals near max_user_instances
  • Your app process holding many anon_inode:inotify FDs
  • A node “agent” (log collector, metrics sidecar, etc.) consuming a lot
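If that listing is long, an awk one-liner can aggregate it per UID (a sketch; it assumes the `uid=… pid=… inotify_instances=…` output format from the loop above — sample input is shown via printf for illustration):

```shell
# Pipe the Step 4 output through the awk command to get per-UID totals
printf '%s\n' \
  'uid=1000 pid=42 inotify_instances=3 /usr/bin/myapp' \
  'uid=1000 pid=77 inotify_instances=2 /usr/bin/agent' \
  'uid=0 pid=9 inotify_instances=1 /sbin/init' |
awk -F'[= ]' '{sum[$2] += $6} END {for (u in sum) print "uid=" u " total_instances=" sum[u]}'
```

Compare each UID's total against `fs.inotify.max_user_instances` to find the budget that is actually exhausted.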

Step 5: (Optional) Count how many watches each inotify FD holds

If you suspect “too many watches” (not instances), you can inspect /proc/<pid>/fdinfo/*:

pid=<pid>
for fd in /proc/$pid/fd/*; do
  [ "$(readlink "$fd" 2>/dev/null)" = "anon_inode:inotify" ] || continue
  n=$(grep -c 'inotify wd' "/proc/$pid/fdinfo/${fd##*/}" 2>/dev/null || true)
  echo "pid=$pid fd=${fd##*/} watches=$n"
done | sort -t= -k4 -nr | head -n 20   # sort by watch count (4th '='-separated field)

Step 6: “How do I kill the file watcher?”

You generally can’t “kill a watch” directly. The kernel releases it when the owning process closes the FD.

Practical actions:

  • Restart the pod / process that owns the inotify instances.
  • If a node is saturated, cordon/drain and reschedule to another node as a temporary workaround.
  • Fix the root cause by raising limits and/or reducing watcher usage.

Step 7: Mitigation—raise inotify instance limits (and make it persistent)

If you’ve confirmed max_user_instances is the bottleneck, increasing it is often the quickest fix.

Temporary (until reboot):

sysctl -w fs.inotify.max_user_instances=1024

Persistent:

  • Add a sysctl config file on the node (exact location varies by distro), for example:
    • /etc/sysctl.d/99-inotify.conf
  • Then reload sysctls (varies by environment), commonly:
    • sysctl --system
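As an example, the drop-in file might contain the following (illustrative values; the watch limit is included because the two often need raising together):

```
# /etc/sysctl.d/99-inotify.conf
fs.inotify.max_user_instances = 1024
fs.inotify.max_user_watches = 524288
```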

In managed Kubernetes, you may prefer:

  • baking sysctl settings into your node image/bootstrap
  • setting them via a privileged DaemonSet (policy-dependent)

Step 8: Don’t forget nofile (it’s still real)

Even if inotify was the cause this time, it’s worth capturing nofile facts for your runbook:

  • Systemd defaults:
systemctl show --property DefaultLimitNOFILE
systemctl show --property DefaultLimitNOFILESoft
  • Unit-specific limits (examples):
systemctl show kubelet --property LimitNOFILE
systemctl show containerd --property LimitNOFILE

And always verify the actual process limit via /proc/<pid>/limits (Step 1).

Summary: the fastest “root cause” loop

  1. Check the failing process’s actual Max open files and FD count.
  2. Check node file-nr/file-max for system-wide FD exhaustion.
  3. Check inotify sysctls and enumerate anon_inode:inotify to find the UID/process consuming instances.
  4. Apply the smallest safe mitigation (restart offender, reschedule, raise sysctl) and confirm the error disappears.
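The whole loop can be sketched as one script (here `pid=$$` is a stand-in; substitute the failing process's PID):

```shell
pid=$$   # replace with the failing process's PID

# 1) Per-process limit and current FD count
awk '/Max open files/ {print "nofile soft/hard:", $4, $5}' "/proc/$pid/limits"
echo "open fds: $(ls "/proc/$pid/fd" | wc -l)"

# 2) Node-wide file table
echo "file-nr (allocated unused max): $(cat /proc/sys/fs/file-nr)"

# 3) inotify sysctls
echo "max_user_instances: $(cat /proc/sys/fs/inotify/max_user_instances)"
echo "max_user_watches:   $(cat /proc/sys/fs/inotify/max_user_watches)"
```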

How To Deploy Application with Azure Workload Identity

· 3 min read
Kobbi Gal (Akeyless)
Escalations Engineer at Akeyless

This tutorial shows how to deploy an application in Kubernetes that authenticates using Azure Workload Identity on Azure Kubernetes Service (AKS).

Prerequisites

  • Access to the Azure CLI and an Azure account.
  • kubectl installed and access to the AKS cluster.
  • helm installed.

See links for more information about Azure Identity and AKS.

Enable OIDC on AKS

  1. Check if the OIDC issuer is enabled in the AKS cluster. Enable it if it's not.

(Optional) Enable Workload Identity plugin

az aks update \
  --resource-group "$AZURE_RESOURCE_GROUP" \
  --name "$AKS_CLUSTER_NAME" \
  --enable-workload-identity

This will deploy a Deployment named azure-wi-webhook-controller-manager in the kube-system namespace:

❯ kubectl get deploy -n kube-system
NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
azure-wi-webhook-controller-manager   2/2     2            2           48d

This step is optional since we can explicitly specify the application that will use Azure Workload Identity to mount the Azure token as a volume. More on that in a bit.

Create User Assigned Managed Identity for Application

# Replace with your preferred names and location
IDENTITY_NAME="app-wi"
IDENTITY_RG="$AZURE_RESOURCE_GROUP"
LOCATION="${AZURE_LOCATION:-eastus}"

az identity create --resource-group "$IDENTITY_RG" --name "$IDENTITY_NAME" --location "$LOCATION"
CLIENT_ID=$(az identity show --resource-group "$IDENTITY_RG" --name "$IDENTITY_NAME" --query clientId -o tsv)
PRINCIPAL_ID=$(az identity show --resource-group "$IDENTITY_RG" --name "$IDENTITY_NAME" --query principalId -o tsv)
TENANT_ID=$(az account show --query tenantId -o tsv)
OIDC_ISSUER=$(az aks show --resource-group "$AZURE_RESOURCE_GROUP" --name "$AKS_CLUSTER_NAME" --query "oidcIssuerProfile.issuerUrl" -o tsv)

Create a Federated Credential

# namespace and service account name that your test app will use
NAMESPACE="default"
SA_NAME="app-wi-sa"

az identity federated-credential create \
  --resource-group "$IDENTITY_RG" \
  --name "${IDENTITY_NAME}-fc" \
  --identity-name "$IDENTITY_NAME" \
  --issuer "$OIDC_ISSUER" \
  --subject "system:serviceaccount:${NAMESPACE}:${SA_NAME}"

Install Azure Workload Identity Webhook

This webhook is what injects AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_FEDERATED_TOKEN_FILE, and the projected token volume into pods that carry the label. See Service Principal for more info on those environment variables.

helm repo add azure-workload-identity https://azure.github.io/azure-workload-identity/charts
helm repo update
kubectl create namespace azure-workload-identity-system 2>/dev/null || true
helm upgrade --install workload-identity-webhook azure-workload-identity/workload-identity-webhook \
  --namespace azure-workload-identity-system \
  --set azureTenantId="$TENANT_ID"

Create a Kubernetes ServiceAccount

Here is where the link between Kubernetes and Azure Workload Identity happens:

kubectl create namespace "$NAMESPACE" 2>/dev/null || true
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: $SA_NAME
  namespace: $NAMESPACE
  annotations:
    azure.workload.identity/client-id: "$CLIENT_ID"
EOF

As we can see, we annotate the ServiceAccount with azure.workload.identity/client-id: "$CLIENT_ID".

Deploy Application with Workload Identity

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-wid
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-wid
  template:
    metadata:
      labels:
        app: hello-wid
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: $SA_NAME
      containers:
      - name: alpine
        image: alpine
        command:
        - "sh"
        - "-c"
        - "echo 'Workload Identity tutorial done! Sleeping...' && sleep 10000"
EOF

The main things we're doing here are:

  1. We set the application Deployment to use the ServiceAccount we created in the previous step, which is linked to an Azure Workload Identity.
  2. We opt the Deployment's Pods into Azure Workload Identity by setting the label azure.workload.identity/use: "true" on the Pod template.

How to Deploy Kubernetes Services using Gateway API/AWS Load Balancer Controller

· 9 min read
Kobbi Gal (Akeyless)

This tutorial contains a working example of exposing TCP services (LDAP/LDAPS + SSH) from a single-node k3s cluster running on an EC2 instance, using:

  • Kubernetes Gateway API
  • AWS Load Balancer Controller (LBC) for:
    • NLB (L4) via TCPRoute
    • ALB (L7) via HTTPRoute/GRPCRoute (example file included)

The key implementation detail for k3s-on-EC2 with the default overlay networking (flannel): use instance targets + NodePorts for L4 routes. ClusterIP + pod IP targets won’t work unless pods are VPC-routable (AWS VPC CNI).

Installing PiHole On Raspberry Pi 4, MicroK8s running Ubuntu 20.04 (focal)

· 17 min read
Kobbi Gal

PiHole, What’s That?

The Wikipedia definition should be sufficient in explaining what the software does:

Pi-hole or Pihole is a Linux network-level advertisement and Internet tracker blocking application which acts as a DNS sinkhole and optionally a DHCP server, intended for use on a private network

I wanted to deploy it for a few reasons:

  • I have a spare Raspberry Pi 4 lying around.
  • I’m working on getting my CKAD (Certified Kubernetes Application Developer) certification and thought it would be great hands-on practice.
  • I couldn’t find a good enough article describing how to install PiHole on Kubernetes. Most did not go through the whole procedure, or were aimed at Docker/Swarm and Raspbian (the Raspberry Pi flavored Linux distribution).
  • I got tired of all the advertisements and popups on all the devices while surfing the web at home.

This post explains how I was able to deploy PiHole on Kubernetes and how I resolved some of the problems that occurred during the deployment process.

Debugging NodeJS Microservice with Shared Storage on Kubernetes

· 7 min read
Kobbi Gal

Introduction

One of our largest customers recently had a problem loading a list of resources from our web application. The problem was a blocker for the customer, and I needed to identify the root cause and provide a workaround, if possible. I was assigned the task as the SME in this area (NodeJS microservices and infrastructure such as storage, microservice messaging, and configuration).

Fixing Production Down caused by MongoDB Corruption and Heketi/GlusterFS Failed Provisioning

· 11 min read
Kobbi Gal

Introduction

Today I received an escalation from one of our largest and most strategic customers. Over the weekend, the customer had ‘patched’ their 3 Ubuntu 18.04 nodes running Kubernetes 1.17. They were using glusterfs as their shared storage class.