Reading /proc for Debugging
It is 2 AM. A payment service in production is "stuck." CPU is flat, memory is flat, no errors in the logs, latency has climbed to 30 seconds per request, and the on-call engineer has already tried
kubectl rollout restart. You are the next escalation. There is no APM. There is no strace installed in the container. There is no gdb. There is only a bash shell, the coreutils from busybox, and the running process. This is the situation /proc was made for.
Every running process on a Linux box has a directory under /proc containing its complete live state — what files it has open, what memory it has mapped, what syscall it is blocked in, what its environment variables are, what cgroups it belongs to, how many context switches it has done since it started. All of it is cat-able from a minimal shell. A senior engineer who knows /proc can diagnose a hung process in under five minutes with no special tools installed. This lesson is how.
The Shape of /proc
/proc is a pseudo-filesystem: its contents are generated on the fly by the kernel when you read them. It has two kinds of entries:
- Per-process directories (/proc/[pid] and /proc/self) — one per running process.
- System-wide files (/proc/cpuinfo, /proc/meminfo, /proc/mounts, and so on).
# Top-level structure
ls /proc | head -20
# 1 <- PID 1 (systemd)
# 2 <- PID 2 (kthreadd)
# 100 <- some other process
# cpuinfo
# meminfo
# mounts
# loadavg
# version
# sys <- /proc/sys (kernel tunables)
# ...
# /proc/self is a magic symlink that points to your own pid
ls -l /proc/self
# lrwxrwxrwx 1 admin admin 0 Apr 19 10:00 /proc/self -> 45678
# So you can inspect yourself
cat /proc/self/status | head -5
# Name: cat
# State: R (running)
# Tgid: 45678
# Ngid: 0
# Pid: 45678
Everything under /proc/[pid] is either a regular-looking file (readable with cat) or a symlink to something real (an open file, a CWD, an executable).
Reading from /proc/[pid]/* is the fastest way to answer "what is this process actually doing right now?" — faster than restarting it, faster than attaching a debugger, faster than adding logs and redeploying. You will never have less information than /proc gives you, and most of the time you will have everything you need.
The Map of /proc/[pid]
Here is the full per-process directory, with the most useful entries annotated. The ones marked ⭐ are the ones you will reach for constantly.
ls /proc/$PID/
# attr         autogroup   cgroup ⭐    clear_refs  cmdline ⭐  comm       cwd        environ ⭐
# exe ⭐       fd ⭐        fdinfo      io          limits ⭐   loginuid   maps ⭐    mem
# mountinfo    mounts      net         ns          numa_maps   oom_adj    oom_score  pagemap
# personality  root        sched       schedstat   smaps ⭐    stack      stat       statm
# status ⭐    syscall ⭐   task ⭐     timers      wchan ⭐    ...
| File | What it tells you | Typical use |
|---|---|---|
| cmdline | Command line args, NUL-separated | "What was this process started with?" |
| comm | Short name (15 chars) | Quick identification |
| status | The Name:, State:, Pid:, Uid:, Vm*:, Sig*:, Cpus_allowed: fields and more | The all-in-one overview |
| stat | Same info as status, machine-parseable and space-separated | Scripts |
| cwd | Symlink to current working directory | "What dir is this cd'd into?" |
| exe | Symlink to the binary | "What is actually running — even if the binary was deleted?" |
| environ | Environment variables, NUL-separated | "What env did it inherit?" |
| fd/ | Directory of open file descriptors (symlinks) | Open files, sockets, pipes |
| fdinfo/ | One file per fd with position, flags, etc. | Fine-grained fd state |
| maps | Memory regions (virtual address ranges) | "What's in this process's address space?" |
| smaps | Per-region RSS/PSS/Swap/etc. | Real memory accounting |
| limits | rlimits (ulimit) currently in effect | "Is this process limited to 1024 fds?" |
| cgroup | Which cgroups the process is in | Container and systemd membership |
| ns/ | Symlinks to namespace inodes | Which namespaces the process is in |
| wchan | Kernel function the process is blocked in | "Why is it not running?" |
| syscall | Current syscall number and args | "What syscall is it stuck in?" |
| stack | Kernel stack trace (root only) | "Where exactly in the kernel?" |
| sched | Scheduler stats | Context switches, wait time |
| io | Cumulative read/write bytes | "Is this process hammering the disk?" |
| task/ | One subdirectory per thread | Per-thread state of a multi-threaded process |
| mem | Live memory of the process (root only) | Dump memory with gdb or custom tools |
| mountinfo | Every mount visible to this process | Differs per mount namespace — important for containers |
Most of these are readable by the owning user; a few require root. You do not need to memorize the list — just remember that ls /proc/[pid] is your starting point and every file there is cat-able.
The Hung Process Playbook
Here is the exact sequence to run when a process is hung. Five minutes, no extra tools.
1. What is it, and when did it start?
# What is the process?
cat /proc/$PID/comm
# python3
# Full command line
cat /proc/$PID/cmdline | tr '\0' ' '; echo
# /usr/bin/python3 /app/payment_service.py --config /etc/app/prod.yaml
# When did it start?
ls -ld /proc/$PID
# dr-xr-xr-x 9 app app 0 Apr 19 02:14 /proc/45678 <- started at 02:14
# (note: "0" size is normal for /proc)
# Or read it directly from stat (field 22 is the start time in clock ticks since boot)
awk '{print $22}' /proc/$PID/stat
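That raw tick count is not very human-readable. A small sketch to convert it into "seconds ago" (assumes awk and getconf are available; note that parsing stat positionally is only safe when the comm field contains no spaces):

```shell
# Field 22 of /proc/[pid]/stat is the start time in clock ticks since boot.
# Combine it with /proc/uptime and CLK_TCK to get a relative start time.
pid=$$                                    # inspecting ourselves as a demo
start_ticks=$(awk '{print $22}' /proc/$pid/stat)
hz=$(getconf CLK_TCK)                     # ticks per second, usually 100
uptime_s=$(awk '{print $1}' /proc/uptime)
awk -v up="$uptime_s" -v st="$start_ticks" -v hz="$hz" \
  'BEGIN { printf "started %.1f s after boot, %.1f s ago\n", st/hz, up - st/hz }'
```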
2. What state is it in?
cat /proc/$PID/status | head -20
# Name: python3
# State: S (sleeping) <- not running — blocked somewhere
# Tgid: 45678
# Pid: 45678
# PPid: 1
# Uid: 1000 1000 1000 1000
# Gid: 1000 1000 1000 1000
# FDSize: 128
# Groups: 1000
# VmPeak: 1258432 kB
# VmSize: 1258432 kB
# VmRSS: 483128 kB
# Threads: 34 <- multi-threaded
# SigQ: 0/31389
# SigPnd: 0000000000000000
# SigBlk: 0000000000000000
# SigIgn: 0000000000001000
# SigCgt: 0000000180004a07 <- has handlers installed
States and what they mean:
- R (running) — on a CPU or in the runqueue
- S (sleeping) — waiting on something, wakeable by signals
- D (disk sleep) — waiting on I/O, not killable
- T (stopped) — received SIGSTOP or SIGTSTP
- Z (zombie) — exited, parent has not reaped it
- I (idle) — kernel thread idling
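The state mix across the whole box is one pipeline away. A sketch (field 3 of each stat is the state letter; the field position assumes comm contains no spaces, which holds for almost all processes):

```shell
# Tally process states system-wide.
# A pile of D states usually points at a storage or NFS problem.
cat /proc/[0-9]*/stat 2>/dev/null | awk '{print $3}' | sort | uniq -c | sort -rn
```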
3. If it is blocked, what on?
# What kernel function is it blocked in?
cat /proc/$PID/wchan
# futex_wait_queue_me <- waiting on a mutex/condvar
# do_epoll_wait <- idle event loop
# sk_wait_data <- waiting on socket data
# do_nanosleep <- sleeping
# What syscall is currently in progress?
cat /proc/$PID/syscall
# 202 0x7f1b... 0x80 0x0 0x0 0x0 ...
# ^^ syscall number. Look it up with ausyscall or the table
ausyscall 202
# futex
# So this process is stuck inside a futex call — almost always a lock
wchan + syscall together tell you exactly why a process is not running. futex is a lock. epoll_wait is an event loop idling (fine!). read or recvfrom is waiting on I/O. nanosleep is voluntarily sleeping. For a "stuck" process, a wchan of futex_wait_queue_me means some other thread holds a lock and never released it — now you need to find that thread.
4. Look at every thread
# Show each thread's state and wchan
for tid in $(ls /proc/$PID/task); do
state=$(awk '/^State:/{print $2$3}' /proc/$PID/task/$tid/status)
wchan=$(cat /proc/$PID/task/$tid/wchan)
echo "$tid $state $wchan"
done
# 45678 S(sleeping) futex_wait_queue_me
# 45679 S(sleeping) futex_wait_queue_me
# 45680 R(running) 0
# 45681 S(sleeping) futex_wait_queue_me
# ...
# htop shows the same with 'H' to toggle thread view
If 33 threads are all stuck in futex and 1 thread is running hot, that one thread is holding the lock everyone else wants. That is the thread to profile.
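Finding that hot thread from the shell is a sort over CPU time. A sketch (fields 14 and 15 of each task's stat are utime and stime in clock ticks; the field numbers assume the thread's comm contains no spaces):

```shell
pid=$$   # substitute the hung process's PID
# Print "cputicks tid" for every thread, highest CPU time first
for stat in /proc/$pid/task/*/stat; do
  awk '{print $14 + $15, $1}' "$stat"
done | sort -rn | head -3
# The top line is the thread burning CPU while everyone else waits in futex
```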
5. What files and sockets does it have open?
# Every open file descriptor
ls -l /proc/$PID/fd | head -15
# lr-x------ 1 app app 64 Apr 19 02:14 0 -> /dev/null
# l-wx------ 1 app app 64 Apr 19 02:14 1 -> pipe:[91283]
# l-wx------ 1 app app 64 Apr 19 02:14 2 -> pipe:[91284]
# lrwx------ 1 app app 64 Apr 19 02:14 3 -> socket:[123456]
# lrwx------ 1 app app 64 Apr 19 02:14 4 -> anon_inode:[eventpoll]
# lr-x------ 1 app app 64 Apr 19 02:14 5 -> /etc/app/prod.yaml
# lrwx------ 1 app app 64 Apr 19 02:14 6 -> socket:[123460]
# ...
# How many fds open?
ls /proc/$PID/fd | wc -l
# 217
# Compare to the limit
grep 'Max open files' /proc/$PID/limits
# Max open files 1024 4096
# ^ soft ^ hard limit
# Which socket is fd 3?
sudo ss -p | grep "$PID,fd=3"
# tcp ESTAB 0 0 10.0.1.5:50123 10.0.99.1:5432 users:(("python3",pid=45678,fd=3))
# So fd 3 is a TCP connection to 10.0.99.1:5432 — the database
A service mysteriously hung after running for exactly 7 days. /proc/[pid]/fd showed 1024 open file descriptors — exactly the soft rlimit. Every fd pointed at a socket to the same external API. A retry loop was leaking one TCP connection per network hiccup and never closing them. No error message, no log line — the process just silently stopped accepting work once accept() started returning EMFILE. ls /proc/[pid]/fd | wc -l and grep 'Max open files' /proc/[pid]/limits took 10 seconds to find. The fix was a with block around the HTTP client. Since then, every production service gets an alert when its open fd count crosses 50% of its limit.
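That alert is a few lines of shell. A minimal sketch (the 50% threshold and the WARN wording are illustrative, not from any standard tool; it also assumes the soft limit is numeric, though it can be "unlimited"):

```shell
pid=$$   # the process to watch
fd_count=$(ls /proc/$pid/fd 2>/dev/null | wc -l)
# Field 4 of the "Max open files" line in limits is the soft limit
soft_limit=$(awk '/^Max open files/ {print $4}' /proc/$pid/limits)
if [ "$fd_count" -gt $((soft_limit / 2)) ]; then
  echo "WARN: pid $pid has $fd_count of $soft_limit fds open"
fi
```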
6. What is in memory?
# High-level layout
cat /proc/$PID/maps | head -10
# 556a... r--p ... /usr/bin/python3.11 <- text
# 556a... r-xp ... /usr/bin/python3.11
# 556a... rw-p ... [heap] <- where malloc grows
# 7f... rw-p ... [stack:45679] <- thread 45679's stack
# 7f... r-xp ... /usr/lib/.../libssl.so.3
# 7f... rw-p ... <- anonymous, probably malloc/mmap
# ...
# Detailed per-region memory accounting
head -20 /proc/$PID/smaps
# 556a... r--p ... /usr/bin/python3.11
# Size: 4 kB
# Rss: 4 kB <- pages in RAM
# Pss: 1 kB <- proportional (shared/4)
# Shared_Clean: 4 kB
# Shared_Dirty: 0 kB
# Private_Clean: 0 kB
# Private_Dirty: 0 kB
# Referenced: 4 kB
# Anonymous: 0 kB
# ...
# Total PSS across the whole process
awk '/^Pss:/ {sum+=$2} END {print sum" kB"}' /proc/$PID/smaps
# 512340 kB <- actual memory cost
# Detecting swap usage
awk '/^Swap:/ {sum+=$2} END {print sum" kB"}' /proc/$PID/smaps
7. I/O and scheduler stats
# Cumulative I/O
cat /proc/$PID/io
# rchar: 458092340 <- bytes read (including from page cache)
# wchar: 891234 <- bytes written
# syscr: 5821 <- read/write syscall counts
# syscw: 203
# read_bytes: 16384 <- actual bytes from disk
# write_bytes: 0
# cancelled_write_bytes: 0
# Scheduler stats
head -20 /proc/$PID/sched
# python3 (45678, #threads: 34)
# ...
# se.sum_exec_runtime : 1842.193421
# se.nr_migrations : 2
# nr_voluntary_switches : 2847 <- yielded voluntarily (I/O)
# nr_involuntary_switches : 193 <- preempted by scheduler
8. Which cgroup, which namespaces?
# cgroup membership — on cgroup v2 it's one line
cat /proc/$PID/cgroup
# 0::/system.slice/docker-abc123.scope
# (on v1 you get one line per subsystem: memory, cpu, etc.)
# Namespace membership — each is a symlink with a unique inode
ls -l /proc/$PID/ns/
# lrwxrwxrwx 1 app app 0 ... cgroup -> 'cgroup:[4026531835]'
# lrwxrwxrwx 1 app app 0 ... ipc -> 'ipc:[4026532152]'
# lrwxrwxrwx 1 app app 0 ... mnt -> 'mnt:[4026532150]'
# lrwxrwxrwx 1 app app 0 ... net -> 'net:[4026532155]'
# lrwxrwxrwx 1 app app 0 ... pid -> 'pid:[4026532153]'
# lrwxrwxrwx 1 app app 0 ... user -> 'user:[4026531837]'
# lrwxrwxrwx 1 app app 0 ... uts -> 'uts:[4026532151]'
# Two processes with the same inode for pid:[...] are in the same PID namespace
# (meaning, usually, the same container)
This matters: if two processes both show pid -> 'pid:[4026532153]', they are in the same PID namespace, which in practice usually means the same container. A different inode means a different namespace, which usually means a different container (or the host).
We cover namespaces in depth in Module 5; for now, just know /proc/[pid]/ns/ is where you check.
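Comparing two processes is just comparing symlink targets. A sketch (the helper name same_pid_ns is made up; note readlink on another user's ns/ links can fail without root, which the fallback branch treats as "different"):

```shell
# True when two PIDs share a PID namespace (usually: the same container)
same_pid_ns() {
  [ "$(readlink /proc/$1/ns/pid)" = "$(readlink /proc/$2/ns/pid)" ]
}

same_pid_ns $$ 1 \
  && echo "same PID namespace as init" \
  || echo "different PID namespace (or not permitted to read it)"
```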
System-Wide /proc Files You Will Use
Not under a PID, but equally useful for production debugging.
# CPU info — useful for sizing, NUMA, and feature detection
cat /proc/cpuinfo | grep -E '^(processor|model name|cache size|cpu MHz)' | head
# Memory — used/free/cached/buffers at a glance
cat /proc/meminfo | head -10
# MemTotal: 32893400 kB
# MemFree: 1823440 kB
# MemAvailable: 18435212 kB
# Buffers: 120456 kB
# Cached: 14280432 kB
# ...
# Load average and process counts
cat /proc/loadavg
# 0.52 0.68 0.71 2/485 48923
#                | |    +-- last PID created
#                | +-- total processes
#                +-- currently runnable
# (the first three numbers are the 1-, 5-, and 15-minute load averages)
# All currently mounted filesystems (from the kernel's point of view)
cat /proc/mounts | head -5
# Kernel version and build info
cat /proc/version
# Interrupts per CPU
cat /proc/interrupts | head
# TCP stats — RCV buffers, active connections, etc.
cat /proc/net/tcp | head
/proc/sys/ is the control surface for kernel tunables — it is both readable (current value) and writable (change it). echo 1 > /proc/sys/net/ipv4/ip_forward turns on routing. cat /proc/sys/kernel/pid_max tells you the max PID. Almost every "tune the kernel" recipe you will find on the internet is really "write a value into a file under /proc/sys". The sysctl command is a nicer frontend for the same files.
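The read side needs no privileges; a quick sketch of both spellings (the write examples are left as comments because they need root and change live kernel state):

```shell
# Same tunable, two interfaces: slashes for the file, dots for sysctl
cat /proc/sys/kernel/pid_max
command -v sysctl >/dev/null && sysctl kernel.pid_max

# To change it (root only, and lost on reboot):
#   echo 4194304 | sudo tee /proc/sys/kernel/pid_max
#   sudo sysctl -w kernel.pid_max=4194304
```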
Recovering a Deleted Binary or Log
A classic /proc trick: if a process has a file open and someone deletes the file on disk, you can still recover the content through /proc/[pid]/fd.
# Imagine: nginx is running, someone ran "rm /var/log/nginx/access.log"
# Nginx still holds fd 5 open pointing at the deleted file
ls -l /proc/$(pgrep -f 'nginx: master')/fd | grep deleted
# l-wx------ 1 root root 64 Apr 19 10:00 5 -> /var/log/nginx/access.log (deleted)
# Recover the content while nginx is still running
cp /proc/$(pgrep -f 'nginx: master')/fd/5 /tmp/recovered-access.log
# Now /tmp/recovered-access.log has everything nginx has written so far
# Same trick for a deleted binary
cp /proc/$PID/exe /tmp/recovered-binary
This is one of the most useful /proc tricks, and it is rarely mentioned. Worth filing away.
Key Concepts Summary
- /proc is the kernel's live UI. Reading a file under /proc runs kernel code that generates fresh state — nothing is cached on disk.
- Every running process has a /proc/[pid] directory. /proc/self is a magic symlink to your own.
- The "debug a hung process" checklist is cat-only: cmdline → status → wchan → syscall → fd/ → limits → maps/smaps.
- Process state decodes the behavior. R/S/D/T/Z each mean something specific; D is the only one that cannot be killed.
- /proc/[pid]/task/ lists every thread. Per-thread wchan and status tell you which thread is holding the lock.
- /proc/[pid]/fd plus ss -p tells you every network connection. lsof is a nicer interface over the same data.
- /proc/[pid]/cgroup and /proc/[pid]/ns/ tell you which container and which namespaces the process belongs to.
- /proc/[pid]/fd/[N] can recover deleted files as long as the process is still holding the fd open.
- System-wide files (cpuinfo, meminfo, loadavg, mounts, sys/) cover the rest of the host.
Common Mistakes
- Looking at VmSize and panicking — that is virtual address space, most of which may never be touched. VmRSS is what is in RAM.
- Reading /proc/[pid]/status as "the process's total state" and ignoring /proc/[pid]/task/*/status. A multi-threaded process's interesting state is usually per-thread.
- Treating /proc/[pid]/stat as user-friendly. It is not — it is machine-parseable. For humans, /proc/[pid]/status has labels.
- Writing to /proc/sys/ files without knowing whether the change persists. It does not — a reboot loses it. Use /etc/sysctl.conf or /etc/sysctl.d/*.conf for persistence.
- Forgetting that /proc/[pid] disappears instantly when the process exits — scripts reading it must handle "directory vanished" as a normal case.
- Using ps (which reads /proc under the hood) and then ignoring /proc itself when ps does not show what you need. Half the columns ps could show are just there in status waiting for you.
- Assuming /proc/[pid]/cmdline is always populated. For kernel threads and some short-lived helpers it is empty; fall back to comm.
- Reading /proc/mounts to see "my mounts" from inside a container — you will see the container's mount namespace view, which is often different from the host's. That is a feature, not a bug.
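Handling the vanished-directory race is one fallback branch. A sketch (the helper name read_proc is made up):

```shell
# A PID can exit between your check and your read; treat a failed
# open of /proc/[pid]/* as a normal outcome, not an error.
read_proc() {
  cat "/proc/$1/comm" 2>/dev/null || echo "(exited)"
}

sh -c 'exit 0' &   # start a process that exits immediately
pid=$!
wait $pid          # after reaping, /proc/$pid is gone
read_proc $pid
# (exited)
```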
A Python service is hung. You cat /proc/$PID/wchan and see `futex_wait_queue_me`. You check each thread under /proc/$PID/task/*/wchan and find 33 threads in `futex_wait_queue_me` and exactly one thread with `wchan` value `0` (running). What is the most likely situation and what is your next step?