Linux Fundamentals for Engineers

Everything Is a File (Really)

"Everything is a file" is the slogan every Linux beginner hears and then half-forgets. It sounds like marketing. It is not. It is the single design decision that lets you pipe the output of ps into grep, redirect stderr to a network socket, read a GPU's temperature with cat, check a process's open files with ls, capture an audio device with dd, and write to a hard drive with the same syscalls you use to write to a text file. Every one of those is the same mechanism underneath. If "everything is a file" were only half-true, Linux would be a pile of unrelated APIs. Because it is mostly true, Linux is a coherent system, and you can reason about parts of it you have never used by analogy to parts you use every day.

What "File" Actually Means Here

Forget the desktop-computer definition of a file (a document on disk). In Linux, a file is anything that:

has a path somewhere in the filesystem tree (or can be referenced by a file descriptor),
can be opened with open() to get a file descriptor back,
supports some subset of read(), write(), close(), ioctl(), mmap(), lseek(), poll().

That is it. The definition is about the API, not about what is behind it. The thing behind the file can be:

bytes on a disk,
an LED on the motherboard,
the CPU temperature sensor,
a TCP connection,
a process's memory map,
a kernel event queue,
a fake stream of zeros (/dev/zero),
a character device a driver invents out of thin air.

As long as something exposes read()/write()/etc., Linux calls it a file and your programs can talk to it the same way.
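
To make that concrete, here is a minimal sketch that points one tool (head) at three unrelated backends. The variable names and the scratch file are illustrative only, and it assumes a Linux system with /proc mounted.

```shell
# Three unrelated backends, one tool, one API.
tmpfile=$(mktemp)                               # scratch file; mktemp picks the path
printf 'hello\n' > "$tmpfile"

from_disk=$(head -c 5 "$tmpfile")               # a regular file on a storage backend
from_pipe=$(printf 'hello\n' | head -c 5)       # an anonymous pipe, no path at all
from_kernel=$(head -c 4 /proc/version)          # text the kernel generates when read

echo "regular file: $from_disk"
echo "pipe:         $from_pipe"
echo "procfs:       $from_kernel"
rm -f "$tmpfile"
```

head never learns which of the three it is talking to. It calls read() on a file descriptor and the kernel does the rest.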

KEY CONCEPT

"Everything is a file" is really "everything uses the file API." The kernel maintains one giant switchboard (the Virtual File System, VFS) that routes your read() and write() calls to whatever driver or subsystem actually implements them. This is why Unix tools like cat, tee, dd, and grep, along with the shell's pipe operator, work on things their original authors never imagined: they only know about the file API, and the kernel fills in the rest.

The VFS: One Switchboard, Many Backends

When your program calls read(fd, buf, 4096), it does not matter whether fd points at an ext4 file, an NFS mount, a USB device, or a TCP socket. The kernel receives the syscall, looks at the file descriptor, finds the file operations table associated with it, and calls the appropriate read function for that backend.

Each filesystem or driver plugs in its own implementation of read, write, open, mmap, etc. Your code never has to know. tail -f /var/log/syslog and tail -f /proc/mounts and tail -f /sys/class/net/eth0/statistics/rx_bytes all use the same syscalls; the VFS takes care of routing each one to the right backend.

This is also why you can bind mount, overlay, chroot, and containerize parts of the filesystem: the VFS is a namespace you can rearrange.
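
You can ask the kernel which backend sits behind a path. The sketch below assumes GNU stat (where -f reports on the filesystem and %T prints its type name); the list of paths is only an example, and the type names you see will vary by distro and container runtime.

```shell
# Ask the kernel which filesystem type backs each path.
for path in / /proc /sys /dev /tmp; do
    fstype=$(stat -f -c '%T' "$path" 2>/dev/null) || fstype="unavailable"
    printf '%-6s -> %s\n' "$path" "$fstype"
done

# Keep one result around for later inspection.
proc_fstype=$(stat -f -c '%T' /proc 2>/dev/null) || proc_fstype="unavailable"
```

Several different filesystem types under one tree, all reached through the same open() and read(), is the switchboard at work.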

The Seven Things That Look Like Files

Every entry in the filesystem has a type. ls -l tells you which type with the first character of each line:

Character	Type	What it is	Example
`-`	Regular file	Bytes on a storage backend	`/etc/hostname`, `/usr/bin/ls`
`d`	Directory	A container for other entries	`/etc`, `/home`
`l`	Symlink	A pointer to another path	`/etc/localtime`, `/bin` on many distros
`c`	Character device	A byte stream (sequential reads and writes, no fixed block size)	`/dev/null`, `/dev/tty`, `/dev/random`
`b`	Block device	Random-access storage (read/write in blocks)	`/dev/sda`, `/dev/nvme0n1`
`p`	Named pipe (FIFO)	A unidirectional queue between processes	`/run/initctl`, anything from `mkfifo`
`s`	Socket	A bidirectional channel between processes	`/var/run/docker.sock`, `/run/systemd/journal/socket`

# See every type side by side (-d lists /tmp itself rather than its contents)
ls -ld /dev/null /dev/sda /etc/localtime /run/docker.sock /tmp 2>/dev/null
# crw-rw-rw- 1 root root 1, 3 Apr 19 10:01 /dev/null           ← character device
# brw-rw---- 1 root disk 8, 0 Apr 19 10:01 /dev/sda            ← block device
# lrwxrwxrwx 1 root root   35 Apr 19 10:01 /etc/localtime -> ... ← symlink
# srw-rw---- 1 root docker 0 Apr 19 10:01 /run/docker.sock      ← socket
# drwxrwxrwt 1 root root 4096 Apr 19 10:02 /tmp                 ← directory

And you open, read, and (usually) write them all with the same syscalls.

PRO TIP

The first column is not decorative. When you debug "I cannot open this path," checking the type tells you which failure mode to expect. Sockets require connect() or accept(), not just open(). Named pipes block until the other side opens them. Character devices may need ioctl() to configure before read() works. The first ls -l character tells you which rules apply.
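
If you would rather check the type in a script than eyeball ls -l, the shell's test operators map onto the same seven types. file_kind below is a hypothetical helper written for this lesson, not a standard command.

```shell
# file_kind: hypothetical helper that names the type of a path.
file_kind() {
    if   [ -L "$1" ]; then echo "symlink"           # checked first: the other tests follow symlinks
    elif [ -f "$1" ]; then echo "regular file"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -c "$1" ]; then echo "character device"
    elif [ -b "$1" ]; then echo "block device"
    elif [ -p "$1" ]; then echo "named pipe"
    elif [ -S "$1" ]; then echo "socket"
    else echo "missing or unknown"
    fi
}

for path in /etc/hostname /etc /dev/null /run/docker.sock; do
    printf '%s: %s\n' "$path" "$(file_kind "$path")"
done
```

Paths that do not exist on your machine (no Docker, for example) simply report "missing or unknown".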

Regular Files and Directories

These are what most people picture when they say "file."

Regular files are sequences of bytes stored on some backing storage (disk, SSD, tmpfs). Reading advances a position; lseek() moves it. mmap() lets you access them as if they were memory.
Directories are also files: they contain a list of (name, inode) pairs. You cannot read() them the usual way in modern Linux; you use getdents64() via ls, find, or opendir()/readdir(). But they have the same inode structure as regular files, the same permission bits, the same ownership.
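
A quick way to see both halves of that claim, sketched with a throwaway directory (the names a and b are arbitrary, and stat -c assumes GNU coreutils):

```shell
dir=$(mktemp -d)
touch "$dir/a" "$dir/b"

# cat uses plain read(), which the kernel refuses on a directory.
if cat "$dir" >/dev/null 2>&1; then read_result="read succeeded"; else read_result="read failed"; fi
echo "cat on a directory: $read_result"

# ls uses the directory-listing interface instead; -i shows the inode behind each name.
ls -i "$dir"
entry_count=$(ls "$dir" | wc -l)

# The directory itself carries the same metadata as any other file.
dir_type=$(stat -c '%F' "$dir")
stat -c '%F, mode %A, owner %U, inode %i' "$dir"
rm -rf "$dir"
```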

Everything below this section is where Linux gets genuinely interesting.

Device Files: Talking to Hardware Like a Text File

Device files live in /dev and are the kernel's way of exposing hardware (and fake hardware) to userspace.

ls -l /dev | head -15
# crw-rw-rw- 1 root root     1,   3 Apr 19 10:01 null
# crw-rw-rw- 1 root root     1,   5 Apr 19 10:01 zero
# crw-rw-rw- 1 root root     1,   8 Apr 19 10:01 random
# crw-rw-rw- 1 root root     1,   9 Apr 19 10:01 urandom
# brw-rw---- 1 root disk     8,   0 Apr 19 10:01 sda
# brw-rw---- 1 root disk     8,   1 Apr 19 10:01 sda1
# crw--w---- 1 root tty      4,   0 Apr 19 10:01 tty0
# ...

The two numbers (1, 3 for /dev/null; 8, 0 for /dev/sda) are the major and minor device numbers. The major number tells the kernel which driver to use, and the minor number tells that driver which of its devices you mean. When you open("/dev/null"), the kernel looks up character major 1 and finds the mem driver; minor 3 selects the null device, whose read returns immediately with 0 bytes while its write discards everything.
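
You can pull those numbers out without parsing ls output. This sketch assumes GNU stat, which prints the major and minor numbers in hexadecimal via %t and %T, and it assumes /proc/devices is available for the driver lookup; the variable names are illustrative.

```shell
# stat prints device numbers in hex; shell arithmetic converts them to decimal.
null_major=$((0x$(stat -c '%t' /dev/null)))
null_minor=$((0x$(stat -c '%T' /dev/null)))
echo "/dev/null is major $null_major, minor $null_minor"

# /proc/devices lists which driver registered each major number.
# Character and block majors are separate number spaces, so only scan the first section.
driver=$(awk -v major="$null_major" '
    /^Character devices:/ { in_char = 1; next }
    /^Block devices:/     { in_char = 0 }
    in_char && $1 == major { print $2; exit }
' /proc/devices 2>/dev/null) || driver="unknown"
echo "character major $null_major belongs to: $driver"
```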

Character devices

A stream of bytes, usually with no seeking. Terminals, serial ports, random number generators, audio.

# Read 16 random bytes from the kernel's random number generator
head -c 16 /dev/urandom | xxd
# 00000000: a9f1 7c12 8be3 44d7 5b1e 0fa2 3c09 8811  ..|...D.[...<...

# Write to the controlling terminal (same result as a plain echo when stdout is the terminal)
echo "hello" > /dev/tty

# The infinite byte streams
head -c 20 /dev/zero | xxd
# 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
# 00000010: 0000 0000                                ....
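
The two simplest character devices are easy to measure rather than take on faith. A small sketch, with illustrative variable names:

```shell
# Reading /dev/null hits end-of-file immediately: zero bytes come back.
null_bytes=$(wc -c < /dev/null)

# Writing to /dev/null succeeds and the data is thrown away.
echo "discard me" > /dev/null
write_status=$?

# /dev/zero never runs out; you get exactly as many bytes as you ask for.
zero_bytes=$(head -c 1024 /dev/zero | wc -c)

echo "read from /dev/null: $null_bytes bytes"
echo "write to /dev/null:  exit status $write_status"
echo "read from /dev/zero: $zero_bytes bytes"
```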

Block devices

Random-access storage. The block layer handles caching, queuing, and scheduling.

# Read the first 512 bytes (one sector) of a disk — raw
sudo dd if=/dev/sda bs=512 count=1 2>/dev/null | xxd | head
# On an MBR-partitioned disk this sector holds the boot code and partition table; on a GPT disk it is a protective MBR.

# Write zeros to a whole disk (DESTRUCTIVE — never on your real system)
# sudo dd if=/dev/zero of=/dev/sdX bs=1M status=progress

WARNING

Writing to block devices with dd bypasses the filesystem entirely. There is no "undo." Specifying the wrong device (/dev/sda when you meant /dev/sdb) can destroy the OS disk without any warning, and you may not find out until the next reboot. Every senior engineer has at least one story about this. Always lsblk twice before you dd once.
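
If you want to practice block-sized, random-access dd without risking a disk, a regular file can stand in for the device. This is only a sketch: the "disk" here is a 2 KiB scratch file, not a real block device, and the 512-byte sector size is an assumption carried over from the example above.

```shell
img=$(mktemp)

# Build a pretend disk: 4 sectors of 512 zero bytes.
dd if=/dev/zero of="$img" bs=512 count=4 2>/dev/null

# Write into sector 2 only. conv=notrunc stops dd from truncating the rest of the image.
printf 'SECTOR2' | dd of="$img" bs=512 seek=2 conv=notrunc 2>/dev/null

# Read sector 2 back, and confirm sector 0 is still all zeros.
sector2=$(dd if="$img" bs=512 skip=2 count=1 2>/dev/null | head -c 7)
sector0_nonzero=$(dd if="$img" bs=512 count=1 2>/dev/null | tr -d '\000' | wc -c)
img_size=$(stat -c '%s' "$img")

echo "sector 2 starts with: $sector2"
echo "non-zero bytes in sector 0: $sector0_nonzero"
echo "image size: $img_size bytes"
rm -f "$img"
```

The same seek= and skip= arithmetic applies to a real /dev/sdX, which is exactly why a typo there is so expensive.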

Pipes and FIFOs: Files as Communication

A pipe is a one-way byte stream between two processes. When you type ps aux | grep nginx, the shell creates a pipe, attaches ps's stdout to the write end, and grep's stdin to the read end. Both ends are file descriptors. ps does not know it is writing to a pipe; it just calls write(1, buf, n). grep does not know it is reading from a pipe; it just calls read(0, buf, n).

Unnamed pipes (from the | operator or pipe() syscall) disappear when both ends close.
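
You can catch the shell in the act. In this sketch, readlink asks the kernel what its own stdout or stdin points at while it sits inside a pipeline; it assumes /proc is mounted, and the variable names are illustrative.

```shell
# readlink is the left side of a pipeline, so its fd 1 is the write end of a pipe.
stdout_target=$(readlink /proc/self/fd/1 | cat)

# Here readlink is the right side, so its fd 0 is the read end of a pipe.
stdin_target=$(echo | readlink /proc/self/fd/0)

echo "stdout inside a pipeline: $stdout_target"
echo "stdin inside a pipeline:  $stdin_target"
```

Both print something of the form pipe:[N], where N is an inode number the kernel assigned. There is no path, because an unnamed pipe never had one.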

Named pipes (FIFOs) live in the filesystem as p-type entries:

# Create a named pipe
mkfifo /tmp/myfifo
ls -l /tmp/myfifo
# prw-r--r-- 1 admin admin 0 Apr 19 10:30 /tmp/myfifo

# Terminal A: write to it
echo "hello from A" > /tmp/myfifo
# (blocks until someone reads)

# Terminal B: read from it
cat /tmp/myfifo
# hello from A

FIFOs are a crude but reliable way for two processes to talk without a network or a shared database. They are still used in real systems: /run/initctl is a named pipe on Debian-derived distros.
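
The two-terminal exercise above also fits in one script if the writer runs in the background. A minimal sketch; the FIFO name demo.fifo and the message text are arbitrary.

```shell
fifo_dir=$(mktemp -d)
fifo="$fifo_dir/demo.fifo"
mkfifo "$fifo"

# The writer blocks in open() until a reader shows up, so it has to run in the background.
echo "hello from writer" > "$fifo" &

# Opening the FIFO for reading unblocks the writer; cat sees EOF when the writer closes.
message=$(cat "$fifo")
wait

rm -rf "$fifo_dir"
echo "reader got: $message"
```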

Sockets: The One That Needs More Than open()

Network and Unix-domain sockets are files once they exist (they have a file descriptor, they support read()/write()), but creating them requires different syscalls: socket(), bind(), connect(), listen(), accept().

You will see socket files on disk for Unix-domain sockets:

ls -l /var/run/docker.sock /run/systemd/journal/socket 2>/dev/null
# srw-rw---- 1 root docker 0 Apr 19 09:00 /var/run/docker.sock
# srw-rw-rw- 1 root root   0 Apr 19 09:00 /run/systemd/journal/socket

Once a process has called accept() or connect() and gotten a file descriptor back, it can use plain read() and write() on it, which is exactly why shell tricks like this work:

# Bash's special /dev/tcp syntax — not a real path, bash intercepts it
exec 3<>/dev/tcp/example.com/80
printf 'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n' >&3
cat <&3
# HTTP/1.1 200 OK
# ...

Bash is faking a path to look like a file, then using file descriptor 3 to talk HTTP over a TCP socket. The kernel did not invent this; bash did. But it works because once you have a file descriptor, it behaves like a file.

KEY CONCEPT

Sockets are the clearest place where "everything is a file" almost holds. The API for using them after creation is the file API. But creating them needs socket-specific syscalls, and some operations (like peeking at incoming data without consuming it) need socket-specific flags. This is the general pattern: the file API is the lowest common denominator, and special cases use ioctl() or dedicated syscalls.
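
The "creation is different" half is easy to observe from a shell, because shell redirection can only call open(). can_open below is a hypothetical helper, and the search under /run is an assumption about where your system keeps its socket files; if none are visible, the script says so and moves on.

```shell
# can_open: hypothetical helper that tries a plain open() for reading via redirection.
can_open() {
    if ( exec 3< "$1" ) 2>/dev/null; then echo "open() ok"; else echo "open() refused"; fi
}

# Pick any socket file we can see (there may be none, for example in a minimal container).
sock=$(find /run -maxdepth 2 -type s 2>/dev/null | head -n 1) || sock=""

if [ -n "$sock" ]; then
    # Expect a refusal: open() does not work on a socket, and permissions may deny it too.
    echo "$sock: $(can_open "$sock")"
else
    echo "no socket files visible under /run"
fi
echo "/dev/null: $(can_open /dev/null)"
```

A refusal on the socket path is not a permissions puzzle to chase; it is the kernel telling you to use socket() and connect() instead.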

/proc and /sys: Files That Are Really the Kernel

/proc and /sys are pseudo-filesystems: the files in them are not stored anywhere. Reading /proc/loadavg calls a kernel function that generates the current output on the fly.

# The process list — one directory per running PID
ls /proc | head
# 1      <- PID 1 (init/systemd)
# 1234
# cmdline
# cpuinfo
# loadavg
# ...

# Current load average
cat /proc/loadavg
# 0.52 0.68 0.71 2/485 48923
#                    ^ running/total processes, last PID

# Mounted filesystems
cat /proc/mounts | head -5

# Everything about PID 1
ls /proc/1/
# cmdline     cwd         environ     fd          maps        status
# exe         io          limits      mem         mountinfo   stat

# The kernel version
cat /proc/version
# Linux version 6.5.0-generic ...

# Memory info
cat /proc/meminfo | head
# MemTotal:       32893400 kB
# MemFree:         1823440 kB
# MemAvailable:   18435212 kB
# ...
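
Because these are plain text files, ordinary shell tools are enough to turn them into numbers. The sketch below assumes a kernel that reports MemAvailable in /proc/meminfo; the variable names are illustrative.

```shell
# /proc/loadavg is five whitespace-separated fields; read splits them for us.
read -r load1 load5 load15 runq lastpid < /proc/loadavg
running=${runq%/*}      # part before the slash
total=${runq#*/}        # part after the slash

# Percentage of memory the kernel considers available, from two lines of /proc/meminfo.
mem_pct=$(awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2}
               END { if (t > 0) printf "%d", a * 100 / t; else print 0 }' /proc/meminfo)

echo "1-minute load: $load1 ($running running of $total)"
echo "memory available: ${mem_pct}%"
```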

/sys is the newer, more structured equivalent. It exposes devices, drivers, and kernel subsystems as a tree of directories.

# Every network interface
ls /sys/class/net/
# eth0  lo  docker0  ...

# Bytes received on eth0 since boot
cat /sys/class/net/eth0/statistics/rx_bytes
# 48293012

# Every block device's queue settings
ls /sys/block/sda/queue/
# scheduler  rotational  read_ahead_kb  nr_requests  ...
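
Since every counter is its own small file, a loop over the directory tree is all the "monitoring agent" you need for a first look. A sketch that assumes /sys is mounted and simply skips anything it cannot read; iface_count is an illustrative name.

```shell
iface_count=0
for dir in /sys/class/net/*; do
    # Skip the unexpanded glob (no interfaces) and anything without readable counters.
    [ -r "$dir/statistics/rx_bytes" ] || continue
    name=${dir##*/}
    rx=$(cat "$dir/statistics/rx_bytes")
    tx=$(cat "$dir/statistics/tx_bytes")
    printf '%-10s rx=%s tx=%s bytes\n' "$name" "$rx" "$tx"
    iface_count=$((iface_count + 1))
done
echo "interfaces seen: $iface_count"
```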

WAR STORY

A Kubernetes cluster started losing nodes during heavy I/O bursts. dmesg showed nothing interesting. Someone noticed CPU steal time was high on exactly the affected nodes. The fix came from /sys/block/nvme0n1/queue/scheduler: the default mq-deadline scheduler was saturating one CPU per NVMe device. Switching to none (appropriate for NVMe) via a simple echo none > /sys/block/nvme0n1/queue/scheduler fixed it. The whole diagnosis and fix was file I/O against /sys. No reboots, no packages, no config management.
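
For reference, reading that same file lists every available scheduler with the active one in square brackets. The sketch below only reads; active_scheduler is a hypothetical helper, and which devices and schedulers appear depends entirely on your machine.

```shell
# active_scheduler: hypothetical helper that extracts the bracketed (active) entry.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue                 # no block devices visible, or no permission
    line=$(cat "$f")
    printf '%s: active=%s (choices: %s)\n' "$f" "$(active_scheduler "$line")" "$line"
done
```

Changing the scheduler is the write-side counterpart (an echo into the same file, as root), and it does not persist across reboots unless something reapplies it.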

File Descriptors: Where Your Process Keeps Its Open Files

When you open() a file, the kernel hands you back a small integer, the file descriptor (fd). Your process has a table mapping fd numbers to underlying kernel file objects.

By convention, every process starts with three already open:

fd 0: stdin
fd 1: stdout
fd 2: stderr

You can see any process's open file descriptors:

# What is PID 1 holding open?
ls -l /proc/1/fd | head -10
# lrwx------ 1 root root 64 Apr 19 09:00 0 -> /dev/null
# lrwx------ 1 root root 64 Apr 19 09:00 1 -> /dev/null
# lrwx------ 1 root root 64 Apr 19 09:00 10 -> socket:[9824]
# lr-x------ 1 root root 64 Apr 19 09:00 100 -> anon_inode:inotify
# ...

# Human-friendly tool that covers all processes
sudo lsof -p 1 | head -15

# Every open file in the system (slow but useful)
sudo lsof | wc -l

The socket:[9824] and anon_inode:inotify entries show processes holding open things that do not live anywhere in the filesystem tree. They are still files from the process's point of view (they have file descriptors and accept the file API), but they have no path.
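
You can watch a descriptor appear in and vanish from your own shell's table. A minimal sketch: fd 9 is an arbitrary choice of an unused number, and the scratch file comes from mktemp.

```shell
scratch=$(mktemp)

exec 9> "$scratch"                         # open(): the shell itself now owns fd 9
fd_target=$(readlink "/proc/$$/fd/9")      # the kernel reports where fd 9 points
echo "written through fd 9" >&9            # write() on fd 9
exec 9>&-                                  # close(): the table entry goes away

if [ -e "/proc/$$/fd/9" ]; then fd_state="still open"; else fd_state="closed"; fi
contents=$(cat "$scratch")
rm -f "$scratch"

echo "fd 9 pointed at: $fd_target"
echo "after close:     $fd_state"
echo "file contents:   $contents"
```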

PRO TIP

lsof is one of the most useful Linux tools you will ever learn. When a filesystem will not unmount ("device or resource busy"), lsof +D /mount/point lists every process holding a file open there. When a port is in use, lsof -i :8080 shows exactly which process owns it. Learn lsof before you need it; you will need it at 2 AM some day.
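
On a box where lsof is not installed, the same question can be answered by walking /proc by hand. who_has_open is a hypothetical helper and a rough stand-in, not a replacement: it only matches exact paths, and without root it only sees processes you own.

```shell
# who_has_open: hypothetical helper that prints PIDs with the given path open.
who_has_open() {
    target=$1
    for fd in /proc/[0-9]*/fd/*; do
        [ "$(readlink "$fd" 2>/dev/null)" = "$target" ] || continue
        pid=${fd#/proc/}        # strip the leading /proc/
        echo "${pid%%/*}"       # keep only the PID component
    done | sort -un
}

held=$(mktemp)
exec 9< "$held"                                    # this shell now holds the file open
holders=$(who_has_open "$(readlink -f "$held")")   # the list may include short-lived subshells
exec 9<&-
rm -f "$held"

echo "PIDs holding the file:"
echo "$holders"
```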

Where the Abstraction Leaks

"Everything is a file" is elegant, but it is not universal. A few honest exceptions:

System V IPC (message queues, semaphores, shared memory from the 1980s): uses its own ipcs / ipcrm tooling and separate syscalls. Largely obsolete; newer code uses POSIX shared memory (shm_open) or mmap, which is file-like.
eventfd, timerfd, signalfd, inotify, epoll: these create file descriptors for things that traditionally were not files (events, timers, signals). This is Linux slowly pulling more and more into the file API, but none of them have a path on disk, so "everything is a file" requires a generous definition of "file."
Sockets need special syscalls to create: covered above. Once made, they are file-like.
ioctl() exists for a reason: any time a device needs a configuration operation that does not fit read/write/open/close, it exposes it as an ioctl. Terminal baud rates, loop device setup, network interface flags: all ioctl. It is the "catch-all syscall that does not fit the model," and its existence is proof that the model is not perfect.
mmap() blurs the line: once you mmap a file, you access its contents as memory, not through read/write syscalls. Is that still "the file API"? Kind of. It is the same file, the same fd, but the access pattern is totally different.

None of these break the model; they just clarify it. The right mental model is: the file API is the default, and Linux reaches for it whenever it possibly can. Deviations are reluctant, and they usually happen because the underlying resource genuinely does not fit the stream-of-bytes abstraction.

Why This Matters in Production

ls -l is a diagnostic tool. Before you debug "I cannot talk to this thing," check what kind of thing it is. A socket you expected to be a regular file means a service is running; a regular file you expected to be a socket means the service is down.
Everything is composable. tail -f /var/log/syslog | grep ERROR | tee /dev/tty | mail -s "error" oncall@… works because every piece is a file. Pipelines are the original microservices.
/proc and /sys are free instrumentation. Most Linux subsystems you will ever tune have knobs under /sys. You do not need special APIs; you need cat and echo.
lsof and /proc/$PID/fd are your X-ray. When a process "will not let go" of a file or port, these tell you why.
Docker sockets, journald sockets, nginx sockets. Most service-to-service interaction on a host goes through Unix-domain sockets in /run. Knowing they are files tells you how to debug them (ls, strace, lsof) and how to secure them (chmod, SELinux).

Key Concepts Summary

"Everything is a file" = "everything uses the file API." The VFS routes your read/write syscalls to whatever backend (ext4, procfs, sockfs, devtmpfs) implements them.
Seven file types, identified by the first character of ls -l: regular (-), directory (d), symlink (l), char device (c), block device (b), FIFO (p), socket (s).
Character vs block devices. Character = stream of bytes (terminals, RNG). Block = random-access storage (disks). Major/minor numbers identify the driver.
Pipes are files too. Named pipes (FIFOs) have a path in the filesystem with type p. Anonymous pipes from the shell are just fds with no path.
Sockets are almost files. You create them with socket()/bind()/connect(), but once you have the fd, you use read and write like any other file.
/proc and /sys are synthetic. Reading from them calls kernel code that generates the output. No disk involved.
File descriptors are how a process holds open files. /proc/$PID/fd/ and lsof expose them. fd 0/1/2 are stdin/stdout/stderr by convention.
The abstraction leaks at the edges. ioctl(), sockets, and various *fd syscalls exist because not everything fits the read/write model cleanly.

Common Mistakes

Assuming files are bytes on disk. /proc/loadavg is a file. The current CPU temperature is a file. A TCP connection, once established, is a file.
Using open() on a socket path and wondering why it does not behave like a TCP connection. You need socket() + connect(), not open().
Being surprised that cat /dev/null returns instantly. /dev/null's read is implemented to return 0 bytes immediately. That is the whole driver.
Trying to cp a directory without -r. Directories are files, but the file API for them is readdir(), not read(). cp knows this; dd does not.
Writing to /sys or /proc files and expecting atomic semantics. Some are atomic (a single echo to /proc/sys/kernel/SOMETHING), some are not. Read the specific subsystem's docs.
Leaving file descriptors open and then wondering why a filesystem will not unmount. lsof +D /mnt is the 3 AM answer.
Confusing "everything is a file" with "everything is on disk." Most files in /proc, /sys, /dev, /run, and (on many distros) /tmp are not on disk at all. They live in RAM or are generated on the fly.

KNOWLEDGE CHECK

You run `ls -l /var/run/docker.sock` and see `srw-rw---- 1 root docker 0 Apr 19 09:00 /var/run/docker.sock`. A script you wrote tries `open('/var/run/docker.sock', 'r')` and fails with 'No such device or address.' What is going on?
