Docker & Container Fundamentals

Volumes and Data

At 2 AM, a developer runs docker-compose down -v to "clean up" before a fresh start. At 2:01 AM they realize the -v flag means "remove volumes", including the one backing Postgres. By 2:02 they are staring at an empty /var/lib/postgresql/data, and a week's worth of development data is gone. They start typing "docker volume recover" into Google.

Container storage seems simple: write to disk, the disk keeps your data. It is not simple. A container's writes can land in one of four places (writable layer, bind mount, named volume, tmpfs), and each has different rules about when data persists and who owns it. Getting this wrong manifests as vanished database data, permission-denied errors, silent disk fill-ups, and mysteriously slow I/O. This lesson explains all four, when to use each, and the small number of patterns that keep your data safe.


Where Container Writes Actually Go

A container's root filesystem is an OverlayFS mount, but any given write can end up in one of four places:

  1. Writable upper layer (the default). Anything written to a path that is not a mount point lands here. Destroyed when the container is removed.
  2. Bind mount. A host directory or file grafted into the container at a specific path. Persists on the host, not tied to the container's lifecycle.
  3. Named volume. A Docker-managed storage location, typically under /var/lib/docker/volumes/. Persists until explicitly removed with docker volume rm.
  4. tmpfs. A RAM-backed filesystem mount. Fast, ephemeral, counts against container memory limits.
KEY CONCEPT

The rule: any data you want to survive container removal must live in a bind mount or a named volume. The writable layer is convenient for everything else, but it is wiped on docker rm. A single line of config separates "stored properly" from "lost forever", which is what catches teams out; always check where your app writes data and confirm that path is mounted somewhere persistent.


The Writable Layer: Fast but Ephemeral

docker run -it --name temp alpine sh
# Inside the container
echo "notes" > /root/note.txt
cat /root/note.txt
exit

# Container is stopped but not removed
docker start -ai temp
cat /root/note.txt
# notes    ← still here because the container was only stopped

# Now remove
docker rm temp
# The note is gone forever. The overlay diff was deleted.

The writable layer is where uncaptured writes go. Logs that the app writes to /var/log/app.log without a volume. Caches the app creates in /tmp. Python __pycache__ files. All of it lives in the container's overlay diff/ directory and is deleted with docker rm.

This is usually what you want for ephemeral state. It becomes a problem when you accidentally put something important there.


Bind Mounts: Host Path → Container Path

# Mount the host's /home/user/src into /app inside the container
docker run --rm -v /home/user/src:/app -w /app node:20 npm test

# Read-only
docker run --rm -v /etc/ssl/certs:/etc/ssl/certs:ro alpine ls /etc/ssl/certs

# The more-explicit --mount form
docker run --rm \
    --mount type=bind,source=/home/user/src,target=/app,readonly \
    -w /app node:20 npm test

A bind mount grafts a specific host directory or file into the container at the target path. The container sees its /app; the host still sees its /home/user/src. Changes are instantly visible on both sides — it is the same inode.

When to use bind mounts

  • Local development. Mount your source tree into the container, run npm run dev or python -m flask run, edit files on the host, watch the container pick up changes immediately.
  • Config injection. Mount a single config file into the container (-v /etc/myapp/config.yaml:/etc/app/config.yaml:ro).
  • Host paths the container needs to see. /var/run/docker.sock (for tools that talk to Docker), /etc/hosts, /sys for system tooling.
  • Persistent data in small, simple setups. On a single server, you can bind-mount /srv/myapp/data into the container. Backups are just tarballs of the host path.
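The dev-loop and config-injection cases combine naturally in Compose. A minimal docker-compose.yml sketch; the service name, port, and paths are illustrative, not from this lesson:

```yaml
services:
  web:
    image: node:20
    working_dir: /app
    command: npm run dev
    ports:
      - "3000:3000"
    volumes:
      - ./src:/app                          # bind mount: edit on host, container sees it live
      - ./config.yaml:/app/config.yaml:ro   # single-file, read-only config injection
```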

Common pitfalls

Host path does not exist

docker run --rm -v /does/not/exist:/app alpine ls /app
# Docker creates /does/not/exist on the host (as root) and mounts it empty.
# The container sees an empty directory, not an error.

With -v, Docker silently creates missing bind-mount source paths as root (the stricter --mount form refuses to start instead). This is why a typo in your compose file leads to "my app's config is empty" instead of a clear error.

Permission mismatches

# Your host user is UID 1000
ls -ld /home/user/src
# drwxr-xr-x 20 user user 4096 Apr 20 10:00 /home/user/src

# The image's USER is node (UID 1000 — happens to match on Node image)
docker run --rm -v /home/user/src:/app node:20 ls -l /app
# Works fine — UIDs match.

# The image's USER is nobody (UID 65534) or root (UID 0)
docker run --rm -v /home/user/src:/app --user nobody alpine touch /app/newfile
# touch: /app/newfile: Permission denied

The container sees host files with their host UIDs, not "translated." If the container's user does not have permission on the host path, writes fail. Fixes:

  • Run the container as a UID that matches the host file ownership: --user $(id -u):$(id -g).
  • chown the host path to a UID the container uses.
  • Use a named volume instead (Docker can initialize it with the image's ownership).
  • Use user namespaces (--userns) to remap — more advanced.
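The first fix translates to Compose as a one-line user: key. A sketch with an illustrative UID/GID; substitute the output of id -u and id -g on your host:

```yaml
services:
  app:
    image: node:20
    user: "1000:1000"   # match the host owner of ./src; find yours with `id -u` / `id -g`
    working_dir: /app
    volumes:
      - ./src:/app
```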
WARNING

On macOS and Windows with Docker Desktop, bind mounts are proxied through a VM's shared filesystem. This is slow — often 10-50× slower than native — especially for many small files (npm install, Python site-packages, Git operations). Workarounds: use named volumes for node_modules/venv/vendor dirs (not bind mounts), use Docker Desktop's "VirtioFS" sharing mode (enabled by default in recent versions), or use WSL2 on Windows where Linux paths are native.

Overlay a single file

# Mount just one file
docker run --rm -v /etc/myconfig.yaml:/etc/app/config.yaml:ro myapp

Critical detail: if the host-side source file does not exist, Docker (with -v) creates it as a directory and mounts that directory into the container, so the app finds an empty directory where it expected a file and you get confusing errors. Make sure the host file exists before starting the container.


Named Volumes: Docker-Managed Storage

# Create explicitly
docker volume create mydata

# Or let Docker create on first use
docker run -d --name db \
    -v mydata:/var/lib/postgresql/data \
    postgres:16

# List
docker volume ls
# DRIVER    VOLUME NAME
# local     mydata

# Inspect to see where Docker stores it
docker volume inspect mydata
# [
#     {
#         "Name": "mydata",
#         "Mountpoint": "/var/lib/docker/volumes/mydata/_data",
#         "Driver": "local",
#         ...
#     }
# ]

# Remove (DESTROYS THE DATA)
docker volume rm mydata

Named volumes are stored by Docker at /var/lib/docker/volumes/<name>/_data. You interact with them by name; Docker manages the filesystem layout.

When to use named volumes

  • Databases. Postgres, MySQL, Mongo, Redis — the canonical use. The volume survives container restarts, upgrades, and recreations.
  • Dependency caches in dev loops. Mount a named volume at /app/node_modules so deps are persistent and the host OS cannot interfere (especially important on Docker Desktop).
  • Any state you do not want bound to a specific host path. Portable across Docker hosts if you back the volume up and restore it.
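The database case above, sketched as a Compose file (service and volume names are illustrative). Declaring the volume at the top level means docker compose down leaves it intact unless you pass -v:

```yaml
services:
  db:
    image: postgres:16
    volumes:
      - mydata:/var/lib/postgresql/data   # named volume, not a host path

volumes:
  mydata:   # Docker-managed; survives `down`, destroyed only by `down -v` or `docker volume rm`
```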

Named volumes and first-time population

A useful named-volume behavior: when a volume is empty and mounted at a path in the image that already contains files, Docker copies the image's files into the volume on first use.

# Whatever the image ships at the mount path is copied into the empty volume.
# (The mariadb entrypoint also initializes /var/lib/mysql on first start,
# and that output lands in, and persists in, the 'dbdata' volume.)
docker run -d --name db -v dbdata:/var/lib/mysql mariadb:10.11

docker run --rm -v dbdata:/data alpine ls /data
# ibdata1  ib_logfile0  ib_logfile1  mysql  performance_schema  ...

This is different from bind mounts, which always show only the host path's contents (the image's files at that path are hidden).

PRO TIP

Use named volumes for anything the image initializes. Pointing a bind mount at /var/lib/mysql hides whatever the image ships at that path and exposes the host directory with the host's ownership, which commonly breaks first-run initialization (the database user cannot write to or chown the datadir). A named volume is empty the first time, gets populated from the image with the image's ownership, and persists across restarts. Tutorials that use bind mounts for databases tend to include hand-crafted init steps to paper over this.


tmpfs: RAM-Backed Scratch

# The dedicated --tmpfs flag (Linux)
docker run --rm --tmpfs /tmp:rw,size=100m alpine sh

# With --mount
docker run --rm --mount type=tmpfs,target=/tmp,tmpfs-size=100m alpine sh

Inside the container, /tmp is a tmpfs capped at 100 MB. Writes are in RAM. The tmpfs evaporates when the container stops.

When to use tmpfs

  • High-churn caches that the container generates and consumes (build caches, temp files).
  • Secrets at runtime: mount a tmpfs, write a secret into it, use it, and let it vanish when the container stops.
  • Pair with --read-only to give the container a writable /tmp while the rest of the root is immutable:
docker run --rm --read-only --tmpfs /tmp --tmpfs /run alpine sh -c 'touch /etc/x'
# touch: /etc/x: Read-only file system    ← expected
# Inside /tmp and /run, writes work because of tmpfs

Trade-off: tmpfs counts against the container's memory limit. A 1 GB tmpfs with a 512 MB memory limit will OOM-kill the container once the tmpfs fills.
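The read-only-plus-tmpfs pattern can also be expressed in Compose, using the long mount syntax so the size cap is explicit. The image name is hypothetical:

```yaml
services:
  app:
    image: myapp          # hypothetical image
    read_only: true       # root filesystem is immutable
    volumes:
      - type: tmpfs
        target: /tmp
        tmpfs:
          size: 104857600  # 100 MB cap, in bytes; still counts against any memory limit
      - type: tmpfs
        target: /run
```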


The Four Places Compared

| Feature | Writable layer | Bind mount | Named volume | tmpfs |
|---|---|---|---|---|
| Persistence | Until `docker rm` | Host's lifecycle | Until `docker volume rm` | Container stop |
| Storage location | `/var/lib/docker/overlay2/<id>/diff` | Any host path | `/var/lib/docker/volumes/<name>` | RAM |
| Portability across hosts | No | Yes (same path) | Via backup/restore | No |
| Host ownership | Docker's | Host user | Docker's | Docker's |
| Best for | Ephemeral writes | Dev loops, configs | Databases, stateful apps | Scratch, secrets |
| Performance | Fast | Slow on macOS/Win | Native speed | RAM speed |
| Can be mounted read-only | N/A | Yes (`ro`) | Yes (`ro`) | Yes |
| Shows image's files when empty | N/A | No (hidden) | Yes (populated) | No |

Backing Up and Restoring Volumes

Since volumes are just directories, a portable backup is a tarball:

# Back up 'mydata' to ./mydata.tar.gz in the current directory
docker run --rm \
    -v mydata:/data \
    -v $(pwd):/backup \
    alpine \
    tar czf /backup/mydata.tar.gz -C /data .

# Restore from tarball into a (possibly new) volume 'newdata'
docker run --rm \
    -v newdata:/data \
    -v $(pwd):/backup \
    alpine \
    tar xzf /backup/mydata.tar.gz -C /data
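Before trusting either direction, it is worth a sanity check that a tarball round-trips. This sketch uses plain host directories and needs no Docker:

```shell
# Sanity-check the tar round-trip with throwaway directories
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/dst"
echo "hello" > "$workdir/src/file.txt"

# Same tar invocations as the volume backup/restore above
tar czf "$workdir/backup.tar.gz" -C "$workdir/src" .
tar xzf "$workdir/backup.tar.gz" -C "$workdir/dst"

# Identical trees mean the backup restores cleanly
diff -r "$workdir/src" "$workdir/dst" && echo "round-trip OK"
rm -rf "$workdir"
```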

For databases specifically, use the database's native backup tool (pg_dump, mysqldump) — it gives consistent snapshots while the DB is running, and the backup is portable to different DB versions. Tarring /var/lib/postgresql/data from a running Postgres risks a torn, unrecoverable backup.
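A minimal sketch of the native-tool approach for Postgres, assuming a running container named db, a database mydb, and the postgres user (all hypothetical names); it requires a running Docker daemon:

```shell
# Logical backup: pg_dump gives a consistent snapshot even while the DB serves traffic
docker exec db pg_dump -U postgres mydb | gzip > mydb.sql.gz

# Restore into a (possibly new) database
gunzip -c mydb.sql.gz | docker exec -i db psql -U postgres mydb
```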

WAR STORY

A team "backed up" their production Postgres volume by tarring its data directory every night. When they tried to restore after a hardware failure, the restored database refused to start — "invalid checkpoint record." Turns out tarring live Postgres data produces a snapshot Postgres cannot recognize as valid WAL state. The actual fix: switch to pg_dump/pg_basebackup + WAL archiving. Lesson: a volume is just a directory; "backing up the directory" is not the same as "backing up the database." Always use the app's native backup tooling for stateful services.


Volume Drivers: Beyond Local Storage

The default volume driver is local (stores in /var/lib/docker/volumes). Other drivers exist for NFS, cloud storage, and distributed systems:

# NFS volume
docker volume create --driver local \
    --opt type=nfs \
    --opt o=addr=10.0.0.5,rw \
    --opt device=:/exports/data \
    nfs-data

# Then mount normally
docker run -v nfs-data:/app/data myapp

For production multi-host setups, use an orchestrator's storage abstraction (Kubernetes PersistentVolume + CSI drivers). Docker's local driver is fine for single-host; beyond that, you want a proper storage layer.
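The same NFS volume can also be declared in a Compose file; the driver options below mirror the docker volume create flags above:

```yaml
volumes:
  nfs-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.0.0.5,rw
      device: ":/exports/data"
```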


Cleaning Up

# Remove all stopped containers (their anonymous volumes are NOT removed by prune)
docker container prune -f

# Remove unused volumes (Docker 23+ prunes only anonymous volumes by default; add --all for named)
docker volume prune -f

# Remove everything: stopped containers, unused images, unused volumes, build cache
# DANGEROUS on shared machines; read the prompt
docker system prune -a --volumes

# See what docker is using disk-wise
docker system df
# TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
# Images          35        8         12.3GB    8.1GB (65%)
# Containers      8         3         234MB     100MB (42%)
# Local Volumes   12        4         890MB     600MB (67%)
# Build Cache     0         0         0B        0B

Volumes are NOT removed by docker rm <container> unless you pass -v:

docker rm -v <container>       # removes anonymous volumes it created
docker rm <container>          # leaves anonymous volumes behind

Named volumes are never removed by docker rm — they belong to the user, not the container. This is by design (safer defaults), but also why "dangling" volumes accumulate.
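Before pruning, you can preview what is actually unused. Both filters below are standard Docker CLI flags; they require a running daemon, so output varies by machine:

```shell
# List volumes no container (running or stopped) references
docker volume ls --filter dangling=true

# Find which containers use a particular volume before removing it
docker ps -a --filter volume=mydata
```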


Inspecting a Container's Storage

# See every mount on a container
docker inspect demo --format='{{json .Mounts}}' | jq
# [
#   {
#     "Type": "volume",
#     "Name": "dbdata",
#     "Source": "/var/lib/docker/volumes/dbdata/_data",
#     "Destination": "/var/lib/postgresql/data",
#     "RW": true,
#     ...
#   },
#   {
#     "Type": "bind",
#     "Source": "/home/user/src",
#     "Destination": "/app",
#     "RW": true,
#     ...
#   }
# ]

# See the OverlayFS layout
docker inspect demo --format='{{json .GraphDriver.Data}}' | jq
# {
#   "LowerDir": "/var/lib/docker/overlay2/<hash>/l/XXX:...",
#   "UpperDir": "/var/lib/docker/overlay2/<hash>/diff",
#   "WorkDir":  "/var/lib/docker/overlay2/<hash>/work",
#   "MergedDir":"/var/lib/docker/overlay2/<hash>/merged"
# }

Key Concepts Summary

  • Four storage destinations. Writable layer (ephemeral), bind mount (host path), named volume (Docker-managed), tmpfs (RAM).
  • Only bind mounts and named volumes persist across container removal. Everything else is wiped.
  • Bind mounts expose host paths. Great for dev and config; watch out for permissions and Docker Desktop slowness.
  • Named volumes are Docker-managed. Great for databases and any state the image initializes.
  • First-time population. Named volumes are populated from the image on first use; bind mounts hide the image's content at the target path.
  • tmpfs for scratch and read-only patterns. RAM-backed, counts against memory limits.
  • Always use the database's native backup tool for stateful services — tarring the data directory is not a backup.
  • Volumes are not auto-deleted with containers. Use docker rm -v for anonymous volumes; explicit docker volume rm for named.
  • --mount is the explicit form of -v. Use it in production for clarity.
  • docker system df shows disk usage; docker system prune --volumes reclaims unused (carefully).

Common Mistakes

  • Running a database with no volume, losing all data on docker rm. docker run -d postgres with no -v is a test-only pattern.
  • Using a bind mount on a database's data dir, breaking the image's initialization. Use a named volume.
  • Typo'ing a bind-mount host path. Docker silently creates the directory as root; your app sees empty data.
  • Running the container as a user that doesn't have permission on the host-side bind-mount path. Use --user $(id -u) or chown.
  • Using bind mounts for node_modules / .venv on Docker Desktop macOS/Windows. Named volume is 10-50× faster.
  • Calling docker-compose down -v without realizing -v deletes the named volumes. Backup first!
  • Assuming --rm removes volumes. It only removes anonymous volumes; named volumes stay.
  • Forgetting that tmpfs counts against the container's memory limit. A 2 GB tmpfs with a 1 GB limit crashes quickly.
  • Not backing up volumes. Docker doesn't do it for you; tarballs or native DB tools are yours to manage.
  • Bind-mounting a config file whose host-side source is missing. -v /host/config.yaml:/etc/app/config.yaml with no such file on the host makes Docker create /host/config.yaml as a directory and mount it, so the app sees an empty directory instead of its config.

KNOWLEDGE CHECK

Your teammate runs `docker run -d --name pg -p 5432:5432 postgres:16` and loads production-seeding data into it. Two days later, after a `docker rm -f pg` and a `docker run -d --name pg postgres:16`, the data is gone. Where did it go, and what was the correct setup?