The Boot Sequence
Your production Kubernetes node takes 4 minutes to come back after a reboot. The cluster autoscaler gives up and provisions a new one. You SSH into an identical healthy node and run systemd-analyze blame. The top entry says 90 seconds in NetworkManager-wait-online.service. You dig deeper: the node was configured with two NICs, one of which is unplugged in the rack. NetworkManager is patiently waiting for a carrier on a cable that does not exist. One systemctl mask NetworkManager-wait-online and the boot is down to 40 seconds. But you would never have found that without understanding the sequence of stages a Linux box goes through between power-on and login — and which tools let you see into each stage.
Boot issues are special. Before init is running, you have no logs in the normal sense. Before the filesystem is mounted, you have no tools. Before the network is up, you cannot SSH in. The only way to debug the boot is to know what happens in what order, and which artifacts each stage leaves behind. This lesson maps the full journey so the next time a box will not come back up, you know exactly where to look.
The Full Journey in One Picture
Linux Boot: From Power Button to Login Prompt
- Firmware. Runs from ROM the moment the CPU powers on. Initializes RAM, CPU, and peripherals (POST). Finds a bootable device in the configured boot order. On UEFI, reads a signed .efi binary from the EFI System Partition. Ends by handing control to the bootloader.
- Bootloader. Shows the boot menu (if enabled). Reads kernel parameters. Loads the kernel image (vmlinuz) and initramfs into RAM. Jumps to the kernel entry point. Typical cost: a few hundred ms.
- Kernel. Decompresses itself, sets up virtual memory, initializes the scheduler, detects and initializes drivers, mounts the initramfs as a temporary root filesystem. Everything printed here ends up in dmesg.
- initramfs. A minimal in-RAM root filesystem. Its only job: find and mount the real root. This is where LVM gets assembled, LUKS volumes get unlocked, network storage gets mounted, NVMe drivers get loaded. Failures here end at an (initramfs) prompt.
- Switch to real root. The initramfs pivots to the real root filesystem (pivot_root or switch_root). The initramfs is freed. Control passes to /sbin/init — on almost every modern distro, that is systemd.
- systemd. systemd reads /etc/systemd/system/default.target, calculates the dependency graph, and starts every unit required in parallel. Mounts filesystems, starts networking, brings up services. This is where most of your boot time lives.
- Login. Once default.target is reached, login prompts appear on the console (getty), SSH daemons accept connections (sshd), graphical login managers start (gdm/sddm). The system is now up.
Each layer in that stack is a different world with its own tools. Let us walk through them.
Stage 1: Firmware (BIOS or UEFI)
When you press the power button, the CPU starts executing code from a fixed address in ROM. That code is the firmware — BIOS on older hardware, UEFI on anything modern.
The firmware's job:
- POST (Power-On Self-Test). Check RAM, CPU, peripherals. Beep codes or blinking LEDs if something is wrong.
- Initialize basic hardware. Memory controller, PCIe bus, USB, disk controllers.
- Find a boot device. Walk the configured boot order (disk, network, USB) and look for something bootable.
- Hand off control.
On BIOS (legacy), bootable means "the first 512 bytes of the disk end in 0x55AA." Those 512 bytes are the MBR (Master Boot Record). It contains a tiny bootloader stub.
On UEFI (modern), bootable means "there is an EFI System Partition (ESP) with a signed .efi file the firmware knows how to execute." The firmware loads that file directly — no 512-byte limit, no chain of stages.
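That 0x55AA signature is easy to inspect by hand. A minimal sketch — it builds a fake MBR in a temp file so it is safe to run anywhere; on a real machine you would read the first sector of the actual boot disk instead (e.g. dd if=/dev/sda):

```shell
# Build a fake 512-byte "MBR" and check its boot signature -- the same
# two-byte check a legacy BIOS performs on byte offsets 510-511.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null  # \125\252 octal = 0x55 0xAA
sig=$(dd if="$img" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "$sig"    # 55aa means the BIOS would consider this disk bootable
rm -f "$img"
```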
# Are you booted via BIOS or UEFI?
[ -d /sys/firmware/efi ] && echo "UEFI" || echo "BIOS"
# On UEFI, list the EFI boot entries
efibootmgr -v
# BootCurrent: 0001
# Boot0000* Ubuntu HD(...)/File(\EFI\ubuntu\shimx64.efi)
# Boot0001* ubuntu HD(...)/File(\EFI\ubuntu\grubx64.efi)
# ...
# See the ESP
mount | grep efi
# /dev/nvme0n1p1 on /boot/efi type vfat (rw,relatime,...)
ls /boot/efi/EFI/
# BOOT ubuntu Microsoft ...
The difference between BIOS and UEFI matters at exactly two moments in your life: (1) installing a new OS, and (2) debugging a machine that will not boot. For everything in between, you can forget which one you have. But when debugging, knowing which firmware model you are under determines whether you look at MBR/GRUB or at the ESP and efibootmgr.
Stage 2: The Bootloader
The bootloader is the first software from "your" operating system that runs. On Linux, that is usually GRUB (Grand Unified Bootloader) or, increasingly, systemd-boot.
Its job:
- Show a menu (if configured).
- Read its config (/boot/grub/grub.cfg for GRUB).
- Load the kernel image (vmlinuz-*) and the initramfs (initrd-* or initramfs-*) into RAM.
- Pass a set of kernel command-line parameters (the line starting with linux in the GRUB menu).
- Jump to the kernel entry point.
# What command-line parameters did the running kernel get?
cat /proc/cmdline
# BOOT_IMAGE=/boot/vmlinuz-6.5.0-generic root=UUID=3f... ro quiet splash
# Kernel parameters you will meet in production:
# root=UUID=… — which partition is /
# ro — mount root read-only initially (fsck-safe)
# quiet — suppress kernel messages during boot
# console=ttyS0,115200 — send console to a serial port (cloud VMs, servers)
# init=/bin/sh — RESCUE: bypass systemd, get a shell as PID 1
# rd.break — RESCUE: stop in the initramfs before pivoting
# systemd.unit=rescue.target — boot to rescue instead of default
ls /boot/
# vmlinuz-6.5.0-generic
# initrd.img-6.5.0-generic
# config-6.5.0-generic
# System.map-6.5.0-generic
Knowing how to edit kernel parameters at the GRUB menu (press e, edit the linux line, then Ctrl-X to boot) is a fundamental recovery skill. init=/bin/bash gets you a root shell on a broken system with no password prompt. systemd.unit=rescue.target gets you single-user mode. If you run physical servers or manage cloud VMs with console access, practice this on a throwaway VM before you need it in an emergency.
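Kernel parameters are just a space-separated string, which makes them easy to compare between a healthy boot and a broken one. A small helper, sketched against a hypothetical sample line — on a live system, pipe /proc/cmdline into it instead:

```shell
# Extract one parameter's value from a kernel command line.
# get_param NAME reads the command line on stdin and prints NAME's value.
get_param() {
  tr ' ' '\n' | sed -n "s/^$1=//p"
}

# Sample line modeled on the output shown above (UUID is made up):
cmdline="BOOT_IMAGE=/boot/vmlinuz-6.5.0-generic root=UUID=3f12 ro quiet splash"
echo "$cmdline" | get_param root
# UUID=3f12
```

The same one-liner core (tr ' ' '\n' | sort) is handy for diffing the full parameter set between two machines.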
Stage 3: The Kernel Takes Over
The bootloader jumps to the kernel's entry point. The first thing the kernel does is decompress itself (kernel images ship compressed to keep them small and fast to load). Then:
- Initialize virtual memory and page tables.
- Detect CPUs, bring secondary cores online.
- Initialize the scheduler.
- Initialize built-in drivers and the PCIe bus.
- Mount the initramfs as / (as a tmpfs in RAM).
- Execute /init from the initramfs.
From here on, everything the kernel prints goes to the kernel ring buffer — accessible after boot as dmesg.
# Full dmesg since boot
dmesg | head -30
# [ 0.000000] Linux version 6.5.0-generic (buildd@...)
# [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.0-generic ...
# [ 0.040000] Kernel supported cpus:
# [ 0.040000] Intel GenuineIntel
# ...
# With human-friendly timestamps
dmesg -T | tail -30
# Filter to errors and warnings only
dmesg -l err,warn
# Watch live as things happen (useful for hotplug debugging)
dmesg -w
A fleet of GPU nodes started failing PCIe enumeration intermittently — no GPUs visible to the OS after reboot. nvidia-smi reported "No devices were found." The failure was intermittent: some reboots cleared it, some did not. Looking at dmesg | grep -i pci showed "pcieport 0000:00:01.0: can't find device capability" at exactly the time the GPUs failed to show up. The fix was a BIOS update (a PCIe training margin issue on a specific motherboard revision). But the diagnosis took 30 minutes instead of 3 days because dmesg preserves the kernel's view of the boot.
Stage 4: initramfs — The Minimal Root
The initramfs is, for most engineers, the weirdest part of the boot. It exists for one reason: chicken and egg.
The kernel needs to mount the real root filesystem. But the real root might live:
- On an LVM volume that has not been assembled yet.
- On a LUKS-encrypted partition that needs a passphrase.
- On iSCSI, NFS, or NVMe-over-Fabrics — requiring networking and drivers.
- On Btrfs or ZFS, with subvolume selection.
- On a device whose driver is a kernel module that is not built into vmlinuz.
The kernel cannot handle all of that with built-in drivers without becoming enormous. So instead, the bootloader hands it a tiny in-RAM root (the initramfs) just capable enough to set up whatever is needed, mount the real root, and pivot to it.
# What is inside your initramfs?
sudo lsinitramfs /boot/initrd.img-$(uname -r) | head -30
# .
# bin
# init <- the script that runs first
# sbin
# usr/lib/modules/… <- selected drivers
# scripts/local <- mount local disks
# scripts/nfs <- mount NFS roots
# ...
# On modern systems, use dracut instead
lsinitrd /boot/initramfs-$(uname -r).img | less
The initramfs's /init is a shell script (or a small binary). It:
- Loads any extra kernel modules needed for the real root device.
- Assembles LVM volumes, unlocks LUKS volumes, etc.
- Mounts the real root read-only at /root (inside the initramfs).
- Calls switch_root /root /sbin/init — which throws away the initramfs and starts the real init (systemd) with the real root as /.
When the initramfs fails to find the real root, you drop into an (initramfs) prompt. This is a minimal busybox shell. ls, cat, mount, and lsblk work; almost nothing else does. Cause-of-death is usually one of: missing driver for a storage device (after a hardware change), misconfigured LVM (after a disk replacement), or LUKS passphrase failure. The fix is usually modprobe <driver>, vgchange -ay, then manually mounting root and running init.
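Those fixes are worth writing down before you need them. A recovery sketch for the (initramfs) prompt — the module, device, and volume-group names here are hypothetical placeholders, and these commands only make sense inside the initramfs shell itself, not on a running system:

```shell
# At the (initramfs) prompt -- NOT runnable on a normal, booted system.
lsblk                                  # can the kernel see the disk at all?
modprobe nvme                          # load whatever storage driver is missing
vgchange -ay                           # assemble and activate all LVM volume groups
mount /dev/mapper/vg0-root /root       # mount the real root where /init expects it
exec switch_root /root /sbin/init      # hand off to the real init and resume the boot
```

If this sequence works by hand, the permanent fix is usually to rebuild the initramfs so it includes the missing module or LVM configuration.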
Stage 5: The Switch to the Real Root
switch_root is a specific operation: the kernel unmounts the old root, mounts the new root at /, and re-executes /sbin/init. The initramfs is freed from RAM at this moment.
On every modern distro, /sbin/init is a symlink to /lib/systemd/systemd.
ls -l /sbin/init
# lrwxrwxrwx 1 root root 20 Apr 19 2024 /sbin/init -> /lib/systemd/systemd
From here on, everything is systemd.
Stage 6: systemd as PID 1
systemd takes over as PID 1 — the ancestor of every process, and the process to which the kernel reparents orphans. Its job is to bring the system to the default target, which on a server is usually multi-user.target and on a desktop is graphical.target.
A target in systemd is a named collection of services that should be running. multi-user.target pulls in networking, logging, sshd, cron, and any services you have enabled. graphical.target depends on multi-user.target and adds a display manager.
# What is the default target on this box?
systemctl get-default
# multi-user.target
# What is required to reach it?
systemctl list-dependencies multi-user.target | head -20
# The current state of every unit
systemctl list-units --state=failed
# Any failures here are why the boot is not "green"
# What is the current target?
systemctl is-active multi-user.target
# active
systemd parallelizes aggressively: it starts every unit whose dependencies are satisfied as soon as it can. This is why modern Linux boots are fast compared to SysV init's strict sequential startup. But it also means a single unit that hangs for 90 seconds (hello NetworkManager-wait-online) silently becomes your boot time.
systemd will not be fully covered until Module 4. For now the key insight is: once PID 1 is systemd, your boot time is whatever systemd says it is. Every slow boot past this point is a systemd debug exercise, and systemd has excellent tooling for it.
Stage 7: Login and Steady State
When the default target is reached, the system is "up":
- getty processes run on /dev/tty1, /dev/tty2, etc. to show console login prompts.
- sshd is listening on port 22.
- On a desktop, a display manager (gdm, sddm) shows the graphical login.
- Your services (whatever you enabled) are running.
systemd keeps running as PID 1 forever. Every process on the system is eventually a descendant of it.
# The process tree, showing how everything descends from PID 1
pstree -p | head -30
# systemd(1)─┬─ModemManager(...)─┬─{ModemManager}(...)
# ├─NetworkManager(...)
# ├─containerd(...)
# ├─sshd(...)───sshd(...)───bash(...)
# └─systemd-journald(...)
Debugging the Boot
Here is the toolkit in the order you use it.
journalctl -b — logs from this boot
# Entire boot log, from kernel through userspace
journalctl -b
# Last boot (if it failed)
journalctl -b -1
# Kernel-only messages from this boot
journalctl -b -k
# Only failures and warnings
journalctl -b -p warning..err
# Since a specific time, this boot
journalctl -b --since "5 min ago"
journalctl -b is your first stop for any boot question. It includes kernel messages (dmesg equivalent) and every userspace service log, correlated by timestamp.
systemd-analyze — what took so long?
# Total time breakdown
systemd-analyze
# Startup finished in 3.812s (kernel) + 12.435s (userspace) = 16.247s
# multi-user.target reached after 12.435s in userspace.
# Ranked list of slow units
systemd-analyze blame | head -15
# 9.104s NetworkManager-wait-online.service
# 1.234s snapd.seeded.service
# 890ms cloud-init.service
# ...
# Critical path — the chain of dependencies that determined total boot time
systemd-analyze critical-chain
# Visualize it as SVG
systemd-analyze plot > /tmp/boot.svg
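If you export boot times to monitoring, the summary line is easy to scrape. A sketch against the sample line shown above — the field layout is an assumption that may vary across systemd versions, so check your own output first:

```shell
# Pull the total boot time in seconds out of a systemd-analyze summary line.
# On a live box you would use: line=$(systemd-analyze | head -1)
line="Startup finished in 3.812s (kernel) + 12.435s (userspace) = 16.247s"
total=$(echo "$line" | grep -o '[0-9.]*s$' | tr -d 's')   # grab the final "16.247s"
echo "$total"
# 16.247
```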
dmesg — the kernel's view
Kernel-level problems (driver loads, hardware errors, OOM kills, filesystem corruption) show up in dmesg and nowhere else if you catch them early enough.
dmesg -T -l err,warn # errors and warnings with timestamps
dmesg -T | grep -i "i/o error" # I/O errors are a sign of dying disks
dmesg -T | grep -i oom # Out-of-memory kills
/proc/cmdline — what parameters did we boot with?
Already shown above. Vital when debugging "why did this boot behave differently from that one" — kernel parameters drive all kinds of behavior.
Boot in rescue mode
When the normal boot is broken and you have console access:
# At the GRUB menu, press 'e' to edit, append to the linux line, Ctrl-X to boot:
systemd.unit=rescue.target # Single-user mode, filesystems mounted
systemd.unit=emergency.target # Even more minimal, root fs read-only
init=/bin/bash # Raw bash as PID 1, no systemd at all
rd.break # Stop in the initramfs BEFORE pivoting to real root
In cloud environments (AWS, GCP, Azure), your "console" is the serial console the provider gives you. Most Linux cloud images are pre-configured with console=ttyS0 already in kernel parameters for exactly this reason. When an EC2 instance will not come back after a reboot, the serial console often shows you exactly where the boot is stuck — usually in the initramfs or early in systemd. Screenshots from the EC2 console are one of the most valuable AWS debugging features and nobody uses them.
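On physical servers, enabling the serial console is a few lines of GRUB configuration. A config fragment, assuming Debian/Ubuntu file paths (run update-grub afterwards; dnf-based distros use grub2-mkconfig -o /boot/grub2/grub.cfg instead):

```shell
# /etc/default/grub -- send the kernel console to both VGA and serial.
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"
# Make the GRUB menu itself reachable over serial too:
GRUB_TERMINAL="console serial"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
```

Listing console= twice is deliberate: kernel messages go to every console listed, and the last one becomes /dev/console, so the serial side sees the full boot.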
Things That Go Wrong (And Where They Show Up)
| Symptom | Stage | First check |
|---|---|---|
| Blank screen, no BIOS messages | Firmware | Power, RAM, motherboard — hardware |
| "No bootable device" | Firmware | Boot order, ESP missing, disk failure |
| GRUB error prompt | Bootloader | grub.cfg, disk moved or renamed |
| (initramfs) prompt | initramfs | Missing driver, LVM/LUKS assembly, root UUID |
| Kernel panic | Kernel / initramfs | dmesg (from a live USB if needed) |
| Hangs for 90s then continues | systemd | systemd-analyze blame — probably NetworkManager-wait-online |
| Login prompt but no SSH | systemd | systemctl status ssh, journalctl -u ssh -b |
| Boots but no network | systemd | networkctl, journalctl -u NetworkManager -b |
| Slow but eventually works | systemd | systemd-analyze blame |
Key Concepts Summary
- Boot is seven stages. Firmware → bootloader → kernel → initramfs → switch_root → systemd → login. Each stage has its own tools.
- BIOS vs UEFI only matters during install and recovery. Know which you have (/sys/firmware/efi).
- Kernel parameters are configuration. /proc/cmdline shows them at runtime. Edit them at the GRUB menu for recovery.
- initramfs exists to bootstrap access to the real root. LVM, LUKS, network storage, and non-builtin drivers live here.
- switch_root is the handoff. Once it runs, the initramfs is gone and systemd takes over.
- systemd makes boot parallel. Most boot time is systemd orchestrating services — and systemd-analyze blame is how you find slow ones.
- journalctl -b is the universal log viewer for the current boot. Kernel + userspace, correlated by time.
- Every stage leaves artifacts. dmesg for kernel, journalctl for systemd, /proc/cmdline for the bootloader's intent.
Common Mistakes
- Trying to debug a boot hang from SSH when SSH has not started yet. The first 10 seconds of boot are before sshd; if your node hangs there, you need console access, not remote access.
- Ignoring systemd-analyze blame because "the boot eventually finished." A 90-second wait on one unit is not a normal boot — it is a bug. systemctl mask UNIT if you do not need it.
- Rebuilding the initramfs after every change "just in case." update-initramfs -u (or dracut) is only needed when you add modules required to mount root (new storage drivers, LUKS changes, LVM layout changes). For normal system updates, apt/dnf rebuilds it automatically.
- Editing grub.cfg directly. It is regenerated by update-grub/grub-mkconfig. Edit /etc/default/grub and the files in /etc/grub.d/, then regenerate.
- Not setting up a serial console on server or cloud instances. When SSH is gone and the machine is remote, a serial console is the only way to see what is wrong.
- Assuming an unresponsive machine is "hung" when it is actually at an (initramfs) prompt waiting for input. Always check the console before rebooting.
- Confusing systemctl rescue (a runtime operation that drops running users to rescue mode) with the GRUB systemd.unit=rescue.target parameter (a boot-time choice). They look similar but are different moments in a system's life.
A server reboots and gets stuck at an `(initramfs)` prompt. You see `ALERT! UUID=3f... does not exist. Dropping to a shell!` What is the most likely cause, and which stage of boot are you in?