Linux Fundamentals for Engineers

Logs with journalctl

An outage hits at 04:17 UTC. A developer ssh's into the affected node and types tail /var/log/messages. Nothing. tail /var/log/syslog. Nothing. ls /var/log/. Mostly empty — a couple rotated files from weeks ago. Someone on Slack says "try journalctl." The developer types journalctl | tail. Thousands of lines stream out of nowhere. They try journalctl -u myapp — and there is the entire service history, cleanly per-unit, with structured fields, timestamps, and priority levels.

The "where are my logs" problem is not that modern Linux lacks logs — it is that the logs moved. systemd's journal stores them in a binary format in /var/log/journal/ or /run/log/journal/, not plain text. You query them with journalctl, not tail. Once you know the query vocabulary, the journal is strictly better than scattered text files — it is indexed, structured, per-service, time-windowed, and stable across reboots. This lesson is how to use it.


What the Journal Actually Is

The systemd journal is a binary, indexed, append-only log store maintained by the systemd-journald daemon (PID 1's little sibling). Every log message on a modern Linux system flows through it:

  • Kernel messages (the stuff dmesg prints).
  • stdout and stderr of every systemd-managed service.
  • Anything that calls syslog() from an application.
  • Anything that writes to /dev/log (the legacy syslog socket).
  • Anything sent to /run/systemd/journal/socket.
  • Structured logs submitted via sd_journal_send().

Each log entry is a record with fields: the message, the service, the PID, the UID, the timestamp (both wallclock and monotonic), the priority, the hostname, and any custom fields the sender added.

# See one entry in full, with every field
journalctl -n 1 -o verbose
# Fri 2026-04-19 10:00:00.123456 UTC [s=...;i=deadbeef;b=...;m=...;t=...;x=...]
#     _TRANSPORT=stdout
#     _UID=1000
#     _GID=1000
#     _PID=12345
#     _COMM=myapp
#     _EXE=/opt/myapp/bin/server
#     _CMDLINE=/opt/myapp/bin/server --config /etc/myapp.yaml
#     _SYSTEMD_UNIT=myapp.service
#     _SYSTEMD_CGROUP=/system.slice/myapp.service
#     PRIORITY=6
#     MESSAGE=ready to accept connections on :8080
#     _HOSTNAME=web-01
#     ...

Underscore-prefixed fields are trusted — the kernel or journald set them, the sender cannot forge them. Fields without underscores were supplied by the sender.

KEY CONCEPT

The journal's killer feature is structured data. Every log entry is a set of fields, not just a string. You can query for "every entry from this unit with priority ≥ error since 10 minutes ago" in one command, and the query runs against an index — not a linear scan of a text file. No more grep ERROR /var/log/* | grep 'Apr 19' rituals.


The Query Language

journalctl takes filters as command-line arguments. Filters on the same field combine with OR; filters on different fields combine with AND.

  • Multiple -u flags: any of the listed units.
  • Multiple _PID= field filters: any of the listed PIDs.
  • Combining -u and --since: both must hold.
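The same semantics in miniature, over fake entries (the data and the awk pipeline are purely illustrative — journalctl evaluates the equivalent match expression against its index, not with text tools):

```shell
# Toy model of journalctl match semantics over fake entries (UNIT PID MESSAGE).
entries='nginx 100 started
postgresql 200 ready
myapp 300 crashed'
# "UNIT is nginx OR postgresql" (same field) AND "PID is 200" (different field):
printf '%s\n' "$entries" | awk '($1 == "nginx" || $1 == "postgresql") && $2 == "200"'
# -> postgresql 200 ready
```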

The most used flags:

# By unit (service)
journalctl -u nginx                       # all logs for nginx.service
journalctl -u nginx -u postgresql          # either one

# By time window
journalctl --since "2 hours ago"
journalctl --since "2026-04-19 10:00" --until "2026-04-19 11:00"
journalctl --since today
journalctl --since yesterday --until today

# By priority (syslog levels: 0=emerg, 3=err, 4=warning, 6=info, 7=debug)
journalctl -p err                          # err and above (err, crit, alert, emerg)
journalctl -p warning                      # warning, err, crit, alert, emerg

# By boot — this boot, the previous boot, etc.
journalctl -b                              # current boot only
journalctl -b -1                           # previous boot
journalctl --list-boots                    # enumerate all stored boots

# Kernel only
journalctl -k                              # like dmesg but with timestamps + full text
journalctl -k -b                           # kernel, this boot

# Follow (like tail -f)
journalctl -f
journalctl -u myapp -f

# Specific executable / user / cgroup
journalctl _COMM=sshd
journalctl _UID=1000
journalctl _SYSTEMD_UNIT=myapp.service

# By PID
journalctl _PID=12345

# Show explained priority, output format options
journalctl -o short-iso                    # ISO-8601 timestamps
journalctl -o json                         # structured output
journalctl -o json-pretty | jq '.MESSAGE'  # feed into jq
journalctl -o cat                          # just the message, no prefix

Some combinations are everyday work:

# What went wrong in the last hour for this service?
journalctl -u myapp --since "1 hour ago" -p err

# Live-tail just the message text
journalctl -u myapp -f -o cat

# The last 100 entries across kubernetes-related units
journalctl -u 'kube*' -u 'containerd*' -n 100

# Any segfaults across the system this week?
journalctl --since "1 week ago" | grep -i segfault

# Everything sshd did, ever
journalctl _COMM=sshd --no-pager
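JSON output also pipes cleanly into small aggregations. A sketch: count error entries per unit. The three sample lines below stand in for the output of `journalctl --since today -p err -o json`; point the same pipeline at the real command on a live box.

```shell
# Sample lines standing in for: journalctl --since today -p err -o json
sample='{"_SYSTEMD_UNIT":"myapp.service","PRIORITY":"3","MESSAGE":"boom"}
{"_SYSTEMD_UNIT":"myapp.service","PRIORITY":"3","MESSAGE":"boom again"}
{"_SYSTEMD_UNIT":"nginx.service","PRIORITY":"3","MESSAGE":"upstream timed out"}'
# Tally entries per unit, most frequent first
printf '%s\n' "$sample" | python3 -c '
import collections, json, sys
counts = collections.Counter(
    json.loads(line).get("_SYSTEMD_UNIT", "?") for line in sys.stdin)
for unit, n in counts.most_common():
    print(n, unit)'
# -> 2 myapp.service
#    1 nginx.service
```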

PRO TIP

journalctl -p err is your go-to for "what is actually broken" — it filters out the noise of info-level chatter and gives you only errors and above. Pair it with -u UNIT to scope to a specific service, and --since to scope to a window. Three flags, 90% of your debugging.


Useful Discovery Commands

# What units have produced logs today?
# (-F lists field values journal-wide and ignores filters, so use JSON output)
journalctl --since today -o json | jq -r '._SYSTEMD_UNIT // empty' | sort -u

# What priorities have appeared in the last hour?
journalctl --since "1 hour ago" -o json | jq -r '.PRIORITY' | sort -u

# Disk usage of the journal
journalctl --disk-usage
# Archived and active journals take up 482.1M on disk.

# Show fields available in the journal (useful for writing filters)
journalctl -N | head
# __CURSOR, __REALTIME_TIMESTAMP, __MONOTONIC_TIMESTAMP, _BOOT_ID,
# _MACHINE_ID, _HOSTNAME, _UID, _GID, _CAP_EFFECTIVE, _SOURCE_REALTIME_TIMESTAMP,
# _PID, _COMM, _EXE, _CMDLINE, _SYSTEMD_CGROUP, _SYSTEMD_UNIT, MESSAGE, PRIORITY, ...



How Services Produce Logs

Any systemd-managed service has three built-in ways to produce journal entries:

  1. Write to stdout/stderr. StandardOutput=journal is the default — every line becomes a journal entry tagged with the unit.
  2. Call syslog. Any openlog() / syslog() / logger invocation ends up in the journal.
  3. Use the native journal protocol. sd_journal_send() sends structured fields — message, priority, and any custom fields.

For a service written today, option 1 is almost always the right choice. Write to stdout; systemd captures it; you query with journalctl -u. No logging library required.

# Verify a service's output is going to the journal
systemctl show myapp.service -p StandardOutput,StandardError
# StandardOutput=journal
# StandardError=journal
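One refinement to option 1: journald understands sd-daemon-style `<N>` priority prefixes on a captured stream (on by default via the unit option SyslogLevelPrefix=), so a plain print can still set per-line priority. The messages below are illustrative:

```shell
# Under systemd capture, a leading <N> sets the entry's PRIORITY field
# and is stripped from MESSAGE:
echo "<3>backend unreachable"    # stored at priority 3 (err)
echo "<6>backend reconnected"    # stored at priority 6 (info)
# Afterwards, journalctl -u myapp -p err finds only the first line
```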

The structured form is great when you have the tooling:

# Python: systemd.journal.send
from systemd import journal
journal.send("login failed",
             PRIORITY=3,         # err
             USER="alice",
             IP="10.0.0.5",
             ATTEMPTS=5)

Then query by those fields:

journalctl USER=alice PRIORITY=3

PRO TIP

Structured fields beat string parsing every time. A log line like "login failed user=alice ip=10.0.0.5 attempts=5" requires a regex or awk to filter. The same entry with USER=alice, IP=10.0.0.5, ATTEMPTS=5 as journal fields is a one-liner query with zero text parsing. When you have the choice of what to log and how, emit fields.


Persistence: Where the Logs Live

By default journald runs with Storage=auto: it persists to /var/log/journal/ if that directory exists, and otherwise writes to /run/log/journal/ — a tmpfs, wiped on every reboot. Debugging a machine that crashed is hopeless if the crash logs evaporated with it.

To persist the journal across reboots, create the real directory:

sudo mkdir -p /var/log/journal
sudo systemd-tmpfiles --create --prefix /var/log/journal

# Then restart journald so it picks up the new location
sudo systemctl restart systemd-journald

# Verify persistence is on
ls /var/log/journal/
# Some-machine-id/   <- your logs live in here
journalctl --disk-usage
# Archived and active journals take up 482.1M on disk.

Most modern distros have persistent journals enabled by default. A couple still do not (minimal images, containers, some cloud AMIs). If /var/log/journal/ does not exist, you are volatile: journalctl still works, but against /run/log/journal/, and everything disappears at reboot.
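A quick scripted check — a sketch that only inspects the directory, not journald's Storage= setting:

```shell
# Persistent journal exists iff /var/log/journal/ is present.
if [ -d /var/log/journal ]; then
  echo "persistent: /var/log/journal exists"
else
  echo "volatile: journal lives in /run/log/journal and dies at reboot"
fi
```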

Controlling size

In /etc/systemd/journald.conf (or a drop-in under /etc/systemd/journald.conf.d/):

[Journal]
Storage=persistent
SystemMaxUse=2G           # total disk the journal may use
SystemKeepFree=500M       # keep this much free on the filesystem
SystemMaxFileSize=100M    # each journal file up to this size
MaxRetentionSec=2week     # discard entries older than this
Compress=yes
ForwardToSyslog=no        # set to yes if you ALSO want to feed rsyslog/syslog-ng
ForwardToKMsg=no

Then:

sudo systemctl restart systemd-journald

# Rotate and vacuum right now
sudo journalctl --rotate
sudo journalctl --vacuum-size=1G      # keep the newest 1 GB
sudo journalctl --vacuum-time=7d      # discard older than 7 days
sudo journalctl --vacuum-files=10     # keep the 10 newest files

When to Reach for rsyslog or an External Collector

journald is excellent for local querying. It is less great for:

  • Long-term retention. However generous MaxRetentionSec= is, retention is capped by what the local disk can hold.
  • Cross-host aggregation. Every node has its own journal.
  • Complex routing. Sending certain logs to certain destinations with rules.
  • Legacy tools. Monitoring agents that expect /var/log/*.log files.

The common pattern on production fleets:

  1. journald collects locally — fast, structured, available via journalctl for on-node debugging.
  2. Either rsyslog (with imjournal) or a dedicated shipper (vector, fluent-bit, promtail, filebeat) reads from the journal and ships to a central aggregator (Loki, Elasticsearch, Splunk, Datadog, CloudWatch).

To pull from the journal into rsyslog, enable the module:

# /etc/rsyslog.d/50-journal.conf
module(load="imjournal")

Or read via the stateful journalctl cursor:

# Resume from wherever we left off last time
journalctl --cursor-file=/var/run/mytool.cursor -f --output=json

For cloud deployments, tools like vector or fluent-bit with the systemd input can read directly from the journal database without re-parsing text.
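For example, a minimal vector source fragment — a sketch from vector's TOML config format (names are illustrative; check the vector documentation for your version):

```toml
# vector.toml (sketch) -- read the journal directly; sinks omitted
[sources.journald_in]
type = "journald"
```

Pair the source with whatever sink your aggregator needs (Loki, Elasticsearch, etc.).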

WAR STORY

A team using Kubernetes on EC2 had intermittent node problems. Pods would go Unknown for 30 seconds, then return. Application logs in Datadog showed nothing — but journalctl -k on an affected node revealed hundreds of kernel "Call Trace" oopses correlating with each Unknown event. The node-level shipper was only tailing /var/log/containers/*.log, missing journald entirely. One configuration change (adding journald as a vector input) surfaced kernel messages into the central log system, and the root cause — a driver bug — was visible within a day. Journald is not just app logs; it is the one place kernel messages are persisted on a modern system.


Debugging Flow: Missing Logs

When you cannot find logs, work through this list:

# 1. Is journald running?
systemctl status systemd-journald

# 2. Are logs being received? (rate per second)
journalctl -f | pv -l -i 5 -r >/dev/null

# 3. Are they persistent?
journalctl --disk-usage
# If this reports /run/, logs are wiped at reboot

# 4. Is my service's output going to the journal?
systemctl show myapp.service -p StandardOutput,StandardError

# 5. Any rate limiting? journald logs its own suppression notices
journalctl -u systemd-journald | grep -i suppressed
# And check the configured limits:
grep -E 'RateLimit' /etc/systemd/journald.conf

# 6. Am I the right user to read them?
groups
# Need to be in systemd-journal or adm group to read /var/log/journal/* without sudo

# 7. Are logs going to rsyslog but not the journal?
grep -E '^(:|include|input|module)' /etc/rsyslog.conf /etc/rsyslog.d/*.conf
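Check 6 can be scripted — a sketch (the privileged group names vary slightly by distro; systemd-journal and adm are the common ones):

```shell
# Can this user read the system journal without sudo?
if id -nG | tr ' ' '\n' | grep -qxE 'systemd-journal|adm|root'; then
  echo "journal readable"
else
  echo "need sudo or membership in systemd-journal"
fi
```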

Rate-limiting surprises

journald rate-limits by default to prevent log floods: RateLimitBurst=10000 and RateLimitIntervalSec=30s. If a service bursts over the limit, you will see lines like:

Suppressed 1234 messages from unit myapp.service

For a service that legitimately logs fast (web server access logs), raise its per-unit limit with LogRateLimitIntervalSec= and LogRateLimitBurst= in the unit file, or use a separate log pipeline.
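A drop-in sketch (path and values are illustrative; LogRateLimitIntervalSec= and LogRateLimitBurst= require systemd 240 or newer):

```ini
# /etc/systemd/system/myapp.service.d/ratelimit.conf
[Service]
LogRateLimitIntervalSec=30s
LogRateLimitBurst=100000
```

Apply with systemctl daemon-reload followed by a restart of the service.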


Useful Output Formats

# Default — looks like legacy syslog
journalctl -u myapp
# Apr 19 10:00:01 web-01 myapp[12345]: ready

# Explicit RFC-3339 / ISO 8601 timestamps
journalctl -u myapp -o short-iso
# 2026-04-19T10:00:01+0000 web-01 myapp[12345]: ready

# JSON — pipe to jq
journalctl -u myapp -o json | jq -r '[.__REALTIME_TIMESTAMP, .PRIORITY, .MESSAGE] | @tsv'

# Verbose (every field) — essential for debugging why filters do not match
journalctl -u myapp -o verbose

# Export for archival
journalctl -u myapp -o export > myapp.journal.export

# Just the message body
journalctl -u myapp -o cat

Key Concepts Summary

  • The journal is binary, indexed, structured. Every message is a set of fields, not a line of text. You query with journalctl, not grep.
  • journald captures everything. Kernel messages, service stdout/stderr, syslog calls, native structured logs — all in one place.
  • -u UNIT + --since + -p err is the production triage combo. Narrows millions of lines to the relevant few in one command.
  • -b filters to a specific boot. -b 0 is the current boot; -b -1 is the previous one.
  • -f is tail -f. With -u, you live-tail one service.
  • Field filters (_COMM=, _PID=, custom fields) use the index. Way faster than text grep.
  • Persistence is not automatic on every distro. Check with journalctl --disk-usage; create /var/log/journal/ to make it persistent.
  • Size is configurable. SystemMaxUse, MaxRetentionSec, SystemMaxFileSize in journald.conf.
  • External aggregation is usually still required. journald is the local layer; a shipper (vector, fluent-bit, rsyslog+imjournal) feeds central systems.
  • Rate limits exist. Busy services can be suppressed; raise limits or use a separate pipeline for high-rate logs.

Common Mistakes

  • Using tail /var/log/syslog on a modern distro. Many distros do not ship rsyslog at all — the only log is the journal.
  • Restarting a machine to "clear the logs." If persistence is on, the logs stay. If it is off (tmpfs), you just lost historical evidence of why the machine misbehaved.
  • Ignoring the --since flag and scrolling through hours of output. --since "10 min ago" gets you to the incident window in one shot.
  • Not enabling persistence on ephemeral cloud instances. The moment the node reboots, every crash clue is gone.
  • Running journalctl -u myapp | tail -1000 instead of journalctl -u myapp -n 1000. The latter uses the journal's index and is much faster.
  • Assuming every service's output goes to the journal. If a service writes its own log file directly (a logging library pointed at /var/log/app.log, or a shell wrapper redirecting stdout), the journal has nothing.
  • Forgetting user journals exist: journalctl --user for services run under systemctl --user. Completely separate from system logs.
  • Rotating logs from a script with rm /var/log/journal/*. Use journalctl --vacuum-* — it is designed for it and knows which files are still in use.
  • Over-persisting. A node that keeps 50 GB of journal gets slow to query. Set sensible retention.
  • Treating journalctl output like grep-able text. Use field filters instead — they are indexed and do not depend on message formatting staying stable.

KNOWLEDGE CHECK

You ssh into a server that rebooted 30 minutes ago during an incident. You run `journalctl --list-boots` and see only the current boot (-0 / today) — no previous boots. You need to know why the machine crashed. What happened and what is the fix going forward?