Loops That Do Not Break
Looping in Bash is where "it works on my machine" meets production. A loop that iterates over filenames works fine when every filename is a.txt, b.txt, c.txt. It breaks the moment someone names a file my report.txt or the data contains a newline.
This lesson covers the loop forms that stay correct under real-world input — filenames with spaces, lines ending in backslashes, CSV rows with embedded newlines. The goal is loops you never have to come back to because edge-case input broke them.
The default for x in $(cmd); do is almost always wrong. It word-splits and globs. Use while IFS= read -r or for x in glob/* instead. A few memorized idioms cover 95% of real loops.
The for loop — two flavors
Flavor 1: list iteration
for fruit in apple banana cherry; do
echo "$fruit"
done
Iterates over the literal words after in. Works well when the list is static or comes from a safe source (a glob, an array).
Flavor 2: C-style
for (( i = 0; i < 10; i++ )); do
echo "iteration $i"
done
C-syntax for counting. Useful when you need an index. No $ needed on variables inside the (( )).
Iterating over arrays
files=("a.txt" "my report.pdf" "c.log")
for f in "${files[@]}"; do
echo "$f"
done
The quoted "${arr[@]}" form is mandatory. Without quotes, each element gets word-split again.
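A quick side-by-side makes the difference concrete. The element "my report.pdf" survives the quoted form intact but splits into two words without quotes:

```shell
files=("a.txt" "my report.pdf" "c.log")

# Quoted — three iterations, element boundaries preserved
for f in "${files[@]}"; do printf '[%s]\n' "$f"; done
# [a.txt]
# [my report.pdf]
# [c.log]

# Unquoted — "my report.pdf" word-splits: four iterations
for f in ${files[@]}; do printf '[%s]\n' "$f"; done
# [a.txt]
# [my]
# [report.pdf]
# [c.log]
```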
Iterating over glob matches
for file in /var/log/*.log; do
echo "$file"
done
This is the idiomatic way to iterate files. The shell expands the glob; each match becomes one iteration. Spaces in filenames work correctly because glob expansion preserves element boundaries.
# Enable nullglob so the loop doesn't run with "*.log" literal when no matches
shopt -s nullglob
for file in /var/log/*.log; do
process "$file"
done
Without nullglob, if no .log files exist, the loop runs once with $file equal to the literal string *.log. That is almost never what you want.
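nullglob is a shell-wide setting, so flipping it on for one loop changes glob behavior everywhere after it. One sketch for keeping the change local: capture the current setting with shopt -p (which prints a restore command) and eval it back afterward. The process function here is a hypothetical stand-in for real work.

```shell
# process() is a placeholder for whatever you do per file.
process() { printf 'processing %s\n' "$1"; }

restore_nullglob=$(shopt -p nullglob)   # prints "shopt -s nullglob" or "shopt -u nullglob"
shopt -s nullglob
for file in /var/log/*.log; do
    process "$file"
done
eval "$restore_nullglob"                # put the setting back exactly as it was
```

Running the loop body inside a function or subshell with `local` semantics is another option; the save/restore pattern just keeps everything in the current shell.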
Why for x in $(cmd) is a trap
The most common broken loop:
# BROKEN for filenames with spaces or newlines
for file in $(ls /tmp); do
rm "$file"
done
What breaks:
- $(ls) word-splits on every IFS character (space, tab, newline).
- A filename like my report.txt splits into two "files": my and report.txt.
- A filename containing a * gets globbed against the current directory.
The fixes
Fix 1: use a glob directly.
for file in /tmp/*; do
rm "$file"
done
Fix 2: use while read with null-delimited input for maximum safety.
while IFS= read -r -d '' file; do
rm "$file"
done < <(find /tmp -maxdepth 1 -print0)
Fix 3: read into an array with readarray/mapfile.
mapfile -t files < <(ls /tmp)
for file in "${files[@]}"; do
rm "/tmp/$file"
done
The glob fix is almost always the simplest. Note that fix 3 still parses ls output: it handles spaces but breaks on filenames containing newlines. Reach for find + null-delimited read when you need recursive traversal, filtering, or full robustness.
The while read loop — the workhorse
For reading input line by line:
while IFS= read -r line; do
echo "got: $line"
done < input.txt
Three pieces that matter:
1. IFS= (empty)
Normally, read strips leading/trailing whitespace. IFS= on the read command line disables that, preserving the line exactly.
# Without IFS=
line=" hello "
echo "$line" | while read line; do echo "[$line]"; done
# [hello] <- leading and trailing spaces gone
# With IFS=
echo "$line" | while IFS= read line; do echo "[$line]"; done
# [ hello ] <- preserved
2. -r (no backslash interpretation)
Without -r, read interprets \ as an escape character. This breaks lines that legitimately contain backslashes (Windows-style paths, regex patterns, etc.).
# Without -r
echo 'hello\nworld' | while read line; do echo "[$line]"; done
# [hellonworld] <- backslash interpreted as escape, "\n" became "n"
# With -r
echo 'hello\nworld' | while read -r line; do echo "[$line]"; done
# [hello\nworld] <- backslash preserved as literal
Always use -r. There is no scenario where you want the backslash interpretation.
3. done < input.txt (not piping)
If you pipe into the while loop (cat file | while ...), the loop runs in a subshell and variable changes don't escape. Redirecting (done < file) runs the loop in the current shell.
See the earlier lesson on subshells for the details.
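A minimal demonstration of the difference, counting lines into a variable (the file path here is just a throwaway demo file):

```shell
printf 'a\nb\nc\n' > /tmp/demo.txt

count=0
cat /tmp/demo.txt | while IFS= read -r line; do count=$((count + 1)); done
echo "piped:      $count"    # 0 — the loop ran in a subshell, the increment vanished

count=0
while IFS= read -r line; do count=$((count + 1)); done < /tmp/demo.txt
echo "redirected: $count"    # 3 — same shell, counter survives
```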
while IFS= read -r line is the canonical line-reading idiom. Memorize it. Anything missing -r or IFS= is subtly wrong for real-world input.
Reading into multiple fields
read can split a line into multiple variables:
echo "alice 30 engineer" | while read -r name age role; do
echo "name=$name age=$age role=$role"
done
# name=alice age=30 role=engineer
The last variable gets the remainder of the line (any extra fields).
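That remainder behavior is worth seeing once. With more fields on the line than variables, the final variable absorbs everything left, spaces included:

```shell
echo "alice 30 senior staff engineer" | while read -r name age role; do
    printf 'name=%s age=%s role=%s\n' "$name" "$age" "$role"
done
# name=alice age=30 role=senior staff engineer
```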
With a delimiter
while IFS=: read -r user _ uid _ _ _ shell; do
echo "$user uses $shell"
done < /etc/passwd
Note that IFS=: goes on the read command itself, not before while. It sets IFS to : for that single read, which splits each line on : and assigns the fields to named variables. _ is a convention for "ignore."
Reading CSV (approximately)
# Simple case — no embedded commas or quotes
while IFS=',' read -r col1 col2 col3; do
echo "$col1 / $col2 / $col3"
done < data.csv
For real CSV (with embedded commas, quotes, multi-line fields) — don't use Bash. Use csvkit, python, or a real CSV tool. Bash's read cannot parse CSV correctly.
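One way to hand the parsing off while keeping the loop in Bash: let Python's csv module do the splitting and emit tab-separated fields for the shell to read. This is a sketch that assumes python3 is on PATH and that fields contain no tabs or newlines; fields with those still need a pure-Python (or csvkit) pipeline.

```shell
# Python parses the quoting; the shell only splits on tabs.
while IFS=$'\t' read -r col1 col2 col3; do
    echo "$col1 / $col2 / $col3"
done < <(python3 -c '
import csv, sys
for row in csv.reader(sys.stdin):
    print("\t".join(row))
' < data.csv)
```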
Loops over command output — the right pattern
When you need to iterate over command output (not just files), use while read with process substitution:
# Process users from a command
while IFS= read -r user; do
echo "processing $user"
done < <(getent passwd | awk -F: '{print $1}')
This avoids both:
- The for x in $(cmd) word-splitting problem.
- The cmd | while subshell-variable-loss problem.
Null-delimited loops — the bulletproof version
For maximum safety against unusual filenames (newlines in names, any special characters), use null-delimited input:
# find: -print0 emits null-separated output
# read: -d '' reads until null byte
while IFS= read -r -d '' file; do
echo "processing: $file"
done < <(find /path -type f -print0)
This handles any filename except ones with literal \0 in them — which is impossible on Unix filesystems.
When to reach for this:
- Untrusted input where filenames might contain newlines.
- Scripts that absolutely must not fail on edge cases.
- High-integrity operations (rm, chmod) where a wrong filename is catastrophic.
For casual scripts, glob patterns are usually enough.
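When the loop body is a single command, you can often skip the loop entirely: find emits null-separated names and xargs -0 batches them onto the command. (The -r flag, which skips the run when there are no matches, is a GNU xargs extension.)

```shell
# One pipeline instead of a loop — handles any filename, batches into
# as few rm invocations as possible.
find /path -type f -name '*.tmp' -print0 | xargs -0 -r rm -f
```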
The continue and break keywords
Both work as expected:
for file in /var/log/*.log; do
[[ ! -r "$file" ]] && continue # skip unreadable
if [[ "$(head -1 "$file")" == "DONE" ]]; then
break # stop after finding a completed one
fi
process "$file"
done
With nested loops, continue N / break N jumps N levels:
for a in 1 2 3; do
for b in x y z; do
if [[ "$b" == "y" ]]; then
continue 2 # continue the OUTER loop
fi
echo "$a $b"
done
done
# 1 x
# 2 x
# 3 x
Rarely needed, but useful when you need it.
The until loop
Like while, but loops until the condition is true:
until ping -c1 -W1 server >/dev/null 2>&1; do
echo "server not ready, waiting..."
sleep 5
done
echo "server is up"
Equivalent to while ! cond; do — use whichever reads better.
Reading a file into an array — mapfile / readarray
Bash 4+ has mapfile (alias readarray) for reading lines into an array:
mapfile -t lines < file.txt
echo "${#lines[@]} lines"
echo "${lines[0]}" # first line
echo "${lines[-1]}" # last line
-t strips the trailing newline from each line. Without -t, each array element includes its \n. Negative indices like ${lines[-1]} require Bash 4.3 or newer.
# With a filter callback
mapfile -t logs < <(grep ERROR app.log)
mapfile is faster than a manual while read loop for large files.
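On Bash 4.4+, mapfile also takes -d to change the delimiter, which pairs with find -print0 for a one-line, newline-safe slurp of filenames into an array:

```shell
# -d '' splits on the null byte; -t strips the delimiter from each element.
mapfile -d '' -t files < <(find /var/log -maxdepth 1 -name '*.log' -print0)
echo "${#files[@]} files"
```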
Progress reporting in long loops
total=$(find /data -type f | wc -l)
i=0
while IFS= read -r -d '' file; do
i=$((i + 1))
if (( i % 100 == 0 )); then
printf '\r%d / %d (%.0f%%)' "$i" "$total" "$(bc -l <<< "$i/$total*100")"
fi
process "$file"
done < <(find /data -type f -print0)
echo
\r returns to the start of the line so you overwrite instead of scrolling. Update every N iterations, not every iteration — otherwise you're paying printf cost per item.
Loop performance tips
1. Minimize forks
# SLOW — forks a process every iteration
for f in *.txt; do
length=$(wc -l "$f" | awk '{print $1}')
echo "$f: $length"
done
# FASTER — read once, process in Bash where possible
for f in *.txt; do
mapfile -t lines < "$f"
echo "$f: ${#lines[@]}"
done
Inside a hot loop, every subshell or command substitution is a fork. Cut them where you can.
2. Use mapfile for whole-file reads
# SLOW
while IFS= read -r line; do
lines+=("$line")
done < file.txt
# FASTER
mapfile -t lines < file.txt
3. Batch external calls
# SLOW — invokes grep per file
for f in *.log; do
grep ERROR "$f"
done
# FAST — one grep invocation over all files
grep ERROR *.log
For real CPU-bound work, Bash is the wrong tool. But for I/O-bound loops, the above tricks help.
When a Bash loop feels too slow, the question is usually not "how can I speed up the loop" but "why is there a subprocess inside this loop?" Remove the fork; the loop speeds up 10-100x.
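A common concrete case: a $(basename ...) inside the loop is a fork per iteration, and parameter expansion does the same job in-process.

```shell
# SLOW — one basename process per file
for f in /var/log/*.log; do
    name=$(basename "$f")        # fork + exec every iteration
    echo "$name"
done

# FAST — pure Bash, no fork
for f in /var/log/*.log; do
    name="${f##*/}"              # strip everything up to the last /
    echo "$name"
done
```

The same trick covers dirname ("${f%/*}") and extension stripping ("${f%.log}").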
A worked example — processing a log directory
A script that rotates old logs:
#!/bin/bash
set -euo pipefail
shopt -s nullglob
log_dir="${1:-/var/log/myapp}"
retention_days=30
archive_dir="$log_dir/archive"
mkdir -p "$archive_dir"
# Find .log files older than retention_days
found=0
while IFS= read -r -d '' file; do
base="$(basename "$file")"
dest="$archive_dir/$base.$(date +%Y%m%d).gz"
gzip -c "$file" > "$dest"
rm -f "$file"
found=$((found + 1))
done < <(find "$log_dir" -maxdepth 1 -name "*.log" -mtime +$retention_days -print0)
echo "archived $found log files"
Every idiom from this lesson: nullglob, while IFS= read -r -d '', null-delimited find, safe quoting everywhere, counter increments in the parent shell (no pipeline subshell), error-proof with set -euo pipefail.
Quiz
You want to iterate over every .txt file in the current directory, including ones with spaces in the name. Which loop is safest?
What to take away
- for file in glob/* is the safe, idiomatic file-iteration loop. Add shopt -s nullglob for zero-match safety.
- while IFS= read -r line; do ...; done < file is the canonical line-reading loop.
- Never for x in $(cmd). Never cmd | while. Both have quiet bugs.
- For maximum safety against exotic filenames, use null-delimited: while IFS= read -r -d '' f; do ...; done < <(find ... -print0).
- mapfile -t arr < file is the fastest way to slurp lines into an array.
- for (( i = 0; i < N; i++ )) for C-style counting loops.
- Remove subprocess forks from hot loops — they dominate performance.
Next lesson: functions — local scope, return codes vs stdout, and the pattern for "returning" strings from a function.