Bash & Shell Scripting for Engineers

Word Splitting and Why Quotes Matter

Every Bash bug you have ever debugged at 2am can be traced to one of two sources: forgetting to quote a variable, or thinking a quoted variable does something different than it actually does. Word splitting is the mechanism that turns both of those into production outages.

This lesson is about building the correct mental model for word splitting so that quoting stops feeling arbitrary. Once you understand what Bash does between typing a command and executing it, quoting becomes obvious.

KEY CONCEPT

Bash does not "run your command". It takes your command line, applies a sequence of transformations (parameter expansion, word splitting, globbing, etc.), and then runs whatever words are left. Quoting controls which transformations apply to which parts.


The difference that nobody explains clearly

Consider:

filename="report 2026.pdf"

# Version 1
rm $filename

# Version 2
rm "$filename"

These look like the same command. They are not. Here is what Bash actually does:

Version 1 (unquoted $filename):
  1. Expand $filename      -> rm report 2026.pdf
  2. Word-split on spaces  -> rm "report" "2026.pdf"
  3. Pathname expansion    -> rm "report" "2026.pdf"
  4. Execute               -> rm report 2026.pdf
  
  Result: rm called with TWO filename arguments.
  Error: neither file is found (probably). Worse: if a file named "report" does exist, it is deleted.

Version 2 (quoted "$filename"):
  1. Expand "$filename"    -> rm "report 2026.pdf"
  2. Word-split            -> skipped (it's already one word)
  3. Pathname expansion    -> skipped (it's quoted)
  4. Execute               -> rm "report 2026.pdf"
  
  Result: rm called with ONE filename argument.
  Success.

The quotes are not cosmetic. They switch off two of the transformations Bash would otherwise apply. That is the whole point.
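The two traces above can be checked without touching any files. The sketch below counts arguments instead of calling rm. Note that count_args is a hypothetical helper introduced here for illustration; it is not part of the lesson's code.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

filename="report 2026.pdf"

unquoted_count=$(count_args $filename)    # split into "report" and "2026.pdf"
quoted_count=$(count_args "$filename")    # stays one word

echo "unquoted: $unquoted_count argument(s)"   # unquoted: 2 argument(s)
echo "quoted:   $quoted_count argument(s)"     # quoted:   1 argument(s)
```

Counting arguments is a safe stand-in for any destructive command: the command sees exactly the words that count_args sees.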


The expansion pipeline

Every command you type runs through this pipeline before execution:

YOUR COMMAND LINE
  rm $filename

STEP 1: EXPANSION
  Parameter expansion, command substitution, arithmetic: $var becomes its value.

STEP 2: WORD SPLITTING
  UNQUOTED results are split on $IFS (space, tab, newline by default). Quoted results are not.

STEP 3: PATHNAME EXPANSION (GLOBBING)
  Words with *, ?, [abc] are matched against the filesystem. Quoted words skip this too.

STEP 4: EXECUTE
  Whatever words survive are passed as ARGV to the command. That is what runs.

Quoting a variable ("$var") skips steps 2 and 3 for that variable. That is the entire rule.
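One way to see that steps 2 and 3 are separate stages is to switch off globbing alone and watch splitting still happen. The sketch below is an illustration under assumptions: count_args is a hypothetical helper, the scratch directory and file names are made up, and set -f (the shell option that disables pathname expansion) is used only to isolate step 3.

```shell
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

orig_dir=$PWD
workdir=$(mktemp -d)          # scratch directory so the glob result is predictable
cd "$workdir" || exit 1
touch a.txt b.txt

value="*.txt extra"

split_and_glob=$(count_args $value)   # split: "*.txt" "extra", then glob: a.txt b.txt extra
set -f                                # disable pathname expansion (step 3) only
split_only=$(count_args $value)       # splitting still happens: "*.txt" "extra"
set +f
neither=$(count_args "$value")        # quoted: one word, no split, no glob

echo "$split_and_glob $split_only $neither"   # 3 2 1

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

The same value yields three, two, or one argument depending on which stages are allowed to run.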


What IFS actually is

IFS stands for Internal Field Separator. It is a shell variable (not normally exported to the environment) that tells Bash which characters to split on during word splitting.

The default value contains three characters: space, tab, newline.

# See the current IFS (weird because the separators are invisible)
printf '%q\n' "$IFS"
# Output: $' \t\n'

When Bash word-splits an unquoted expansion, it breaks the result on runs of those three characters.

text=$'a  b\tc\nd'       # $'...' turns \t and \n into a real tab and newline
for word in $text; do    # unquoted — word splitting happens
  echo "[$word]"
done
# Output:
# [a]
# [b]
# [c]
# [d]

With "$text" (quoted), no splitting happens — the whole string stays as one "word."

Changing IFS to split on different characters

Occasionally you want to split. Setting IFS to a custom value lets you do it deliberately:

csv="apple,banana,cherry"

IFS=',' read -ra fruits <<< "$csv"
echo "${fruits[0]}"   # apple
echo "${fruits[1]}"   # banana
echo "${fruits[2]}"   # cherry

This is the idiomatic pattern for splitting a simple comma-separated string; it does not handle quoted fields that themselves contain commas. We cover read -ra properly in a later lesson. The key idea here is that IFS controls splitting.

PRO TIP

Change IFS for a single command by prefixing it: IFS=',' read -ra arr <<< "$str". That sets IFS only for that command, not for the rest of the script.
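The claim that the prefix form does not leak can be checked directly. This is a minimal sketch that requires Bash (for read -a and <<<); the sparse variable and its sample value are assumptions added for illustration.

```shell
csv="apple,banana,cherry"

IFS=',' read -ra fruits <<< "$csv"      # IFS is "," only while read runs

echo "fields: ${#fruits[@]}"            # fields: 3
printf 'IFS afterwards: %q\n' "$IFS"    # IFS afterwards: $' \t\n'

# With a non-whitespace separator, an empty field between two commas is kept.
IFS=',' read -ra sparse <<< "a,,c"
echo "sparse fields: ${#sparse[@]}"     # sparse fields: 3
```

Because IFS is back to its default after each read, later unquoted expansions in the script are not affected by the comma.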


Pathname expansion (globbing) — the other silent transformation

The second transformation that quoting disables:

pattern="*.txt"

echo $pattern    # Lists every .txt file in the directory (or prints *.txt itself if none match)
echo "$pattern"  # Prints the literal string "*.txt"

When Bash sees an unquoted word with *, ?, or [abc] after word splitting, it matches that word against the filesystem. Each matching filename becomes a separate argument.

This is the thing most likely to cause a silent production bug. Consider:

# A user sets their name to "John *"  (yes, some people do this)
name="John *"
echo "Welcome $name!"    # Quoted — prints "Welcome John *!"
echo "Welcome" $name!    # Unquoted — expands * to every file in cwd

Now imagine $name came from a database query, and the script was running as root in /. You have just echoed every top-level file name in the filesystem into your log.

WARNING

Globbing runs by default on every unquoted expansion. This is a common source of Bash security bugs. User-controlled strings must always be quoted.
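The leak is easy to reproduce in a sandbox. In the sketch below, the scratch directory and the two file names are made up for illustration, and name stands in for untrusted input.

```shell
orig_dir=$PWD
workdir=$(mktemp -d)                # scratch directory standing in for the script's cwd
cd "$workdir" || exit 1
touch notes.txt secret.key

name="John *"                       # stands in for untrusted input

unquoted=$(echo "Welcome" $name)    # * is replaced by the file names in cwd
quoted=$(echo "Welcome $name")      # * stays a literal character

echo "unquoted: $unquoted"          # unquoted: Welcome John notes.txt secret.key
echo "quoted:   $quoted"            # quoted:   Welcome John *

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

Nothing in the unquoted line mentions secret.key, yet its name ends up in the output.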


Empty-string vs unset — another consequence

Unquoted variables also behave differently when empty or unset:

empty=""

cmd $empty arg      # Runs: cmd arg  (empty word dropped entirely)
cmd "$empty" arg    # Runs: cmd "" arg  (empty string preserved as an argument)

Silently dropping the argument is not usually what you want. If you are passing arguments to a command, you almost always want the empty string preserved as an argument, not silently dropped.

flag=""
grep $flag pattern file    # runs: grep pattern file
grep "$flag" pattern file  # runs: grep "" pattern file  (pattern becomes file!)

Both are wrong in different ways. The right answer is to build the command differently — often using an array. We cover arrays in Module 2.
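As a preview of the array approach, here is a minimal sketch that adds an optional flag only when it is non-empty. It requires Bash arrays. Both count_args and build_and_count are hypothetical helpers; build_and_count stands in for "run grep with an optional flag" so the example can report the argument count instead of searching files.

```shell
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

# Hypothetical wrapper: builds the argument list the way a grep call would.
build_and_count() {
  local flag=$1
  local -a opts=()
  if [[ -n $flag ]]; then
    opts+=("$flag")           # add the flag only when it is non-empty
  fi
  count_args "${opts[@]}" pattern file
}

without_flag=$(build_and_count "")     # arguments: pattern file
with_flag=$(build_and_count "-i")      # arguments: -i pattern file

echo "$without_flag $with_flag"        # 2 3
```

An empty array expands to zero words even when quoted, so there is neither a dropped argument nor a stray empty one.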


The "looks like one word" trap

cmd="ls -la"

$cmd /tmp        # Works: word-splits "ls -la" into "ls" and "-la"
"$cmd" /tmp      # Fails: tries to run a program literally named "ls -la"

In this specific case, the unquoted version happens to do what you want. But:

cmd="ls 'my directory'"
$cmd             # Fails: word-splits into "ls" "'my" "directory'"
                 # The single quotes are treated as literal characters.

Shell quoting is a parse-time construct, not a runtime one. You cannot store a command with its arguments in a string and expect the string's quotes to survive word splitting. For that, you need an array:

cmd=(ls -la "/my directory")
"${cmd[@]}"      # Correctly runs: ls -la "/my directory"

More on this in Module 2.
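The difference between the string and the array can be observed in a scratch directory. This sketch requires Bash arrays; the directory and file names are made up for illustration.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/my directory"
touch "$workdir/my directory/file.txt"

cmd=(ls "$workdir/my directory")       # each array element is exactly one argument
listing=$("${cmd[@]}")

cmd_str="ls '$workdir/my directory'"   # these quotes are just characters in a string
if $cmd_str >/dev/null 2>&1; then
  string_result="worked"
else
  string_result="failed"
fi

echo "array:  $listing"                # array:  file.txt
echo "string: $string_result"          # string: failed

rm -rf "$workdir"
```

The string version fails because ls receives two paths that do not exist, one beginning with a literal single quote and one ending with it.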


The one rule that covers 90% of it

KEY CONCEPT

Quote every variable expansion, every time, unless you have a specific reason not to.

That is the rule. If you can articulate why you want word splitting for a specific expansion — you want to split a space-separated list into arguments, or you are intentionally globbing — then leave it unquoted. Otherwise, quote.

In practice, you will almost never want to unquote:

# 99% of expansions should be like this
rm "$file"
cp "$src" "$dst"
echo "user=$user, age=$age"
for file in "$@"; do ...
grep -r "$pattern" "$dir"

The remaining 1% is when you are deliberately exploiting word splitting or globbing — and when you do, add a comment explaining it.
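A deliberate exception might look like the sketch below. The packages list and count_args helper are hypothetical; the point is the comment that records why the expansion is unquoted.

```shell
count_args() { echo "$#"; }    # hypothetical helper: prints its argument count

packages="curl git jq"         # hypothetical space-separated list that this script controls

# Intentionally unquoted: we WANT one argument per package name.
# Safe only because $packages is set by this script and contains no glob characters.
pkg_count=$(count_args $packages)

echo "$pkg_count"              # 3
```

If the list could ever come from outside the script, an array is the safer container.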


Real-world bug — the space in a path

Classic production failure. Script is supposed to back up user home directories:

#!/bin/bash
for user in $(ls /home); do
  src=/home/$user
  dst=/backups/$user
  tar -czf $dst.tar.gz $src
done

Looks fine. Runs fine for years. One day someone creates a user named alice smith. The next backup run:

  • $(ls /home) returns alice smith bob carol
  • Unquoted in the for, this word-splits to 4 items
  • for user in alice smith bob carol
  • When user is alice, $src = /home/alice, which does not exist → backup fails

Fixed version:

#!/bin/bash
for user_dir in /home/*; do
  [ -d "$user_dir" ] || continue   # skip non-directories, and the literal /home/* if nothing matches
  user=$(basename "$user_dir")
  src="$user_dir"
  dst="/backups/$user"
  tar -czf "$dst.tar.gz" "$src"
done

Changes: use a glob instead of ls, quote every variable, and treat the path as the thing that moves (not the username).
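The difference between the two loops can be reproduced without touching /home. In this sketch, fake_home is a hypothetical scratch directory that stands in for /home, and the loops only count iterations instead of running tar.

```shell
fake_home=$(mktemp -d)                    # stands in for /home
mkdir "$fake_home/alice smith" "$fake_home/bob" "$fake_home/carol"

ls_count=0
for user in $(ls "$fake_home"); do        # buggy: the output of ls is word-split
  ls_count=$((ls_count + 1))
done

glob_count=0
for user_dir in "$fake_home"/*; do        # each match stays one word
  [ -d "$user_dir" ] || continue
  glob_count=$((glob_count + 1))
done

echo "ls loop: $ls_count iterations, glob loop: $glob_count iterations"
# ls loop: 4 iterations, glob loop: 3 iterations

rm -rf "$fake_home"
```

Three directories produce four iterations in the ls loop because "alice smith" arrives as two separate words.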


The special case: arrays and "$@"

There is one place where a quoted expansion still produces more than one word, and it is a feature, not a bug:

# "$@" — expands to one word per argument, each individually quoted
for arg in "$@"; do
  echo "got: [$arg]"
done

# "${arr[@]}" — same thing for arrays
arr=("one item" "another item")
for item in "${arr[@]}"; do
  echo "got: [$item]"
done

This is the pattern for passing arguments through unchanged. Quoted "$@" is not one word: it is one word per element, and each element is properly quoted. Compare:

"$*"   # one word: all args joined by the first character of IFS, a space by default (usually wrong)
"$@"   # N words: each arg preserved individually (usually right)

Getting this right is the difference between a script that works with spaces in filenames and one that does not.
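A small sketch makes the three forwarding styles comparable side by side. The function names here (count_args, forward_at, forward_star, forward_bare) are hypothetical and exist only for this illustration.

```shell
count_args() { echo "$#"; }            # hypothetical helper: prints its argument count

forward_at()   { count_args "$@"; }    # one word per original argument
forward_star() { count_args "$*"; }    # everything joined into a single word
forward_bare() { count_args $@; }      # unquoted: every argument is re-split

at_count=$(forward_at "one item" "another item")
star_count=$(forward_star "one item" "another item")
bare_count=$(forward_bare "one item" "another item")

echo "$at_count $star_count $bare_count"   # 2 1 4
```

Only the quoted "$@" form hands the callee the same two arguments the caller supplied.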


Debugging word splitting with printf

When you are unsure whether your quoting is working, use printf '[%s]\n'. It prints one argument per line, and the brackets make leading, trailing, and repeated whitespace visible. The output shows you exactly how many words Bash produced:

text="one  two  three"

printf '[%s]\n' $text      # unquoted
# [one]
# [two]
# [three]

printf '[%s]\n' "$text"    # quoted
# [one  two  three]

printf repeats its format string for each argument, so if you see three lines vs one, you know how many arguments Bash passed.

PRO TIP

Put printf '[%s]\n' in front of the expansion exactly as your command uses it, quoted or unquoted, anywhere you suspect quoting trouble. It's the fastest way to confirm what Bash actually sees.
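One caveat with bare printf is that it still prints its format once when it receives no arguments, so zero arguments and one empty argument both show up as []. A possible workaround is a small wrapper that prints the count first. show_args is a hypothetical helper name, not a standard command.

```shell
# show_args is a hypothetical debugging helper: prints the argument count,
# then each argument in brackets on its own line.
show_args() {
  printf 'argc=%d\n' "$#"
  if [ "$#" -gt 0 ]; then
    printf '[%s]\n' "$@"
  fi
}

text="one  two  three"
empty=""

show_args $text       # argc=3, then [one] [two] [three] on separate lines
show_args "$text"     # argc=1, then [one  two  three]
show_args $empty      # argc=0 (the empty word was dropped)
show_args "$empty"    # argc=1, then []
```

The last two calls are the case the count line exists for: without it, a dropped argument and an empty argument look the same.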


What to take away

  • Bash transforms your command line through four steps before execution: expansion → word splitting → pathname expansion → execute.
  • Quoting a variable skips word splitting AND pathname expansion for that variable.
  • Unquoted $var can produce zero, one, or many arguments depending on its contents. That is the usual source of bugs.
  • IFS controls what word splitting splits on. Default is space/tab/newline.
  • Default rule: quote every variable expansion. Unquote only when you specifically need word splitting or globbing.
  • "$@" is a special case: it expands to one word per argument, each individually quoted. This is the correct way to pass arguments through.

Next lesson: command substitution and subshells — the $(cmd) form and the scoping rules that cause variable-modified-but-not-actually bugs.