Arrays and Associative Arrays
Most Bash scripts are broken — in a slow, rare, not-yet-noticed way — because they use strings where they should use arrays. Space in a filename? Quote containing a quote? Empty argument that needs to be preserved? All of these break strings-as-argument-lists. Arrays are the fix.
This lesson covers how to use arrays (indexed and associative), how to expand them safely, and the one distinction, "$@" vs "$*", that catches out almost every engineer the first time they hit it.
If your variable holds a list of arguments, it must be an array. Always. Strings with spaces-as-separators cannot survive round-trips. Arrays preserve element boundaries through quoting.
Why arrays exist — the case a string cannot handle
# "A list of arguments" stored as a string:
args="--verbose --filename=my report.txt --retries=3"
some_command $args # Fails: word-splits into
# ["--verbose", "--filename=my", "report.txt", "--retries=3"]
# The space inside "my report.txt" destroyed the boundary.
some_command "$args" # Fails: passes one giant argument
# ["--verbose --filename=my report.txt --retries=3"]
No amount of quoting on a string fixes this. The information "my report.txt is one argument" is gone the moment it becomes part of a space-separated string.
The array solution:
args=(--verbose "--filename=my report.txt" --retries=3)
some_command "${args[@]}"
# Passes three arguments:
# ["--verbose", "--filename=my report.txt", "--retries=3"]
Element boundaries are preserved. The space inside the second element (index 1) is part of that element and does not split it in two.
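One way to see the difference is to print each received argument in brackets. The sketch below assumes a hypothetical helper, show_args, standing in for some_command; the variable names are also made up for the illustration.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical stand-in for some_command:
# it prints the argument count, then each argument in brackets.
show_args() {
  printf '%d:' "$#"
  printf ' [%s]' "$@"
  printf '\n'
}

args_string="--verbose --filename=my report.txt --retries=3"
args_array=(--verbose "--filename=my report.txt" --retries=3)

show_args $args_string         # 4: [--verbose] [--filename=my] [report.txt] [--retries=3]
show_args "$args_string"       # 1: [--verbose --filename=my report.txt --retries=3]
show_args "${args_array[@]}"   # 3: [--verbose] [--filename=my report.txt] [--retries=3]
```

Only the array form delivers the three arguments that were intended.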
Declaring and using indexed arrays
# Declare and assign
fruits=(apple banana cherry)
# Add later
fruits+=(date)
# Access by index
echo "${fruits[0]}" # apple
echo "${fruits[1]}" # banana
echo "${fruits[-1]}" # date (negative index counts from the end; Bash 4.2+)
# Length
echo "${#fruits[@]}" # 4
# All elements
echo "${fruits[@]}" # apple banana cherry date
# All indexes
echo "${!fruits[@]}" # 0 1 2 3
# Slice
echo "${fruits[@]:1:2}" # banana cherry (from index 1, 2 elements)
Building up an array
files=()
for path in /var/log/*.log; do
  [[ -e "$path" ]] || continue # no match: the pattern stays a literal string, so skip it
  files+=("$path") # append each matching file
done
echo "Found ${#files[@]} log files"
do_something "${files[@]}"
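+=(...) on an array appends new elements. On a scalar, += appends to the string; applied to an array without the parentheses, it appends to element 0 instead of adding an element.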
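The parentheses are what make += an element append. A minimal sketch of the difference, using made-up file names:

```shell
#!/usr/bin/env bash
files=(a.log b.log)

files+=(c.log)                     # with parentheses: appends a new element
count_after_append=${#files[@]}    # 3

files+=".old"                      # without parentheses: string-appends to element 0
count_after_scalar=${#files[@]}    # still 3
first=${files[0]}                  # a.log.old
```

Forgetting the parentheses inside a loop silently builds one long first element instead of a list.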
Iterating
# Iterate elements
for fruit in "${fruits[@]}"; do
  echo "$fruit"
done
# Iterate indexes (useful when you need both)
for i in "${!fruits[@]}"; do
  echo "$i: ${fruits[$i]}"
done
Always quote "${arr[@]}" when iterating. Without quotes, each element undergoes word splitting and glob expansion.
The "$@" vs "$*" distinction
This is the single concept every Bash engineer must understand.
$@ and $* are the positional parameter lists. They behave differently when quoted:
# Script invoked as: ./script.sh "one arg" "two arg" "three"
echo "$#" # 3
echo $@ # one arg two arg three (unquoted: word-split into 5 words)
echo $* # one arg two arg three (unquoted: same)
echo "$@" # quoted: three words ["one arg"] ["two arg"] ["three"]
echo "$*" # quoted: one word ["one arg two arg three"] (joined by the first character of $IFS)
The rule
- "$@" expands to N separate words, one per argument, each individually quoted. Use this for passing args through.
- "$*" expands to ONE word containing all arguments joined by the first character of $IFS. Use this when you genuinely want a single joined string (rare).
The unquoted forms ($@ and $*) are essentially always wrong: both are subject to word splitting and glob expansion, so you lose the original argument boundaries.
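Because echo prints its arguments separated by spaces, it hides how many words it actually received. The sketch below counts them instead. count_words is a hypothetical helper, and set -- is used only to simulate the script's arguments.

```shell
#!/usr/bin/env bash
# count_words is a hypothetical helper that reports how many words it received.
count_words() { echo "$#"; }

# Simulate: ./script.sh "one arg" "two arg" "three"
set -- "one arg" "two arg" "three"

unquoted_at=$(count_words $@)      # 5: boundaries lost to word splitting
unquoted_star=$(count_words $*)    # 5: same
quoted_at=$(count_words "$@")      # 3: one word per original argument
quoted_star=$(count_words "$*")    # 1: everything joined into one word

# "$*" joins with the first character of IFS; the subshell keeps the change local.
joined=$(IFS=','; echo "$*")       # one arg,two arg,three
```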
Wrapper script example
# Correct — passes args through unchanged
wrapper() {
  echo "About to run: $*"
  the_real_command "$@" # use "$@" for argument forwarding
}
wrapper --input "my file.txt" --retries 3
# About to run: --input my file.txt --retries 3
# the_real_command gets 4 distinct arguments
If you used "$*" instead, the_real_command would get one argument: the literal string "--input my file.txt --retries 3".
Forwarding arguments? Use "$@". Joining arguments into a message? Use "$*". Swap them and you get the wrong result in both cases.
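A log line built from "$*" reads well but no longer shows where one argument ends and the next begins. If that matters, one option is Bash's printf %q, which prints each argument in shell-quoted form. This is a sketch only: the_real_command is stubbed so the example runs on its own.

```shell
#!/usr/bin/env bash
# the_real_command is stubbed here so the sketch is self-contained.
the_real_command() { echo "received $# arguments"; }

wrapper() {
  local shown
  printf -v shown '%q ' "$@"     # shell-quoted copy of each argument
  echo "About to run: the_real_command ${shown% }"
  the_real_command "$@"
}

wrapper --input "my file.txt" --retries 3
# About to run: the_real_command --input my\ file.txt --retries 3
# received 4 arguments
```

The escaped space in the log makes it visible that my file.txt is a single argument.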
Arrays work the same way
The same @ vs * distinction applies to named arrays:
args=(--verbose "my file.txt" --retries 3)
some_command "${args[@]}" # 4 arguments, preserved
some_command "${args[*]}" # 1 argument, joined by the first character of $IFS
Always "${arr[@]}" for passing arguments. Always.
Building a command dynamically
A common real-world pattern — conditionally include flags:
args=(--input "$INPUT" --output "$OUTPUT")
if [[ "$VERBOSE" == "true" ]]; then
  args+=(--verbose)
fi
if [[ -n "$TIMEOUT" ]]; then
  args+=(--timeout "$TIMEOUT")
fi
if [[ "$DRY_RUN" == "true" ]]; then
  args+=(--dry-run)
fi
my_command "${args[@]}"
This is clean: each element stays its own argument, and you can add or remove flags without string concatenation nightmares.
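A further benefit of the conditional append is that an unset option contributes no argument at all, whereas always passing a quoted variable sends an empty string. A minimal sketch, with count_args as a hypothetical stand-in for my_command:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical stand-in for my_command; it reports its argument count.
count_args() { echo "$#"; }

INPUT="in file.txt"
TIMEOUT=""                       # optional setting left empty

args=(--input "$INPUT")
if [[ -n "$TIMEOUT" ]]; then
  args+=(--timeout "$TIMEOUT")
fi

with_array=$(count_args "${args[@]}")                          # 2
with_empty_string=$(count_args --input "$INPUT" "$TIMEOUT")    # 3: a stray "" argument
```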
Compare to the (wrong) string approach:
cmd="my_command --input $INPUT --output $OUTPUT"
[[ "$VERBOSE" = "true" ]] && cmd="$cmd --verbose"
[[ -n "$TIMEOUT" ]] && cmd="$cmd --timeout $TIMEOUT"
eval $cmd # eval is an anti-pattern; also breaks on spaces
Use arrays. Never eval.
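The problem with eval goes beyond spaces: any shell syntax inside a variable gets executed. The sketch below uses a harmless echo as the injected command and stubs my_command so the result can be inspected. The INPUT value is made up for the illustration.

```shell
#!/usr/bin/env bash
# my_command is stubbed here: it prints each argument in brackets.
my_command() { printf '[%s]' "$@"; echo; }

INPUT='a.txt; echo INJECTED'     # hostile or merely unlucky input

cmd="my_command --input $INPUT"
via_eval=$(eval "$cmd")
# [--input][a.txt]
# INJECTED            <- the text after ';' ran as a second command

args=(--input "$INPUT")
via_array=$(my_command "${args[@]}")
# [--input][a.txt; echo INJECTED]
```

With the array, the whole value arrives as one inert argument.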
Associative arrays
Bash 4.0 introduced associative arrays (hash maps). They must be declared with declare -A:
declare -A ages
ages[alice]=30
ages[bob]=25
ages["charlie brown"]=42 # keys can contain spaces if quoted
echo "${ages[alice]}" # 30
echo "${ages[charlie brown]}" # 42
# Keys (order is not guaranteed)
echo "${!ages[@]}" # alice bob charlie brown
# Values (same unspecified order as the keys)
echo "${ages[@]}" # 30 25 42
# Iterate
for name in "${!ages[@]}"; do
  echo "$name is ${ages[$name]}"
done
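Looking up a key that does not exist yields an empty string, which is indistinguishable from a key whose value is empty. One way to tell them apart is the ${var+word} expansion, sketched below. has_key is a hypothetical helper, and the sample data is made up.

```shell
#!/usr/bin/env bash
declare -A ages=([alice]=30 [bob]="")   # bob is present, but with an empty value

# has_key is a hypothetical helper: succeeds if the key exists in the ages map.
has_key() {
  [[ -n "${ages[$1]+set}" ]]
}

has_key alice && echo "alice: present"
has_key bob   && echo "bob: present (value is empty)"
has_key carol || echo "carol: absent"

# Fall back to a default for missing or empty values:
echo "carol is ${ages[carol]:-unknown}"
```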
Associative array use cases
Configuration maps:
declare -A regions=(
[us-east-1]="Virginia"
[us-west-2]="Oregon"
[eu-west-1]="Ireland"
)
echo "Region ${1} is in ${regions[$1]}"
Deduplication:
declare -A seen
for item in "${items[@]}"; do
  if [[ -z "${seen[$item]}" ]]; then
    seen[$item]=1
    echo "unique: $item"
  fi
done
Counting:
declare -A counts
for word in one two three one two one; do
  counts[$word]=$(( ${counts[$word]:-0} + 1 ))
done
# counts[one]=3, counts[two]=2, counts[three]=1
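Since key order in an associative array is unspecified, a report built from the counts usually needs an explicit sort. A minimal sketch, assuming the counts from the loop above and the standard sort utility:

```shell
#!/usr/bin/env bash
declare -A counts=([one]=3 [two]=2 [three]=1)

# Print "count key" pairs, highest count first.
report=$(
  for word in "${!counts[@]}"; do
    printf '%d %s\n' "${counts[$word]}" "$word"
  done | sort -rn
)
echo "$report"
# 3 one
# 2 two
# 1 three
```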
Associative arrays require Bash 4 or newer. macOS ships Bash 3.2 by default, so a script that must run on stock macOS cannot use them. Consider installing a newer Bash via Homebrew, or rewrite the script in Python.
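A script that depends on Bash 4 features can check the version up front and fail with a clear message rather than a confusing syntax error later. A minimal sketch, where require_bash is a hypothetical helper built on the BASH_VERSINFO variable:

```shell
#!/usr/bin/env bash
# require_bash is a hypothetical helper: succeeds if the running Bash
# is at least the given major version.
require_bash() {
  local min_major=$1
  (( BASH_VERSINFO[0] >= min_major ))
}

if ! require_bash 4; then
  echo "This script needs Bash 4 or newer (found $BASH_VERSION)" >&2
  exit 1
fi
```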
Common array mistakes
Mistake 1: unquoted expansion
# Broken: splits elements containing spaces
for f in ${files[@]}; do
  echo "$f"
done
# Correct: preserves spaces in elements
for f in "${files[@]}"; do
  echo "$f"
done
Mistake 2: $array instead of ${array[@]}
arr=(a b c)
echo $arr # prints "a" — bare $arr is ${arr[0]}
echo "${arr[@]}" # prints "a b c"
Bare $arr references only the first element. This trips people up once and then they remember forever.
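The same rule applies to the length operator, which is an easy place to get a plausible-looking wrong number. A short sketch with made-up data:

```shell
#!/usr/bin/env bash
arr=(apple b c)

first_len=${#arr}        # 5: length of the string "apple", i.e. ${#arr[0]}
count=${#arr[@]}         # 3: number of elements
```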
Mistake 3: using strings when you need arrays
# Tempting
TAGS="--tag=v1 --tag=v2 --tag=v3"
docker build $TAGS . # BROKEN as soon as a value contains a space or glob character
# Correct
TAGS=(--tag=v1 --tag=v2 --tag=v3)
docker build "${TAGS[@]}" .
Mistake 4: "${arr[*]}" when you meant "${arr[@]}"
arr=(one "two words" three)
some_command "${arr[*]}" # passes ONE argument: "one two words three"
some_command "${arr[@]}" # passes THREE arguments: "one" "two words" "three"
The * form is rarely what you want.
Array patterns that matter
Reading lines from a file into an array
# Modern: mapfile, also available as readarray (Bash 4+)
mapfile -t lines < file.txt
echo "${lines[0]}" # first line
# Alternative with a loop (for older Bash)
lines=()
while IFS= read -r line; do
  lines+=("$line")
done < file.txt
-t strips the trailing newline from each line.
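When the lines come from a command rather than a file, piping into mapfile is a common trap: each side of a pipeline runs in a subshell, so the array is gone when the pipeline ends. Process substitution avoids that. The sketch assumes Bash's default settings (the lastpipe option is off); the array names are made up.

```shell
#!/usr/bin/env bash
# Piping into mapfile runs it in a subshell, so the array is lost afterwards.
printf 'first line\nsecond line\n' | mapfile -t piped
piped_count=${#piped[@]}      # 0

# Process substitution keeps mapfile in the current shell.
mapfile -t direct < <(printf 'first line\nsecond line\n')
direct_count=${#direct[@]}    # 2
```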
Splitting a string on a delimiter
csv="alice,bob,charlie"
IFS=',' read -ra parts <<< "$csv"
# parts=(alice bob charlie)
echo "${parts[1]}" # bob
read -r -a reads into an indexed array (-ra in the example is the same two options combined). Writing IFS=',' as a prefix scopes the change to that one read command.
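Two details of this idiom are worth checking on real data: how empty fields are treated, and whether IFS really stays untouched. A sketch with made-up input strings:

```shell
#!/usr/bin/env bash
saved_ifs=$IFS

IFS=',' read -ra with_gap <<< "alice,,charlie"   # 3 elements; the middle one is empty
IFS=',' read -ra trailing <<< "alice,bob,"       # 2 elements; a single trailing comma adds nothing

gap_count=${#with_gap[@]}
trailing_count=${#trailing[@]}

# The assignment prefix only applied to each read, so IFS is untouched here.
[[ "$IFS" == "$saved_ifs" ]] && ifs_unchanged=yes
```

Note that read consumes a single line, so this approach does not suit input containing newlines.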
Joining an array on a delimiter
Bash has no built-in join; the idiomatic workaround:
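arr=(one two three)
joined=$(IFS=','; printf '%s' "${arr[*]}") # "one,two,three"; the subshell keeps the IFS change local
# Or with printf, then strip the trailing comma:
printf -v joined '%s,' "${arr[@]}" # "one,two,three,"
joined=${joined%,} # "one,two,three"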
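If joining comes up often, the IFS technique can be wrapped in a small function so the IFS change never leaks. A minimal sketch: join_by is a hypothetical helper name, and it handles single-character delimiters only.

```shell
#!/usr/bin/env bash
# join_by is a hypothetical helper: join_by DELIM ITEM...
# Only the first character of DELIM is used, because "$*" joins
# on the first character of IFS.
join_by() {
  local IFS=$1     # local: the caller's IFS is restored on return
  shift
  printf '%s' "$*"
}

arr=(one "two words" three)
joined=$(join_by , "${arr[@]}")    # one,two words,three
```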
Removing an element by index
unset 'arr[2]' # removes index 2 — but leaves a gap in indexes
# To actually compact the array
arr=("${arr[@]}") # reassignment rebuilds the index
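The gap is easy to observe by listing the indexes before and after compacting. A short sketch with made-up data:

```shell
#!/usr/bin/env bash
arr=(a b c d)

unset 'arr[1]'
indexes_with_gap="${!arr[*]}"     # "0 2 3"
count=${#arr[@]}                  # 3

arr=("${arr[@]}")                 # rebuild with contiguous indexes
indexes_compacted="${!arr[*]}"    # "0 1 2"
```

Code that loops with a counter from 0 to ${#arr[@]} would hit the missing index 1 and never reach index 3 unless the array is compacted first.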
Passing arrays to functions
Arrays are not first-class values — you can't just pass them as a single argument. Two patterns:
Pattern 1: pass as "$@"
process_items() {
  for item in "$@"; do
    echo "processing: $item"
  done
}
items=(a "b with spaces" c)
process_items "${items[@]}"
This is the simplest and most portable.
Pattern 2: nameref (Bash 4.3+)
process_array() {
  local -n arr="$1" # nameref: arr now aliases the caller's variable (which must not itself be named arr)
  for item in "${arr[@]}"; do
    echo "$item"
  done
}
items=(a "b with spaces" c)
process_array items # pass the NAME, not the expansion
Useful when you need to mutate the caller's array.
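A sketch of the mutation case, assuming Bash 4.3 or newer. add_prefix and the _target variable are hypothetical names; the underscore makes a clash with the caller's variable name less likely.

```shell
#!/usr/bin/env bash
# add_prefix is a hypothetical helper: add_prefix ARRAY_NAME PREFIX
# It rewrites every element of the caller's array in place.
add_prefix() {
  local -n _target=$1      # nameref to the caller's array (Bash 4.3+)
  local prefix=$2 i
  for i in "${!_target[@]}"; do
    _target[i]="$prefix${_target[i]}"
  done
}

items=(a "b with spaces" c)
add_prefix items "item-"
# items is now: "item-a" "item-b with spaces" "item-c"
```

With Pattern 1 the function would only ever see copies of the elements, so the caller's array would stay unchanged.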
Quiz
You have files=("a.txt" "my report.pdf" "c.log"). Which loop correctly iterates over all three files, preserving the space in the second one?
What to take away
- Lists of arguments must be arrays, not space-separated strings.
- "${arr[@]}" (quoted, with @) is the universal safe expansion. One word per element, quoting preserved.
- "${arr[*]}" joins elements into a single string. Use only when you specifically want that.
- Bare $arr is ${arr[0]}, not the whole array.
- Declare associative arrays with declare -A (Bash 4+). Use them for maps, counts, and dedup.
- Build commands as arrays, not as strings: args+=(...), then cmd "${args[@]}". Never use eval.
- "$@" for forwarding arguments; "$*" when you want a joined string message.
Next lesson: the quoting rules you actually need — single vs double vs none — and the specific cases where getting it wrong breaks things.