Structuring Larger Scripts
Every long Bash script started as a short Bash script. Somewhere around 200 lines, the structure starts to strain. Somewhere around 500 lines, it becomes hard to change without breaking something. And somewhere between 500 and 2000 lines, the right answer is not "write more Bash" but "stop writing Bash."
This lesson is about keeping a growing Bash script manageable, organizing shared helpers into libraries, and recognizing the signs that it's time to switch languages.
Bash is a glue language, not an application language. It excels at stringing together OS tools. It does not excel at business logic, complex data transformations, or anything that would be a class in another language. Know when to stop.
The shape of a well-structured Bash script
A Bash script over 100 lines benefits from a consistent structure:
#!/usr/bin/env bash
# Description, version, author.

# 1. Strict mode
set -euo pipefail
IFS=$'\n\t'

# 2. Constants
readonly VERSION="1.0.0"
readonly SCRIPT_NAME="${0##*/}"
# Assign and mark readonly separately so a failed command substitution
# isn't masked by readonly's own exit status (ShellCheck SC2155).
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly SCRIPT_DIR

# 3. Source libraries
source "$SCRIPT_DIR/lib/logging.sh"
source "$SCRIPT_DIR/lib/utils.sh"

# 4. Global state (minimize)
VERBOSE=false
DRY_RUN=false

# 5. Functions — in order of abstraction, top-down
usage() {
  cat <<EOF
$SCRIPT_NAME — does the thing
Usage: $SCRIPT_NAME [options] <input>
...
EOF
}

parse_args() {
  while [[ $# -gt 0 ]]; do
    # ...
  done
}

do_work() {
  # ...
}

cleanup() {
  # ...
}

# 6. main — the orchestration
main() {
  parse_args "$@"
  trap cleanup EXIT
  do_work
}

# 7. Entry point
main "$@"
The structure matters less than having some structure. Scripts with no structure — where state, config, and logic are interleaved — rot the fastest.
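The parse_args stub in the skeleton can be fleshed out with the usual while/case loop. A sketch, assuming the VERBOSE and DRY_RUN globals from the skeleton; the ARGS array for positional arguments is a name introduced here:

```bash
VERBOSE=false
DRY_RUN=false
ARGS=()  # positional arguments collected during parsing

parse_args() {
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -v|--verbose) VERBOSE=true ;;
      -n|--dry-run) DRY_RUN=true ;;
      --)           shift; ARGS+=("$@"); break ;;  # end of options
      -*)           echo "unknown option: $1" >&2; return 2 ;;
      *)            ARGS+=("$1") ;;
    esac
    shift
  done
}

parse_args --verbose -n input.txt
```

Note the `--` case: everything after a bare `--` is treated as positional, which lets callers pass filenames that start with a dash.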
The main function idiom
Wrapping the top-level flow in a main function has several benefits:
- Local variables — variables in main are function-local, not script-global.
- Testability — you can source the script without running it, then call main or individual functions for testing.
- Readability — reading main tells you the whole flow in 10 lines.
main() {
  parse_args "$@"
  validate_input
  do_work
}

# Only run main if script is executed directly, not sourced
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
  main "$@"
fi
The if [[ "${BASH_SOURCE[0]}" == "${0}" ]] check is Bash's equivalent of Python's if __name__ == "__main__":. It lets the file be sourced (for testing or composition) without running main.
Organizing into libraries
Once you have multiple scripts in a project, shared helpers should live in a library file:
myproject/
├── bin/
│   ├── deploy
│   ├── rollback
│   └── status
└── lib/
    ├── logging.sh
    ├── config.sh
    └── git.sh
Each script in bin/ sources what it needs:
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/../lib/logging.sh"
source "$SCRIPT_DIR/../lib/git.sh"

main() {
  log_info "starting deploy"
  ensure_clean_repo
  # ...
}

main "$@"
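Library files often get sourced more than once — a bin/ script may source two libraries that each source a third. An include guard at the top of each library makes re-sourcing harmless. A sketch, with _LIB_GIT_SH as a hypothetical guard variable:

```bash
# lib/git.sh
# Include guard: if this file was already sourced, stop here.
[[ -n "${_LIB_GIT_SH:-}" ]] && return 0
_LIB_GIT_SH=1

git_current_branch() {
  git branch --show-current
}
```

The `${_LIB_GIT_SH:-}` default keeps the check safe under set -u.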
What belongs in a library
- Logging and error helpers — log_info, die, warn.
- Common OS checks — require_command jq, require_file config.yml.
- Retry / timeout wrappers — retry 3 'some_command'.
- Protocol-level helpers — git_current_branch, k8s_current_context.
What does NOT belong in a library
- Constants specific to one script.
- Business logic specific to one workflow.
- Stuff that only has one caller.
If a helper only has one caller, leave it inline. Extract to a library when you have 2+ callers.
A sample lib/logging.sh
# lib/logging.sh
# Logging helpers with levels and a standard timestamped format.

LOG_LEVEL="${LOG_LEVEL:-info}"  # debug, info, warn, error

_log() {
  local level="$1"; shift
  local msg="$*"
  printf '[%s] [%-5s] %s\n' "$(date -u +'%Y-%m-%dT%H:%M:%SZ')" "$level" "$msg" >&2
}

_should_log() {
  # Return 0 if $1 is at or above LOG_LEVEL
  local priority=(debug info warn error)
  local want=-1 have=-1 i
  for i in "${!priority[@]}"; do
    [[ "${priority[i]}" == "$LOG_LEVEL" ]] && want=$i
    [[ "${priority[i]}" == "$1" ]] && have=$i
  done
  (( have >= want ))
}

# The trailing '|| true' matters: without it, a suppressed message
# (e.g. log_debug while LOG_LEVEL=info) returns nonzero and trips
# set -e in the calling script.
log_debug() { _should_log debug && _log DEBUG "$@" || true; }
log_info()  { _should_log info  && _log INFO  "$@" || true; }
log_warn()  { _should_log warn  && _log WARN  "$@" || true; }
log_error() { _should_log error && _log ERROR "$@" || true; }

die() {
  log_error "$@"
  exit 1
}
Usage:
source lib/logging.sh
log_info "starting up"
log_debug "config file: $cfg"
log_warn "disk usage high: ${pct}%"
log_error "failed to connect to $host"
die "unrecoverable error: cannot continue"
Now every script in your project has consistent logging.
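One design choice worth calling out: the helpers log to stderr (the >&2 in _log), so a function can log freely while still "returning" data on stdout. A minimal sketch with an inlined stand-in for _log; fetch_count is an invented example function:

```bash
# Minimal stand-in for the _log helper: messages go to stderr.
_log() { printf '[%s] %s\n' "$1" "$2" >&2; }

fetch_count() {
  _log INFO "querying backend"  # stderr — visible, but not captured
  echo 42                       # stdout — the function's "return value"
}

count=$(fetch_count)  # command substitution captures only stdout
```

If the logger wrote to stdout instead, every command substitution would swallow the log lines into its result.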
A sample lib/utils.sh
# lib/utils.sh
# Generic helpers.

require_command() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null || die "required command not found: $cmd"
  done
}

require_file() {
  local f
  for f in "$@"; do
    [[ -f "$f" ]] || die "required file not found: $f"
  done
}

retry() {
  local attempts="$1"; shift
  local delay="${RETRY_DELAY:-1}"
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    (( i < attempts )) && sleep "$delay"
  done
  return 1
}

confirm() {
  local prompt="${1:-Continue?}"
  local reply
  read -r -p "$prompt [y/N] " reply
  [[ "${reply,,}" == "y" || "${reply,,}" == "yes" ]]
}

# Run a command with a timeout. Relies on coreutils' timeout(1),
# present on most Linux systems but not stock macOS.
with_timeout() {
  local secs="$1"; shift
  timeout "$secs" "$@"
}
Usage:
require_command jq curl kubectl
require_file config.yml secrets.env
retry 3 curl -fsSL "$url" -o output.txt
confirm "Delete everything?" && rm -rf /stuff
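To see retry's contract in action, here is a self-contained sketch: a command that fails twice and succeeds on the third attempt. The flaky function and its counter are inventions for the demo; retry is the same shape as the lib/utils.sh version above.

```bash
retry() {
  local attempts="$1"; shift
  local delay="${RETRY_DELAY:-1}"
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    (( i < attempts )) && sleep "$delay"
  done
  return 1
}

n=0
flaky() { n=$((n + 1)); (( n >= 3 )); }  # fails until the third call

RETRY_DELAY=0  # no need to wait between attempts in a demo
retry 3 flaky && echo "succeeded on attempt $n"
```

Because retry calls the command directly (no subshell), the counter in flaky persists between attempts — which is also why retry works for functions, not just external commands.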
Signs your script is outgrowing Bash
Not every script should become a library; not every library should stay in Bash. Red flags that Bash is the wrong tool:
- The script is past ~500 lines and still growing.
- You need nested data structures — Bash arrays don't nest.
- Most functions shell out to jq or awk for data transformations Bash can't express.
- You need real tests, and bats has stopped being enough.
- Performance matters — Bash forks a process for nearly everything.
When Bash IS the right tool
Despite the red flags, Bash is the right tool for:
- Wrapping a handful of commands with some error handling.
- CI pipeline glue that shells out to existing tools.
- OS-level orchestration: systemd units, cron jobs, container entrypoints.
- Ad-hoc one-offs that will probably never run again.
- Deployment scripts that call ssh, docker, kubectl, git.
- Anything under ~100 lines where the control flow is sequential.
Bash's superpower is zero-dependency, ubiquitous availability. On nearly any Unix-like system you can count on a Bourne-compatible shell, and on most Linux systems that means Bash itself. Python and Go bring a runtime or a binary to distribute; Bash just runs.
The "rewrite in Python" migration pattern
When a Bash script hits its limits, the migration usually looks like:
Step 1: identify the core logic
Separate the "orchestration" (which external tools to call, in what order) from the "logic" (data transformation, decisions, state).
Step 2: rewrite the logic in Python, keep the orchestration shell-like
Python's subprocess module runs shell commands, and libraries like sh (or invoke, or plumbum) make it feel shell-like:
import subprocess

def git_current_branch():
    result = subprocess.run(
        ["git", "branch", "--show-current"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def deploy(env):
    branch = git_current_branch()
    if branch != "main" and env == "prod":
        raise ValueError(f"prod deploys must come from main, not {branch}")
    # ...
You get: real data types, real tests, real error handling, real packaging. The orchestration of external tools still works the same way.
Step 3: remove the old script
Keep it around for a release as a two-line wrapper — #!/usr/bin/env bash followed by exec python newimpl.py "$@" — then delete it.
Do not rewrite working, stable Bash scripts just because they're Bash. Rewrite when there's a concrete pain point (hard to change, hard to test, hard to debug). "It's ugly" is not a reason; "it broke last week and no one could figure out why" is.
Testing Bash scripts — bats
bats (Bash Automated Testing System) is the closest Bash gets to a test framework:
#!/usr/bin/env bats

setup() {
  # utils.sh calls die, which lives in logging.sh — source both.
  source "$BATS_TEST_DIRNAME/../lib/logging.sh"
  source "$BATS_TEST_DIRNAME/../lib/utils.sh"
  tmpdir=$(mktemp -d)
}

teardown() {
  rm -rf "$tmpdir"
}

@test "require_command succeeds for existing command" {
  run require_command ls
  [ "$status" -eq 0 ]
}

@test "require_command fails for missing command" {
  run require_command nonexistent_command_xyz
  [ "$status" -ne 0 ]
  [[ "$output" == *"not found"* ]]
}

@test "retry succeeds on second try" {
  i=0
  fake() { i=$((i + 1)); (( i >= 2 )); }
  run retry 3 fake
  [ "$status" -eq 0 ]
}
Run with bats test/. It works, but writing tests for complex Bash scripts is painful compared to pytest in Python. If you find yourself needing more than a handful of tests, consider migrating.
Documentation in scripts
Good scripts are self-documenting at the top:
#!/usr/bin/env bash
#
# deploy.sh — deploy app to a given environment
#
# Usage: deploy.sh [options] <env>
#
# Options:
#   -v, --verbose   verbose output
#   -n, --dry-run   show what would happen
#   -f, --force     skip confirmation
#
# Environment variables:
#   DEPLOY_HOST   target host (required)
#   DEPLOY_KEY    path to SSH key (default: ~/.ssh/id_rsa)
#
# Exit codes:
#   0    success
#   2    invalid arguments
#   10   git state is not clean
#   20   tests failed
#   30   deploy failed
#
# Examples:
#   deploy.sh prod
#   deploy.sh --dry-run staging
#   DEPLOY_KEY=~/.ssh/deploy_key deploy.sh staging
Then make --help print the same information:
usage() {
  # Print the header comment block: from the first lone '#' down to,
  # but not including, the first non-comment line.
  sed -n '/^#$/,/^[^#]/{ /^[^#]/d; s/^# //; s/^#//; p; }' "$0"
}
This keeps --help and the top-of-file docs in sync — one source of truth.
Common structural mistakes
Mistake 1: global state everywhere
# BROKEN — state scattered across the script
current_env=""
deploy_target=""
extra_flags=()

set_env() {
  current_env="$1"
  # ...
}

set_target() {
  deploy_target="$1"
  # ...
}
If the global variables are set in one function and read in another with nothing in between showing the flow, good luck debugging.
Fix: keep state local when possible, and pass explicitly between functions.
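A sketch of the same idea with state passed explicitly — each function takes its inputs as arguments and hands results back on stdout. The function names and the host naming scheme are invented for the example:

```bash
validate_env() {
  case "$1" in prod|staging|dev) return 0 ;; *) return 1 ;; esac
}

build_target() {
  # Derive the deploy host from the environment name (hypothetical scheme).
  printf 'deploy-%s.example.com\n' "$1"
}

main() {
  local env="$1" target
  validate_env "$env" || { echo "unknown env: $env" >&2; return 2; }
  target=$(build_target "$env")
  echo "deploying to $target"
}

main staging
```

Reading main now shows the whole data flow; nothing is set "somewhere else".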
Mistake 2: functions with side effects on globals
do_thing() {
  result="computed_value"  # modifies a global
}

do_thing
echo "$result"
Breaks the moment you call do_thing from a pipeline (subshell) or rename anything. Prefer returning via stdout or nameref.
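Both fixes in a sketch — returning via stdout (survives subshells) and via a nameref (bash 4.3+, no subshell or fork needed). The function names here are illustrative:

```bash
# Option 1: return via stdout; the caller captures with $( ).
compute() { printf '%s\n' "computed_value"; }
result=$(compute)

# Option 2: return via nameref — the caller names the destination
# variable. The underscore prefix on _out reduces the chance of a
# name collision with the caller's variable.
compute_into() {
  local -n _out="$1"   # _out is an alias for the caller's variable
  _out="computed_value"
}
compute_into answer
```

The stdout style composes with pipelines; the nameref style avoids a fork, which matters in hot loops.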
Mistake 3: no separation between orchestration and work
# 200 lines of commands interleaved with if/else and echos
# No functions. No structure.
Break it up. Even if each function is called once, named functions turn a wall of code into a readable outline.
Mistake 4: copy-pasting between scripts
The moment you copy code between scripts, extract it to a library. Every copy will drift, and by the third copy they'll all be slightly different.
Quiz
Your Bash deploy script has grown to 600 lines, uses associative arrays for config, computes JSON with jq in most functions, and broke three times last quarter during incidents. What is the right move?
What to take away
- Structure growing scripts: strict mode, constants, sourced libraries, functions, main function, entry point guard.
- main "$@" guarded by if [[ "${BASH_SOURCE[0]}" == "${0}" ]] lets the script be sourced without running.
- Extract shared helpers into lib/*.sh. Source them at the top of each script.
- Minimum libraries to have: logging.sh, utils.sh (require_command, retry, confirm).
- Document usage at the top of the file. Make --help print the same docs.
- Know the signs to leave Bash: >500 lines, nested data structures, heavy jq/awk dependence, real tests required, performance bottlenecks.
- When rewriting, keep the orchestration style; move the logic to Python or Go.
- bats exists for testing Bash, but needing it is a signal you're nearing the language's limits.
Next module: debugging shell scripts — the tools, the common failure modes, and ShellCheck.