Bash & Shell Scripting for Engineers

Handling Inputs Safely

Most Bash scripts grow the same way: start with $1, add $2, get --flag support via a messy if-chain, and eventually someone passes --help and the script does something irreversible because it wasn't expecting that. Argument parsing is where the transition from "script" to "tool" happens.

This lesson covers the three real options for parsing arguments, how to pick between them, and the specific patterns for validating user-provided input without opening your script to injection.

KEY CONCEPT

Pick one parsing approach per script. getopts for short options, manual while-loops for long options, getopt (GNU) only when you have no alternative. Mixing approaches produces scripts that are impossible to extend cleanly.


Positional arguments — the simplest case

#!/usr/bin/env bash
set -euo pipefail

if [[ $# -lt 2 ]]; then
  echo "usage: $0 <source> <dest>" >&2
  exit 2
fi

src="$1"
dst="$2"

Works for scripts with a fixed, small number of required args. As soon as you have any optional flags, move to one of the parsing styles below.


getopts — the POSIX way, for short options

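getopts is built into Bash (and every POSIX shell). It handles short options: bare flags (-v), options with a separate argument (-f file), and options with an attached argument (-ffile). It does not understand -o=value; the = would simply become part of the argument.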

#!/usr/bin/env bash
set -euo pipefail

verbose=false
output=""
tries=3

usage() {
  cat <<EOF >&2
usage: $0 [-v] [-o file] [-t count] <input>
  -v       verbose output
  -o file  output file (default: stdout)
  -t count number of tries (default: 3)
EOF
  exit 2
}

while getopts ":vo:t:h" opt; do
  case "$opt" in
    v) verbose=true ;;
    o) output="$OPTARG" ;;
    t) tries="$OPTARG" ;;
    h) usage ;;
    :) echo "error: -$OPTARG requires an argument" >&2; usage ;;
    \?) echo "error: invalid option -$OPTARG" >&2; usage ;;
  esac
done
shift $((OPTIND - 1))

# Positional args after flags
if [[ $# -lt 1 ]]; then
  usage
fi
input="$1"

echo "verbose=$verbose output=$output tries=$tries input=$input"

Parts that matter

  • ":vo:t:h" — options string. A colon after an option means it takes an argument (-o file). A leading colon enables silent error mode (we handle errors ourselves via : and \? cases).
  • OPTARG — the argument value for options that take one.
  • OPTIND — the index of the next arg. shift $((OPTIND - 1)) consumes the processed options, leaving positional args in $@.
  • : case — triggered when an option is missing its required argument.
  • \? case — triggered for unknown options.
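
To see how these pieces interact, here is a minimal sketch that wraps a cut-down version of the loop in a function. The name parse_demo and the local OPTIND=1 reset are illustrative additions, not part of the script above; the reset is only there so the function can be called more than once in the same shell.

```bash
#!/usr/bin/env bash
set -euo pipefail

# parse_demo is a hypothetical wrapper used only for this demonstration.
parse_demo() {
  local OPTIND=1 opt             # reset getopts state on every call
  local verbose=false output=""

  while getopts ":vo:" opt; do
    case "$opt" in
      v) verbose=true ;;
      o) output="$OPTARG" ;;
      :) echo "error: -$OPTARG requires an argument" >&2; return 2 ;;
      \?) echo "error: invalid option -$OPTARG" >&2; return 2 ;;
    esac
  done
  shift $((OPTIND - 1))          # drop the options that getopts consumed

  echo "verbose=$verbose output=$output rest=$*"
}

parse_demo -vo out.txt in.txt        # verbose=true output=out.txt rest=in.txt
parse_demo -o 2>&1 || true           # error: -o requires an argument
parse_demo -x in.txt 2>&1 || true    # error: invalid option -x
```

In the first call, -vo out.txt is read as -v followed by -o out.txt, and OPTIND ends at 3, so the shift leaves only in.txt. The second and third calls hit the : and \? cases, which only exist because the options string starts with a colon.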

Limitations

getopts only handles short options (-v). No --verbose. No --output=file. If you want those, you need a different approach.

PRO TIP

For quick scripts where short options are fine, getopts is the simplest choice. It's portable (POSIX), handles combining (-vo file = -v -o file), and doesn't pull in dependencies.


Manual parsing — for long options

If you need --verbose and --output=file style flags, write a while loop yourself. It is a widely used approach in production scripts because it needs nothing beyond Bash itself:

#!/usr/bin/env bash
set -euo pipefail

verbose=false
output=""
tries=3
positional=()

usage() {
  cat <<EOF >&2
usage: $0 [options] <input>

Options:
  -v, --verbose         verbose output
  -o, --output FILE     output file
  -t, --tries N         number of tries (default: 3)
  -h, --help            show this help
EOF
  exit 2
}

while [[ $# -gt 0 ]]; do
  case "$1" in
    -v|--verbose)
      verbose=true
      shift
      ;;
    -o|--output)
      output="$2"
      shift 2
      ;;
    --output=*)
      output="${1#*=}"
      shift
      ;;
    -t|--tries)
      tries="$2"
      shift 2
      ;;
    --tries=*)
      tries="${1#*=}"
      shift
      ;;
    -h|--help)
      usage
      ;;
    --)
      shift
      positional+=("$@")
      break
      ;;
    -*)
      echo "error: unknown option: $1" >&2
      usage
      ;;
    *)
      positional+=("$1")
      shift
      ;;
  esac
done

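# The ${arr[@]+...} guard avoids an "unbound variable" error under set -u
# on Bash older than 4.4 when no positional arguments were given.
set -- ${positional[@]+"${positional[@]}"}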

if [[ $# -lt 1 ]]; then
  usage
fi

input="$1"

Design choices

  • Handles both --output file (space) and --output=file (equals).
  • Handles short and long forms (-o and --output).
  • -- is the convention for "end of options, the rest are positional."
  • Unknown options (starting with -) error out; anything else is a positional.
  • set -- "${positional[@]}" restores the positional parameters (useful for the rest of the script).

Verbose, but complete. This is the idiomatic Bash pattern for serious CLIs.
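
One gap worth closing in the loop above: if --output is the last argument, "$2" does not exist, and set -u aborts the script with an unhelpful "unbound variable" message. Below is a minimal sketch of a guard. The names need_value and parse_output are hypothetical, introduced only for illustration, and the parser is cut down to a single option.

```bash
#!/usr/bin/env bash
set -euo pipefail

# need_value: hypothetical helper. Call it as `need_value "$@"` so that
# $1 is the flag being parsed and $2 (if present) is its value.
need_value() {
  if [[ $# -lt 2 ]]; then
    echo "error: $1 requires a value" >&2
    return 2
  fi
}

# parse_output: cut-down parser that only understands -o/--output.
parse_output() {
  local output=""
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -o|--output)
        need_value "$@" || return 2
        output="$2"
        shift 2
        ;;
      --output=*)
        output="${1#*=}"
        shift
        ;;
      *)
        shift
        ;;
    esac
  done
  printf 'output=%s\n' "$output"
}

parse_output --output report.txt     # output=report.txt
parse_output --output=report.txt     # output=report.txt
parse_output --output || echo "exit status: $?"   # exit status: 2
```

The same check applies to every option that takes a value (here that would also be --tries). It does not catch --output --verbose, where the next flag is swallowed as the value; whether to reject values that start with - is a design decision for each script.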


GNU getopt: concise, but not portable

GNU getopt (the external command, distinct from Bash's getopts builtin) supports long options with a more concise syntax:

#!/usr/bin/env bash
set -euo pipefail

# GNU getopt is required for long options.
# usage() is assumed to be defined as in the manual-parsing example.
PARSED=$(getopt --options vo:t:h --longoptions verbose,output:,tries:,help -- "$@") || exit 2
eval set -- "$PARSED"

verbose=false
output=""
tries=3

while true; do
  case "$1" in
    -v|--verbose) verbose=true; shift ;;
    -o|--output) output="$2"; shift 2 ;;
    -t|--tries) tries="$2"; shift 2 ;;
    -h|--help) usage ;;
    --) shift; break ;;
    *) echo "internal error: unexpected argument: $1" >&2; exit 3 ;;
  esac
done

# positional args remain in $@

Caveats:

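  • GNU getopt ships with util-linux, so it is present on virtually every Linux system. macOS and the BSDs ship a different getopt that does not support long options and cannot reliably handle arguments containing whitespace; GNU getopt has to be installed separately there (for example, Homebrew's gnu-getopt formula).
  • Requires eval set -- "$PARSED". This works only because GNU getopt shell-quotes its output; it is still an eval in your argument path, which becomes a footgun if anyone later edits that line.
  • Because of these portability issues, many teams avoid it.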

Use it if you've standardized on Linux and want concise parsing. For cross-platform scripts, manual parsing is safer.
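
If a script may run on both kinds of system, it can check which getopt it has before relying on it. The sketch below uses the --test flag documented by GNU (util-linux) getopt, which prints nothing and exits with status 4; the function name have_gnu_getopt is a hypothetical choice.

```bash
#!/usr/bin/env bash
set -euo pipefail

# have_gnu_getopt: hypothetical helper name.
# GNU getopt documents that `getopt --test` exits with status 4;
# other implementations (or a missing binary) give a different status.
have_gnu_getopt() {
  local rc=0
  getopt --test >/dev/null 2>&1 || rc=$?
  [[ $rc -eq 4 ]]
}

if have_gnu_getopt; then
  echo "GNU getopt available: long options can use getopt"
else
  echo "no GNU getopt: fall back to a manual while loop"
fi
```

In practice most teams skip the check and use the manual loop everywhere, which avoids maintaining two code paths.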


A decision tree

Parsing args? Start here:

  • Fixed positional only (< 4 args, no flags)? Use $1, $2, $3 as direct positional arguments.
  • Otherwise, do you need long options (--foo)?
      • No: getopts (POSIX, short only).
      • Yes: manual while loop (the idiomatic choice).


Validating input

Accepting input without validation is how scripts break. Every user-supplied value should be checked.

Numeric validation

validate_number() {
  local value="$1"
  local name="$2"
  if [[ ! "$value" =~ ^[0-9]+$ ]]; then
    echo "error: $name must be a non-negative integer: $value" >&2
    exit 2
  fi
}

validate_number "$tries" "tries"
# 10# forces base 10, so a value like "08" is not rejected as invalid octal
if (( 10#$tries > 100 )); then
  echo "error: tries must be <= 100" >&2
  exit 2
fi

Path validation

validate_input_file() {
  local f="$1"
  [[ -f "$f" ]] || { echo "not a file: $f" >&2; exit 2; }
  [[ -r "$f" ]] || { echo "not readable: $f" >&2; exit 2; }
}

validate_input_file "$input"

Allowed-values validation

validate_env() {
  case "$1" in
    dev|staging|prod) return 0 ;;
    *) echo "error: env must be dev, staging, or prod (got: $1)" >&2; exit 2 ;;
  esac
}

validate_env "$ENV"

Regex validation for identifiers

validate_name() {
  # Allow alphanumeric, hyphens, underscores; 1-64 chars
  if [[ ! "$1" =~ ^[A-Za-z0-9_-]{1,64}$ ]]; then
    echo "error: invalid name: $1" >&2
    exit 2
  fi
}

validate_name "$project_name"

Avoiding injection

Any time you interpolate user input into a command, you risk injection. Defensive patterns:

Never pass user input into eval

# DISASTER
eval "command --flag=$user_input"
# If $user_input is "value; rm -rf /", you just ran rm -rf /

# CORRECT — arrays preserve boundaries
cmd=(command --flag="$user_input")
"${cmd[@]}"

Rule: never use eval on user-controlled data. If you think you need eval, refactor using arrays.
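
The array approach also covers the usual reason people reach for eval, which is building a command with optional flags. In this minimal sketch, show_args is a hypothetical stand-in for a real command, and the variable values are made up for the demonstration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# show_args: hypothetical stand-in for a real command; prints each
# argument it receives on its own line, wrapped in brackets.
show_args() { printf '[%s]\n' "$@"; }

user_input='value; echo INJECTED'
verbose=true

cmd=(show_args "--flag=$user_input")
if "$verbose"; then
  cmd+=(--verbose)               # optional flags are appended as whole elements
fi

"${cmd[@]}"
# [--flag=value; echo INJECTED]
# [--verbose]
```

The semicolon stays inside a single argument because the array is expanded, not re-parsed: nothing in user_input is ever interpreted as shell syntax.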

Paths — reject traversal

# Ban .. and leading /
validate_safe_path() {
  local p="$1"
  if [[ "$p" == *..* || "$p" == /* ]]; then
    echo "error: path traversal detected: $p" >&2
    exit 2
  fi
}

validate_safe_path "$user_path"
cp "$source" "$target_dir/$user_path"
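
The substring test above is deliberately blunt: it also rejects harmless names such as notes..txt. If that matters, a component-wise check is one alternative. This is a sketch under the assumption that user paths are relative and use / as the separator; is_safe_relpath is a hypothetical name, and the check does not resolve symlinks.

```bash
#!/usr/bin/env bash
set -euo pipefail

# is_safe_relpath: hypothetical helper. Succeeds only for a non-empty,
# relative path in which no component is exactly "..".
is_safe_relpath() {
  local p="$1"
  [[ -n "$p" ]] || return 1      # empty string
  [[ "$p" != /* ]] || return 1   # absolute path
  case "/$p/" in
    */../*) return 1 ;;          # some component is exactly ".."
  esac
  return 0
}

is_safe_relpath "reports/notes..txt" && echo "ok"          # ok
is_safe_relpath "../etc/passwd"      || echo "rejected"    # rejected
```

Wrapping the path in slashes lets one pattern catch .. at the start, in the middle, and at the end. If the target directory can contain symlinks that users control, a lexical check like this is not enough on its own.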

SQL, shell injection, command construction

Never paste user input into a shell command that gets re-parsed (ssh, sql, eval):

# BROKEN — user_filter could contain ; or quotes
psql -c "SELECT * FROM users WHERE name = '$user_filter'"

# Use parameterized query (psql supports -v)
psql -v filter="$user_filter" -f query.sql
# Then inside query.sql: SELECT * FROM users WHERE name = :'filter'

# BROKEN — user-controlled value in remote command
ssh host "systemctl restart $service"

# SAFER — validate and whitelist
case "$service" in
  nginx|redis|postgres) ;;
  *) echo "unsupported service: $service" >&2; exit 2 ;;
esac
ssh host "systemctl restart $service"
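
When a value cannot be restricted to a fixed list, another option is to quote it for the remote shell. The sketch below uses Bash's printf %q; it assumes the remote login shell is Bash-compatible, and quote_for_shell is a hypothetical helper name. The ssh line is left as a comment because host is a placeholder.

```bash
#!/usr/bin/env bash
set -euo pipefail

# quote_for_shell: hypothetical helper. Joins its arguments into one string
# in which every argument is escaped so that a Bash-compatible shell reads
# it back as exactly one word.
quote_for_shell() {
  local quoted
  printf -v quoted '%q ' "$@"
  printf '%s\n' "${quoted% }"    # drop the trailing space
}

service='nginx; rm -rf /tmp/x'
remote_cmd="$(quote_for_shell systemctl restart "$service")"
echo "$remote_cmd"

# ssh host "$remote_cmd"   # the remote shell sees three words, not two commands
```

Quoting keeps the value from being parsed as shell syntax, but it does not make the value meaningful. Where a fixed list of allowed values exists, the allowlist is still the stronger check.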

Reading interactively with read

Sometimes you want to prompt the user:

read -r -p "Enter your name: " name
echo "Hello, $name"

  • -r — no backslash escapes (same as with read < file).
  • -p prompt — print a prompt.
  • -s — silent (for passwords).
  • -t timeout — read with timeout.
  • -n N — read N characters without waiting for enter.

# Confirmation prompt
read -r -p "Really delete everything? (yes/no) " confirm
if [[ "$confirm" != "yes" ]]; then
  echo "aborted"
  exit 0
fi

# Password
read -rs -p "Password: " password
echo
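
If a script asks for confirmation in more than one place, the prompt can be wrapped in a function. This is a minimal sketch; confirm is a hypothetical helper name, and the here-string in the demonstration only stands in for a user typing at the terminal.

```bash
#!/usr/bin/env bash
set -euo pipefail

# confirm: hypothetical helper. Returns 0 only if the reply is exactly
# "yes"; any other reply, or end-of-input, counts as "no".
confirm() {
  local reply
  read -r -p "$1 (yes/no) " reply || return 1
  [[ "$reply" == "yes" ]]
}

if confirm "Really delete everything?" <<< "no"; then
  echo "deleting"
else
  echo "aborted"
fi
# aborted
```

Treating end-of-input as "no" means the script fails safe when it is run from cron or a pipeline with no terminal attached.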

Passing secrets

Do NOT pass secrets on the command line:

# BAD — password is visible in `ps`
myapp --password="$PASSWORD"

# BETTER — pass via env or file
PASSWORD="$PASSWORD" myapp --password-from-env
# or
myapp --password-file="$secret_file"

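Command-line arguments are, by default, visible to other users on the machine through ps aux and /proc/*/cmdline, and they end up in shell history when typed interactively. Never put secrets there. Environment variables are better but not invisible: on Linux, the process owner and root can still read them from /proc/<pid>/environ, so prefer a file with restrictive permissions when you can.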
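
For scripts that are themselves on the receiving end of a secret file, the reading side can stay small. The sketch below assumes the secret is the first line of the file; load_secret is a hypothetical helper, and the temporary file and its contents exist only for the demonstration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# load_secret: hypothetical helper that prints the first line of a file.
load_secret() {
  local file="$1" secret=""
  if [[ ! -f "$file" || ! -r "$file" ]]; then
    echo "error: cannot read secret file: $file" >&2
    return 2
  fi
  # read returns non-zero when the file lacks a trailing newline,
  # so accept that case as long as something was read.
  IFS= read -r secret < "$file" || [[ -n "$secret" ]] || return 2
  printf '%s' "$secret"
}

# Demonstration with a throwaway file
secret_file="$(mktemp)"
chmod 600 "$secret_file"
printf 'hunter2\n' > "$secret_file"

password="$(load_secret "$secret_file")"
echo "loaded a secret of ${#password} characters"   # loaded a secret of 7 characters
rm -f "$secret_file"
```

The value is held in a shell variable and never appears in an argument list. Avoid echoing it or passing it on as an argument to another command, which would undo the benefit.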


Help and version flags

Every well-behaved script supports -h/--help and --version:

VERSION="1.2.3"

usage() {
  cat <<EOF
usage: $0 [OPTIONS] <input>

Description.

Options:
  -v, --verbose    verbose output
  -h, --help       show this help
  --version        show version
EOF
}

case "${1:-}" in
  -h|--help) usage; exit 0 ;;
  --version) echo "$VERSION"; exit 0 ;;
esac

# ... rest of parsing


A full-featured template

#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'

readonly VERSION="1.0.0"

usage() {
  cat <<EOF
usage: ${0##*/} [options] <input>

Description of what the script does.

Options:
  -o, --output FILE    output file (default: stdout)
  -v, --verbose        verbose output
  -n, --dry-run        show what would happen
  -h, --help           show this help
  --version            show version

Examples:
  ${0##*/} --verbose input.txt
  ${0##*/} -o result.json input.txt
EOF
}

# Defaults
output=""
verbose=false
dry_run=false

# Parse
while [[ $# -gt 0 ]]; do
  case "$1" in
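    -o|--output)
      # Without this guard, a trailing -o trips set -u with "unbound variable"
      [[ $# -ge 2 ]] || { echo "error: $1 requires a value" >&2; exit 2; }
      output="$2"; shift 2 ;;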
    --output=*)     output="${1#*=}"; shift ;;
    -v|--verbose)   verbose=true; shift ;;
    -n|--dry-run)   dry_run=true; shift ;;
    -h|--help)      usage; exit 0 ;;
    --version)      echo "$VERSION"; exit 0 ;;
    --)             shift; break ;;
    -*)             echo "unknown option: $1" >&2; usage >&2; exit 2 ;;
    *)              break ;;
  esac
done

# Positional
if [[ $# -lt 1 ]]; then
  echo "error: input required" >&2
  usage >&2
  exit 2
fi
input="$1"

# Validate
[[ -f "$input" ]] || { echo "input not found: $input" >&2; exit 2; }

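# Logging helper. The explicit if keeps log's exit status 0 when verbose is
# off; a bare `"$verbose" && echo ...` would return 1 and trip set -e.
log() {
  if "$verbose"; then
    echo "[$(date +%H:%M:%S)] $*" >&2
  fi
}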

# ... work ...
log "processing $input"

if "$dry_run"; then
  log "dry-run: would write to ${output:-stdout}"
else
  # real work
  :
fi

Copy this skeleton. Modify for the specific script. Don't re-invent argument parsing every time.


Quiz

KNOWLEDGE CHECK

Your script takes a --config=FILE flag. A user invokes it as: ./script.sh --config=my settings.yml. What happens in a manual while-loop parser written correctly?
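
A quick way to check your answer is to look at what the script actually receives. The shell splits the command line into words before any parser runs; show_args below is a hypothetical stand-in for the script.

```bash
#!/usr/bin/env bash
set -euo pipefail

# show_args: hypothetical stand-in for ./script.sh; prints each argument
# it receives on its own line, wrapped in brackets.
show_args() { printf '[%s]\n' "$@"; }

show_args --config=my settings.yml
# [--config=my]
# [settings.yml]

show_args --config="my settings.yml"
# [--config=my settings.yml]
```

A correct parser therefore sets the config value to my and treats settings.yml as a positional argument. No parser can repair this after the fact; the user has to quote the value.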


What to take away

  • Positional-only: use $1, $2 directly for 1-3 required args.
  • Short options only: getopts is POSIX and built-in.
  • Long options (the common case): manual while/case loop. Handles --foo bar, --foo=bar, -f, and --.
  • GNU getopt is concise but non-portable — use only on Linux-only projects.
  • Always support -h/--help, --version, and -- (end-of-options marker).
  • Validate every input: regex for IDs, range checks for numbers, file checks for paths, allowed-value checks for enums.
  • Never eval user input. Build commands as arrays and expand with "${cmd[@]}".
  • Secrets never go on the command line — use env vars or secret files.
  • Keep a copy of the "full-featured template" in your team's internal wiki or dotfiles.

Next lesson: working with files and paths — mktemp, race conditions, and filenames with weird characters.