Git Internals for Engineers

Rewriting History Safely

A developer opens a PR with 23 commits that tell a messy story of their two-day journey — wip, try again, revert, fix lint, actually works now, typo. The reviewer asks for a clean history. The developer panics: "I will have to redo everything." They do not. Thirty seconds of git rebase -i HEAD~23 with a few squash marks and edited messages produces a clean five-commit story, each commit a logical unit. Same code, a fraction of the review friction. History rewriting is not dark magic — it is a daily tool, as long as you know the one rule: never rewrite history that other people have already pulled.

This lesson covers the three everyday history-rewriting commands: git commit --amend, git commit --fixup, and git rebase -i. You will learn how each works at the object level, when to use each, and the specific recovery paths when you rewrite something you should not have.


The Golden Rule

Never rewrite commits that have been pushed to a shared branch.

When you rewrite history, the rewritten commits have new SHAs. If others have already pulled the old commits, their clones now disagree with yours about history. Force-pushing replaces what the remote had, including anything they pushed in the meantime, and their next pull either stops on diverged branches or merges the old commits back in next to the rewritten ones.

When rewriting is and is not acceptable:

  • Personal / feature branches before push — rewrite freely.
  • After push but before PR review — rewrite is usually fine, force-push to your own branch.
  • During PR review — usually fine; reviewers expect cleanup. Communicate with reviewers.
  • After merge to main — never rewrite.
  • On a shared long-lived branch others branch from — never rewrite.

KEY CONCEPT

Your branch, your rules. Others' branches, their rules. Main/master/develop, nobody rewrites. Following this one principle prevents nearly all "I just destroyed everyone else's work" incidents.


git commit --amend — Update the Last Commit

Amend replaces the last commit with a new one. Uses:

Edit the last message

git commit -m "add loginn"    # typo
git commit --amend            # opens your editor; edit message, save
# Now the last commit has a fixed message and a NEW SHA

Add missed files to the last commit

git commit -m "feat: add login"
# ... oh, I forgot to add the test file
git add tests/test_login.py
git commit --amend --no-edit  # include staged changes, keep message

--no-edit reuses the existing message. --amend alone opens the editor for changes.

Update author info

git commit --amend --author='Sharon <sharon@example.com>' --no-edit

How amend works (at the object level)

Amend creates a new commit object with:

  • Your previous commit's parent as its parent (skipping the old commit entirely).
  • Whatever fields you modified (message, tree, author).

Git then moves the current branch, and HEAD with it, to point at the new commit.

The old commit is now unreferenced (but still in .git/objects/ and in the reflog).

# Before amend
git log --format=%h
# a1b2c3d  <- HEAD
# 789abc0

# After amend
git log --format=%h
# d4e5f6a  <- HEAD (new SHA)
# 789abc0  <- same parent

# The old a1b2c3d still exists in .git/objects and reflog:
git reflog
# d4e5f6a HEAD@{0}: commit (amend): feat: add login
# a1b2c3d HEAD@{1}: commit: feat: add loginn
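
The listing above can be reproduced in a throwaway repository. The sketch below is one way to check the claims yourself, assuming a temporary directory, a made-up demo identity, and placeholder file names; the SHAs you see will differ from the ones in this lesson.

```shell
set -eu
repo=$(mktemp -d)                     # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo User"      # hypothetical identity for the demo
git config user.email "demo@example.com"

echo deps > deps.txt
git add deps.txt
git commit -q -m "chore: bump deps"
echo login > login.txt
git add login.txt
git commit -q -m "feat: add loginn"   # typo on purpose

old_sha=$(git rev-parse HEAD)
old_parent=$(git rev-parse HEAD^)

git commit -q --amend -m "feat: add login"

new_sha=$(git rev-parse HEAD)
new_parent=$(git rev-parse HEAD^)

echo "old commit: $old_sha"
echo "new commit: $new_sha"                    # differs from the old one
echo "parent before/after: $old_parent / $new_parent"
old_type=$(git cat-file -t "$old_sha")         # the old object is still in the object store
echo "old object type: $old_type"
git --no-pager reflog -2                       # HEAD@{0} is the amend, HEAD@{1} the original
```

The parent stays the same while the commit SHA changes, which is exactly why an amended commit no longer matches a copy that was pushed earlier.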

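If you amend a pushed commit, your local branch now disagrees with the remote. You either force-push (and coordinate), or undo the amend by resetting the branch back to the original commit, which is still in the reflog as HEAD@{1} right after the amend.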

WARNING

git commit --amend is the one history-rewriting command people do by accident. You realize you had a typo, you amend, and if that commit was already pushed, you have now diverged from origin. Before every --amend, check: "has this been pushed?" If yes, prefer a new commit (git commit -m "fix typo in previous message") over amending.
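
The "has this been pushed?" check can be scripted. The sketch below is one possible approach, not an official recipe: it asks whether HEAD is reachable from the branch's upstream using git merge-base --is-ancestor. The is_pushed helper, the temporary bare remote, and the file names are all assumptions made for the demo. The check only sees what your last fetch or push saw, so fetch first in real use.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/origin.git"          # stand-in for the shared remote
git init -q "$work/clone"
cd "$work/clone"
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"

# Hypothetical helper: succeeds when HEAD is already contained in the upstream branch
is_pushed() {
  git merge-base --is-ancestor HEAD '@{upstream}'
}

echo one > a.txt
git add a.txt
git commit -q -m "feat: add login"
git push -q -u origin HEAD                     # publish and set the upstream

if is_pushed; then first_check=pushed; else first_check=local-only; fi

echo two > b.txt
git add b.txt
git commit -q -m "docs: add notes"             # not pushed yet

if is_pushed; then second_check=pushed; else second_check=local-only; fi

echo "after push:       $first_check"          # amending here would diverge from origin
echo "after new commit: $second_check"         # amending here is still private
```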


git commit --fixup and Autosquash

When reviewing your work, you often realize commit X had a bug that you want to fix in commit X (not in a new commit). The modern workflow:

# Current history (newest first)
git log --oneline
# d4e5f6a feat: add auth module
# a1b2c3d feat: add login route
# 789abc0 chore: bump deps

# You realize the login route had a bug. Fix it
# (src/login.py is a stand-in path for wherever that code lives):
vim src/login.py

# Stage the fix
git add src/login.py

# Make a "fixup" commit that targets a1b2c3d
git commit --fixup=a1b2c3d
# Creates commit with message: "fixup! feat: add login route"

The commit history looks messy:

git log --oneline
# 22221111 fixup! feat: add login route
# d4e5f6a feat: add auth module
# a1b2c3d feat: add login route
# 789abc0 chore: bump deps

Now squash the fixup into the original:

git rebase -i --autosquash HEAD~3

Git opens the rebase editor with the fixup commit already marked fixup and placed immediately after its target:

pick a1b2c3d feat: add login route
fixup 22221111 fixup! feat: add login route
pick d4e5f6a feat: add auth module

Save. Git squashes the fixup into its target, preserves the target's message, and rewrites history. Result:

git log --oneline
# new-sha-1 feat: add auth module
# new-sha-2 feat: add login route    <- has the fix now
# 789abc0   chore: bump deps

Clean. The --fixup + --autosquash workflow is the cleanest way to iterate on a multi-commit PR during review.
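
The whole fixup cycle can be run end to end without opening an editor. The sketch below assumes a throwaway repository with placeholder files, and sets GIT_SEQUENCE_EDITOR=true so that Git accepts the autosquashed plan as generated; in day-to-day use you would review the plan in your editor instead.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"

echo deps > deps.txt
git add deps.txt
git commit -q -m "chore: bump deps"
echo login > login.txt
git add login.txt
git commit -q -m "feat: add login route"
echo auth > auth.txt
git add auth.txt
git commit -q -m "feat: add auth module"

target=$(git rev-parse HEAD~1)                 # the "add login route" commit
echo "login (fixed)" > login.txt
git add login.txt
git commit -q --fixup="$target"                # creates "fixup! feat: add login route"

# "true" exits successfully without touching the plan, so the autosquash order is used as-is
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git --no-pager log --oneline
commit_count=$(git rev-list --count HEAD)
login_subject=$(git log -1 --format=%s HEAD~1)
login_content=$(git show HEAD~1:login.txt)
echo "$commit_count commits; '$login_subject' now contains: $login_content"
```

After the rebase the fixup commit is gone and the fix lives inside the login commit, under its original message.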

Enable autosquash by default:

git config --global rebase.autosquash true
# Now `git rebase -i` always uses autosquash

Related: git commit --squash=<sha> creates a squash! commit that merges INTO the target and combines the messages (interactive rebase prompts for the final message).


git rebase -i — Interactive History Editor

Interactive rebase is the general-purpose history rewriting tool. Its mental model: "take these N commits, replay them one by one, letting me intervene at each step."

git rebase -i HEAD~5    # last 5 commits
# (or any target: main, a specific SHA, a branch)

Git opens an editor with a plan:

pick a1b2c3d feat: add login route
pick 789abc0 chore: bump deps
pick d4e5f6a feat: add auth module
pick 22221111 fix: typo in auth
pick 33334444 feat: add logout

# Rebase 8b2f4c1..33334444 onto 8b2f4c1 (5 commands)
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
# ...

Edit the plan. Save. Git replays commits one by one, executing your instructions:

  • pick → replay the commit as-is.
  • reword → replay, then open editor for a new message.
  • edit → replay, then stop the rebase for you to amend freely (fix code, stage changes, git commit --amend, then git rebase --continue).
  • squash → replay, combining with the previous commit, prompting for a merged message.
  • fixup → like squash but keeps the previous commit's message.
  • drop → skip the commit entirely.
  • exec <cmd> → run a shell command at that point in the plan, stopping the rebase if it fails (great for running tests in the middle of a rebase).
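
To see exec in action without typing it into the plan by hand, git rebase accepts -x (--exec), which adds an exec line after every commit being replayed. The sketch below is an illustration under stated assumptions: a throwaway repository, and an echo into a log file standing in for a real test command.

```shell
set -eu
repo=$(mktemp -d)
log=$(mktemp)                                  # collects one line per exec run
cd "$repo"
git init -q
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "step $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "feat: step $n"
done

# Replay the last two commits; after each one, run the stand-in "test" command.
# GIT_SEQUENCE_EDITOR=true accepts the generated plan without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i -x "echo checked >> '$log'" HEAD~2

run_count=$(wc -l < "$log" | tr -d ' ')
echo "exec ran $run_count times"
```

Swap the echo for your real test command and the rebase will stop at the first commit that breaks the build.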

Reordering commits

Just reorder the lines in the plan:

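pick 789abc0 chore: bump deps
pick 33334444 feat: add logout
pick a1b2c3d feat: add login route
pick d4e5f6a feat: add auth module
pick 22221111 fix: typo in auth

Git replays the commits in the new order. Keep every line you still want: a line deleted from the plan is a dropped commit. Reordering is only painless when the commits are independent of each other; if logout depends on login, the reordered replay hits a conflict that you must resolve mid-rebase.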

Splitting a commit

Mark the commit you want to split as edit; when the rebase stops there:

# The commit has been replayed and HEAD points at it, as if you had just made it
git reset HEAD~          # un-commit, keeping changes in working dir
# Now stage and commit in multiple pieces
git add file1.py && git commit -m "part 1"
git add file2.py && git commit -m "part 2"
git rebase --continue
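
The split can also be rehearsed in a scratch repository. In the sketch below, sed stands in for the manual step of changing pick to edit on the first line of the plan; the repository, file names, and commit messages are placeholders chosen for the demo.

```shell
set -eu
export GIT_EDITOR=true                         # never block on an editor in this demo
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "chore: base"
echo one > file1.py
echo two > file2.py
git add file1.py file2.py
git commit -q -m "feat: two unrelated changes" # the commit to split
echo three > file3.py
git add file3.py
git commit -q -m "feat: later work"

# Turn the first "pick" in the plan into "edit"; the rebase stops after replaying that commit
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -q -i HEAD~2

git reset -q HEAD~                             # un-commit, keep both files in the working dir
git add file1.py
git commit -q -m "feat: part 1"
git add file2.py
git commit -q -m "feat: part 2"
git rebase --continue                          # replays "feat: later work" on top

git --no-pager log --oneline
commit_count=$(git rev-list --count HEAD)
subjects=$(git log --reverse --format=%s | tr '\n' '|')
```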

The Mechanics: What Rebase Does

For each commit in the plan:

  1. Compute a patch (diff) between the commit and its parent.
  2. Apply that patch on top of the current rebase HEAD.
  3. Create a new commit object with:
    • The newly applied tree.
    • Current rebase HEAD as parent.
    • The original commit's author info and message (unless you modified them).
  4. Move the rebase HEAD forward.

Because each new commit has a different parent, every rebased commit has a new SHA. The original commits are not deleted — they remain in .git/objects/ and in the reflog — but no ref points at them.
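
The four steps can be approximated by hand with git cherry-pick. The sketch below is an illustration, not a description of Git's internal implementation: it replays a feature branch manually onto a new base, then lets git rebase do the same, and compares the resulting trees. Branch names, file names, and the demo identity are assumptions.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "A"
git branch -M main

git checkout -q -b feature
echo d > d.txt
git add d.txt
git commit -q -m "D"
echo e > e.txt
git add e.txt
git commit -q -m "E"
original_tip=$(git rev-parse HEAD)

git checkout -q main
echo c > c.txt
git add c.txt
git commit -q -m "C"                           # main moves ahead

# Manual replay: apply each feature commit, oldest first, on top of main
git checkout -q -b manual main
for c in $(git rev-list --reverse main..feature); do
  git cherry-pick "$c" > /dev/null
done

# The same replay done by rebase
git checkout -q feature
git rebase -q main

manual_tree=$(git rev-parse 'manual^{tree}')
rebased_tree=$(git rev-parse 'feature^{tree}')
rebased_tip=$(git rev-parse feature)
echo "manual tree:  $manual_tree"
echo "rebased tree: $rebased_tree"             # same content either way
echo "feature tip:  $original_tip -> $rebased_tip"
```

Both routes end with identical file content, and the rebased branch tip has a new SHA because its ancestry changed.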

PRO TIP

Think of rebase as "cherry-picking commits in order onto a new base." That is literally what it is. Once this clicks, you understand why rebase creates new SHAs, why it can conflict commit-by-commit, and why aborting (--abort) instantly returns you to the pre-rebase state (the original commits are still there).


Rebasing Onto a Different Base

The most common rebase in team work: "rebase my feature branch onto the latest main."

# I'm on `feature`. main has advanced while I was working.
git fetch origin
git rebase origin/main

# Each of my commits is replayed on top of origin/main
# If a conflict arises in any commit, rebase pauses:
# (CONFLICT message from git)

# Resolve the conflict
$EDITOR <conflicted files>    # fix the conflict markers in each file
git add <files>
git rebase --continue

# Or bail out entirely:
git rebase --abort

This is how you keep a feature branch up-to-date without polluting history with merge commits. After the rebase, your feature commits appear as if they were just written against the current main — clean, linear.

--onto for surgical rebases

# Move commits from one base to another
git rebase --onto <new-base> <old-base> <branch>

# Example: my branch was off `develop`, but I want to move it onto `main`
git rebase --onto main develop feature

# Breakdown:
# - take the commits on `feature` that are NOT on `develop`
# - replay them onto `main`

This is how you rescue commits made on the wrong base.
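
A scratch repository makes the effect of --onto visible. The sketch below assumes the same branch names as the example (main, develop, feature) with placeholder files; after the rebase, feature should carry only its own two commits on top of main and nothing from develop.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "A"
git branch -M main

git checkout -q -b develop
echo dev > dev.txt
git add dev.txt
git commit -q -m "develop-only work"

git checkout -q -b feature                     # branched off develop by mistake
echo f1 > f1.txt
git add f1.txt
git commit -q -m "feat: one"
echo f2 > f2.txt
git add f2.txt
git commit -q -m "feat: two"

# Take the commits on feature that are not on develop and replay them onto main
git rebase -q --onto main develop feature

git --no-pager log --oneline main..feature
feature_only=$(git rev-list --count main..feature)
if git cat-file -e feature:dev.txt 2>/dev/null; then has_dev=yes; else has_dev=no; fi
echo "$feature_only commits on top of main; develop's file present: $has_dev"
```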


Safe Rewriting: Backup Before Risky Rebases

Before a rebase that rewrites many commits, snapshot:

git branch backup-feature-before-rebase

If the rebase goes badly:

git rebase --abort                    # works only while the rebase is still in progress
# If the rebase already finished with the wrong result:
git reset --hard backup-feature-before-rebase

You can delete the backup branch once the rebase looks good:

git branch -D backup-feature-before-rebase

Note: even without a backup branch, the reflog still records your pre-rebase HEAD until the entry expires. The backup branch just makes recovery one command instead of two.
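
Both recovery routes can be tried safely in a scratch repository. The sketch below makes a deliberately bad rewrite (dropping a commit, with sed standing in for the manual plan edit), then shows that the backup branch and the branch's own reflog entry feature@{1} both point at the pre-rebase tip. Names and files are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "chore: base"
git checkout -q -b feature
for n in 1 2 3; do
  echo "work $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "feat: work $n"
done

before=$(git rev-parse HEAD)
git branch backup-feature-before-rebase        # cheap snapshot

# The regrettable rewrite: drop the first of the three commits
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/drop/'" git rebase -q -i HEAD~3
after=$(git rev-parse HEAD)

from_reflog=$(git rev-parse 'feature@{1}')     # where the branch pointed before the rebase

git reset -q --hard backup-feature-before-rebase
restored=$(git rev-parse HEAD)
git branch -q -D backup-feature-before-rebase

echo "before:   $before"
echo "after:    $after"
echo "reflog:   $from_reflog"
echo "restored: $restored"
```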


Force-Pushing After Rewriting

After rewriting commits that were pushed, your branch has diverged from the remote. You need --force or --force-with-lease:

git push --force                      # dangerous: overwrites the remote branch regardless of its state
git push --force-with-lease           # safer: fails if the remote branch moved since your last fetch
git push --force-with-lease --force-if-includes   # stricter still (Git 2.30+)

Always prefer --force-with-lease over --force. It refuses the push if someone else has pushed to the branch since your last fetch — preventing you from silently overwriting their commits.

# I rebased my branch
git rebase -i HEAD~3

# Push with protection
git push --force-with-lease origin feature
# If someone pushed to feature since my last fetch, this FAILS
# I must: git fetch + decide what to do (rebase onto their commits? abandon their commits?)

WARNING

git push --force on a shared branch is the canonical way to lose other people's work. --force-with-lease is almost always what you want instead. Git aliases cannot override a built-in command such as push, so many teams define a short alias for git push --force-with-lease and make that the habit; it has prevented countless work-loss incidents.
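
The protection is easy to observe with two clones of a local bare repository. In the sketch below, the names Alice and Bob, the paths, and the files are all made up for the demo: Bob pushes to the branch after Alice's last contact with the remote, Alice rewrites her commit, and her --force-with-lease push should be rejected, leaving Bob's commit on the remote.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/origin.git"          # stand-in for the shared remote

# Alice creates the branch and publishes it
git init -q "$work/alice"
cd "$work/alice"
git config user.name "Alice"                   # hypothetical identities throughout
git config user.email "alice@example.com"
git remote add origin "$work/origin.git"
echo v1 > app.txt
git add app.txt
git commit -q -m "feat: first draft"
git branch -M feature
git push -q -u origin feature

# Bob clones the branch, adds a commit, and pushes
git clone -q -b feature "$work/origin.git" "$work/bob"
cd "$work/bob"
git config user.name "Bob"
git config user.email "bob@example.com"
echo bob > bob.txt
git add bob.txt
git commit -q -m "feat: bob's work"
git push -q origin feature
bob_tip=$(git rev-parse HEAD)

# Alice has not fetched; she rewrites her commit and tries to push
cd "$work/alice"
git commit -q --amend -m "feat: polished draft"
if git push -q --force-with-lease origin feature 2>/dev/null; then
  lease_result=pushed
else
  lease_result=rejected
fi

remote_tip=$(git ls-remote origin refs/heads/feature | cut -f1)
echo "force-with-lease: $lease_result"
echo "remote still at Bob's commit: $remote_tip"
```

A plain --force in the same spot would have succeeded and silently discarded Bob's commit from the remote branch.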


git rebase vs git merge

Both integrate changes from one branch into another. The difference is what the history looks like after.

  • Merge creates a merge commit that joins two lines. History preserves the fact that a branch existed.
  • Rebase replays commits as if they were written on top of the target. History is linear.

# Before:
#   A---B---C (main)
#        \
#         D---E---F (feature)

# After `git merge feature` on main:
#   A---B---C-------M (main)
#        \         /
#         D---E---F (feature)

# After `git rebase main` on feature:
#   A---B---C (main)
#            \
#             D'---E'---F' (feature)      <- primes (') mark rewritten commits with new SHAs

Teams fall into two camps:

  • Merge-based (GitHub Flow default): preserves "branch lifecycle" history. Every merge commit is a semantic unit ("merge PR #123").
  • Rebase-based (linear history): clean git log --oneline that reads like a single line of development. Preferred when history is used heavily for git bisect and for narrative clarity.

Full treatment in Module 3 Lesson 1.


Anti-Patterns

Amending merge commits

git commit --amend  # on a merge commit

Technically works, but the result is a new commit: same parents, new SHA. Anything that referenced the old merge SHA, including branches built on top of it, no longer lines up. Usually avoid; if a merge commit has a bad message, consider recreating the merge.

Rebasing merges

git rebase -i HEAD~10  # with merge commits in the range

By default, rebase drops merge commits and flattens the range into a linear series. This is often what you want. If you need to preserve merges, use --rebase-merges (Git 2.18+). The tool is there but it is complex; it is usually cleaner to rebase before merging.

Rewriting public history

See the Golden Rule. It is worth repeating: never rewrite shared branches. If you absolutely must (e.g., removing a leaked secret), coordinate with every team member to re-clone or hard-reset to the new history.


Recovery: "I Rewrote the Wrong Thing"

Every rewriting operation is undoable via reflog:

git reflog
# a1b2c3d (HEAD -> feature) HEAD@{0}: rebase -i (finish): refs/heads/feature ...
# 789abc0 HEAD@{1}: rebase -i (pick): some message
# ...
# d4e5f6a HEAD@{7}: commit: my original commit before rebase
# ...

# Reset to the pre-rebase state
git reset --hard d4e5f6a

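Key point: git reflog is the log of HEAD movements. Every rebase, reset, amend, cherry-pick, and checkout adds an entry. Entries for commits that are no longer reachable from any ref are kept for 30 days by default (gc.reflogExpireUnreachable), and reachable ones for 90 days (gc.reflogExpire), so within that window you can jump back to any state HEAD was in.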
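
The same recovery works for any HEAD movement, not only rebases. The sketch below assumes a throwaway repository and placeholder commits: it "loses" two commits with a hard reset, then brings them back through HEAD@{1}, the entry that records where HEAD was immediately before the reset.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"               # hypothetical identity
git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "change $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "feat: change $n"
done
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~2                     # oops: two commits vanish from the branch
after_reset=$(git rev-list --count HEAD)

git --no-pager reflog -3                       # HEAD@{0} is the reset, HEAD@{1} the state before it

git reset -q --hard 'HEAD@{1}'                 # jump back to the pre-reset state
recovered=$(git rev-parse HEAD)
after_recovery=$(git rev-list --count HEAD)
echo "commits after reset: $after_reset, after recovery: $after_recovery"
```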

Full coverage in Module 4.


A Realistic PR Cleanup Workflow

# You've been working on feature for 2 days, 15 messy commits
git log --oneline
# 15 lines of wip / fix / typo

# Reviewer asks for clean history. Rebase interactively against main
git fetch origin
git rebase -i origin/main

# In editor:
# - squash related commits together
# - reword for clear messages
# - reorder into a logical story
# - drop accidental commits

# Save, possibly resolve conflicts commit-by-commit (git rebase --continue after each)

# Force-push safely
git push --force-with-lease origin feature

# Clean 4-commit PR:
# * feat(auth): add login endpoint
# * feat(auth): add logout endpoint
# * docs: document auth flow
# * test(auth): coverage for login + logout

The result is a PR that is easier to review, and a history that is easier to understand years later.

WAR STORY

A reviewer had to approve a 2000-line PR. The original author had 35 commits including "wip", "undo last", "lint", "revert revert". The reviewer asked for a rebase. The author refused, claiming it was too risky. The reviewer gave up and rubber-stamped. Six months later, a production bug traced back to that PR. git log -p of the area was a nightmare — 35 interleaved commits obscured the logic. Investigation took a day instead of an hour. Clean history is not vanity; it is a real tool for your future self. Teach your team to rebase confidently, back up with branches before risky rewrites, and always --force-with-lease instead of --force.


Key Concepts Summary

  • Never rewrite public history. Rewriting changes SHAs; others who pulled the old commits will conflict.
  • git commit --amend replaces the last commit with a new one. Useful for message fixes and missed files.
  • git commit --fixup=<sha> + git rebase -i --autosquash is the clean way to iterate on an earlier commit during a PR.
  • git rebase -i interactive editor: pick, reword, edit, squash, fixup, drop, exec, reorder — the full history editor.
  • Each rewritten commit has a new SHA. Originals persist in .git/objects/ and reflog but are unreferenced.
  • git rebase --onto relocates commits from one base to another — useful for "I branched off the wrong thing."
  • Backup branch before risky rebases: git branch backup-before-rebase. Recovery is one reset command.
  • --force-with-lease over --force — it fails if the remote has moved, preventing silent overwrites.
  • Rebase vs merge differ only in the resulting shape of history. Choose per team policy.
  • git reflog always tracks HEAD changes — every rewrite is recoverable within ~30 days.

Common Mistakes

  • git push --force to a shared branch. Use --force-with-lease. Better: do not rewrite shared branches at all.
  • Amending a pushed commit without realizing it will diverge from origin. Remember: amend = new SHA.
  • Running git rebase -i without a backup branch on a complicated rewrite. Cheap insurance.
  • Forgetting --autosquash exists; manually ordering fixup commits is tedious.
  • Trying to rebase with uncommitted changes (Git refuses). Stash first, then rebase, then pop.
  • Squashing commits across a PR boundary and losing co-authored-by lines in the process. Modern Git and platforms handle this better now; still, double-check credit.
  • Ignoring git reflog and believing work is lost. Almost never is, within 30 days.
  • Using git rebase -i on the default branch in a multi-developer team. Even if you are "the only one working on this," someone might have pulled. Prefer a PR-based merge workflow for main.
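  • Setting rebase.autostash = true and forgetting about it; the automatic stash pop at the end of a rebase can conflict with the rebased code and leave you resolving conflicts you did not expect. Either know it is on or leave it off and stash manually.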
  • Rebasing a series of merges and losing the merge structure without --rebase-merges. If merges are semantic, you need the flag (or to restructure before rebasing).

KNOWLEDGE CHECK

You made 8 commits on a feature branch, pushed them, and the PR reviewer said 'please clean up history.' You want to rebase interactively and force-push, but you worry about overwriting work that collaborators may have pushed. What is the safest force-push variant and why?