Commits are snapshots, not diffs

Why Git feels confusing—and how the internals fix that

Git becomes dramatically easier once you stop thinking of it as “a tool that stores file diffs” and start seeing it as “a database of snapshots connected by pointers.” Understanding what a commit, branch, and merge are under the hood helps you predict outcomes (fast-forward vs merge commit), recover from mistakes (detached HEAD, lost commits), and choose workflows (merge vs rebase) with confidence.

The three building blocks Git stores

Git stores content as objects addressed by a hash (a content-derived ID). The key idea is that objects are immutable: if content changes, the hash changes, and you get a new object.
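
To make "content-derived ID" concrete, here is a minimal Python sketch of how a blob ID can be computed. It assumes the SHA-1 object format, where the ID is the hash of a short header (`blob`, the byte length, a NUL byte) followed by the file contents; repositories created with the SHA-256 format would produce different IDs. The function name `git_blob_id` is made up for this illustration.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the ID of a blob with this content (assumes the SHA-1 object format)."""
    # Header: object type, a space, the content length in bytes, then a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

original = git_blob_id(b"hello world\n")
same_again = git_blob_id(b"hello world\n")   # identical content -> identical ID
edited = git_blob_id(b"hello world!\n")      # changed content -> a different ID

print(original)
print(original == same_again, original == edited)
```

The same bytes always map to the same ID, and any edit yields a new ID, which is why an existing object never needs to be modified in place.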

The objects you should recognize

  • Blob: the contents of a file (not the filename).
  • Tree: a directory listing that maps each name to a blob or tree ID, plus a file mode for each entry.
  • Commit: points to a single tree (the project snapshot), plus parent commit(s), author info, and message.

A commit is therefore a pointer to a complete snapshot of your project at that time. Git can store snapshots efficiently because unchanged files can be reused: many commits can point (indirectly) to the same blobs.
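
The reuse of unchanged files can be sketched with a toy in-memory object store. This is not Git's real on-disk format: the JSON serialization, the flat single-level tree, and the names `store`, `put`, `commit_snapshot`, and `tree_of` are all assumptions made for illustration. Only the idea carries over: objects are stored under an ID derived from their content, so identical content is stored once.

```python
import hashlib
import json

store = {}  # object ID -> (kind, payload); a stand-in for Git's object database

def put(kind, payload):
    """Store an object under an ID derived from its content and return that ID."""
    data = json.dumps([kind, payload], sort_keys=True).encode("utf-8")
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = (kind, payload)  # writing identical content again is a no-op
    return oid

def commit_snapshot(files, parents, message):
    """Write one blob per file, one flat tree, and a commit pointing at the tree."""
    tree = {name: put("blob", text) for name, text in files.items()}
    tree_id = put("tree", tree)
    return put("commit", {"tree": tree_id, "parents": parents, "message": message})

def tree_of(commit_id):
    """Return the name -> blob ID mapping for a commit's tree."""
    tree_id = store[commit_id][1]["tree"]
    return store[tree_id][1]

# Two full snapshots; only main.py differs between them.
c1 = commit_snapshot({"README": "hi\n", "main.py": "print(1)\n"}, [], "first")
c2 = commit_snapshot({"README": "hi\n", "main.py": "print(2)\n"}, [c1], "second")

blob_count = sum(1 for kind, _ in store.values() if kind == "blob")
print(tree_of(c1)["README"] == tree_of(c2)["README"])  # both trees name the same blob
print(blob_count)
```

Both commits describe the whole project, yet the store holds three blobs rather than four, because the unchanged README is shared.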

History is a graph

Because each commit points to its parent(s), commits form a directed acyclic graph (DAG). Most commits have one parent; the first commit in a repository has none, and merge commits have two (or more).

The “history” you see is Git walking parent pointers between commits; the files you see come from the tree the commit points to.
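
Walking parent pointers can be sketched in a few lines of Python. The commit graph below is hypothetical (single-letter IDs instead of hashes), and the function only computes which commits are reachable; it does not try to reproduce the order in which Git's log output lists them.

```python
# Hypothetical commit graph: commit ID -> list of parent commit IDs.
parents = {
    "A": [],          # first commit, no parent
    "B": ["A"],
    "C": ["B"],       # continues the main line
    "D": ["B"],       # diverges from B on another branch
    "M": ["C", "D"],  # merge commit with two parents
}

def reachable(start):
    """Return the set of commits reachable from start via parent pointers."""
    seen = set()
    stack = [start]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue  # already visited through another path in the graph
        seen.add(commit_id)
        stack.extend(parents[commit_id])
    return seen

print(sorted(reachable("D")))
print(sorted(reachable("M")))
```

Starting from `D`, commit `C` never appears, because nothing on that path points to it. Starting from the merge `M`, both lines of work are included.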

Which description best matches what Git needs to reconstruct the full contents of your project at a given commit?

It’s common to assume Git is fundamentally “diff-based,” because that’s how many tools display changes. Internally, Git recreates the contents of a commit by following the commit’s pointer to a tree (the directory structure) and from there to blobs (the file contents). The idea of a chain of patches is tempting because diffs are how humans often think about changes, but Git doesn’t need to replay every change to materialize a snapshot. A branch name doesn’t store file contents either; it is just a name pointing at a commit. The working directory and your command history sit outside Git’s core storage model: Git stores objects, not your terminal actions.
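
The lookup chain from commit to tree to blobs can be sketched as follows. The object database, the short IDs such as `c1` and `t1`, and the names `refs`, `read_tree`, and `checkout` are hypothetical stand-ins, not Git's API; the point is that a snapshot is rebuilt by following pointers, with no earlier commits involved.

```python
# Hypothetical object database keyed by made-up short IDs.
objects = {
    "b1": ("blob", "hi\n"),
    "b2": ("blob", "print(1)\n"),
    "t2": ("tree", {"main.py": "b2"}),
    "t1": ("tree", {"README": "b1", "src": "t2"}),
    "c1": ("commit", {"tree": "t1", "parents": []}),
}
refs = {"main": "c1"}  # a branch is only a name that points at a commit ID

def read_tree(tree_id, prefix=""):
    """Recursively expand a tree into a {path: file contents} mapping."""
    files = {}
    _, entries = objects[tree_id]
    for name, oid in entries.items():
        kind, payload = objects[oid]
        if kind == "tree":
            files.update(read_tree(oid, prefix + name + "/"))
        else:
            files[prefix + name] = payload
    return files

def checkout(branch):
    """Resolve branch -> commit -> tree, then expand the tree into files."""
    _, commit = objects[refs[branch]]
    return read_tree(commit["tree"])

snapshot = checkout("main")
print(snapshot)
```

Note that `checkout` never looks at `parents`: the history is only needed to answer questions about how the project got here, not what it contains.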
