Git Internals

Notes from "Git from the Ground Up" by Safia Abdalla.

Summary

  • Git represents key information as objects stored on the file system
  • Git compresses loose objects into packfiles to increase space efficiency (see also: Packfiles: How Git Repositories Stay so Small)
  • Rebases and merges differ in what they preserve: a rebase favors a linear history, while a merge keeps the branches explicit
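The rebase/merge difference can be sketched with a toy commit graph. This is a rough model, not how Git is implemented: the Commit class and the merge, rebase, and first_parent_history helpers are hypothetical names invented for illustration, and real commits are identified by hashes rather than letters.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """Toy commit: a label plus links to its parent commits (hypothetical model)."""
    name: str
    parents: Tuple["Commit", ...] = ()


def merge(ours: Commit, theirs: Commit, name: str = "M") -> Commit:
    # A merge commit has two parents, so both lines of history stay visible.
    return Commit(name, (ours, theirs))


def rebase(commits: List[Commit], onto: Commit) -> Commit:
    # Replay each commit (oldest first) on top of `onto`.
    # The replayed commits are new objects with exactly one parent each.
    tip = onto
    for commit in commits:
        tip = Commit(commit.name + "'", (tip,))
    return tip


def first_parent_history(tip: Optional[Commit]) -> List[str]:
    # Walk back from the tip, always following the first parent.
    names = []
    while tip is not None:
        names.append(tip.name)
        tip = tip.parents[0] if tip.parents else None
    return names


# Shared base A; one branch adds B, another adds C then D.
a = Commit("A")
b = Commit("B", (a,))
c = Commit("C", (a,))
d = Commit("D", (c,))

merged = merge(b, d)              # M has parents B and D
rebased = rebase([c, d], onto=b)  # A <- B <- C' <- D'
```

After the merge, the tip has two parents and the branch structure survives. After the rebase, every commit has a single parent, and C and D are replaced by new commits (C', D') that sit on top of B.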

Types of Objects stored in .git/

Blobs represent file data.

Trees reference multiple blobs and other trees, similar to a directory structure.

Commits reference a single tree and zero or more parent commits, plus metadata such as when the commit was made, the author and committer, and the commit message.

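Tags are named references to commits. Annotated tags are stored as objects of their own, with a tagger, a date, and a message.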

Git objects have a type, size, and content.
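The type, size, and content layout can be sketched in a few lines of Python. This assumes the default SHA-1 object format (a repository can also be created with SHA-256) and covers only loose objects, not packfiles. The function names encode_object, object_id, and decode_loose_object are made up for this note.

```python
import hashlib
import zlib
from typing import Tuple


def encode_object(obj_type: str, content: bytes) -> bytes:
    # Serialized form: "<type> <size>", a NUL byte, then the raw content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return header + content


def object_id(obj_type: str, content: bytes) -> str:
    # The object ID is the SHA-1 of the serialized form (assumes a SHA-1 repository).
    return hashlib.sha1(encode_object(obj_type, content)).hexdigest()


def decode_loose_object(compressed: bytes) -> Tuple[str, int, bytes]:
    # Loose objects on disk are the serialized form, zlib-compressed.
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\x00")
    obj_type, size_text = header.decode("ascii").split(" ")
    size = int(size_text)
    if size != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, size, content


# Same ID that `git hash-object` reports for a file containing "hello world\n".
blob_id = object_id("blob", b"hello world\n")

# Round trip: compress the way a loose object is stored, then decode it again.
stored = zlib.compress(encode_object("blob", b"hello world\n"))
obj_type, size, content = decode_loose_object(stored)
```

On disk, a loose object with this ID would live at .git/objects/3b/18e512..., with the first two hex characters as the directory name and the remaining characters as the file name.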

The .git/HEAD File

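The .git/HEAD file normally contains a symbolic reference to a branch (for example, ref: refs/heads/main). In a detached HEAD state it contains a commit SHA directly. Either way, it identifies a specific commit. The commit points to a tree and to zero or more parent commits, plus metadata such as the author and the commit message. The tree in turn references blobs and other trees.

graph TD
  HEAD[".git/HEAD"] --> Ref["Ref (.git/refs/heads/<branch>)"]
  Ref --> Commit["<Commit SHA>"]
  Commit --> Tree
  Commit --> Parent["<Parent Commit SHA(s)>"]
  Commit --> Author
  Commit --> Comment["Commit Comment"]
  Tree --> Blob["Blob(s)"]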
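Following HEAD to a commit can be sketched as below. This is a simplified reading of the on-disk layout, not Git's actual resolution logic: it only looks at loose ref files and ignores .git/packed-refs, and resolve_head is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional, Tuple


def resolve_head(git_dir: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (ref_name, commit_sha) for the HEAD of the repository at git_dir.

    ref_name is None when HEAD is detached. commit_sha is None when the ref has
    no loose file (it may be in packed-refs, or the branch has no commits yet).
    """
    root = Path(git_dir)
    head = (root / "HEAD").read_text().strip()

    if head.startswith("ref: "):
        # Symbolic ref, e.g. "ref: refs/heads/main": follow it to the ref file.
        ref_name = head[len("ref: "):]
        ref_file = root / ref_name
        if ref_file.is_file():
            return ref_name, ref_file.read_text().strip()
        return ref_name, None

    # Detached HEAD: the file holds the commit SHA itself.
    return None, head
```

Calling resolve_head(".git") from the top of a working tree returns the current branch ref and the SHA it points to, which can be compared against the output of git rev-parse HEAD.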
