
Database Engineering

"The database is never just a detail. It is the center of gravity of every production system."

What You'll Master

This guide covers the full spectrum of database engineering — from choosing the right data model to tuning queries under production load. You will understand why PostgreSQL uses MVCC, how sharding decisions break apart at the seams, and what Instagram actually changed to serve 1 billion photos a day.

Companion reading for System Design Ch09 — Databases: SQL and Ch10 — Databases: NoSQL, which cover the selection layer. This guide goes deeper: internals, operations, and architecture.

The Learning Path

Work through the parts sequentially — each builds on concepts from the previous.


Part 1 — Foundations

Storage engines, data models, indexes, and transactions. The non-negotiable bedrock every database engineer must own.

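| #  | Chapter                            | Difficulty   | ~Time  |
|----|------------------------------------|--------------|--------|
| 01 | The Database Landscape             | Beginner     | 30 min |
| 02 | Data Modeling for Scale            | Intermediate | 35 min |
| 03 | Indexing Strategies                | Intermediate | 40 min |
| 04 | Transactions & Concurrency Control | Advanced     | 45 min |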

Part 2 — Engine Deep Dives

Under the hood of PostgreSQL, MySQL, the major NoSQL families, and specialized engines for time-series, search, and vectors.

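| #  | Chapter                  | Difficulty   | ~Time  |
|----|--------------------------|--------------|--------|
| 05 | PostgreSQL in Production | Intermediate | 40 min |
| 06 | MySQL & Distributed SQL  | Intermediate | 35 min |
| 07 | NoSQL at Scale           | Intermediate | 45 min |
| 08 | Specialized Databases    | Intermediate | 50 min |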

Part 3 — Scaling & Operations

How to keep databases alive, fast, and consistent when they grow beyond a single machine.

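| #  | Chapter                               | Difficulty   | ~Time  |
|----|---------------------------------------|--------------|--------|
| 09 | Replication & High Availability       | Advanced     | 50 min |
| 10 | Sharding & Partitioning               | Advanced     | 45 min |
| 11 | Query Optimization & Performance      | Advanced     | 50 min |
| 12 | Backup, Migration & Disaster Recovery | Intermediate | 55 min |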

Part 4 — Real-World Design

Full case studies from companies that solved hard database problems at scale. Specific numbers, specific decisions, specific regrets.

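| #  | Chapter                           | Difficulty   | ~Time  |
|----|-----------------------------------|--------------|--------|
| 13 | Instagram: PostgreSQL at Scale    | Advanced     | 30 min |
| 14 | Discord: Data Layer Evolution     | Advanced     | 30 min |
| 15 | Uber: Geospatial Database Design  | Advanced     | 30 min |
| 16 | Database Selection Framework      | Intermediate | 35 min |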

Prerequisites

Before starting Part 1, you should be comfortable with:

  • [ ] Basic SQL (SELECT, JOIN, GROUP BY)
  • [ ] What a primary key and foreign key are
  • [ ] General understanding of how a web application uses a database

You do not need to be a DBA. This guide teaches database internals from first principles.

If you want the big-picture view of database selection first, read System Design Ch09 and Ch10 before starting here.
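
As a quick self-check for the SQL prerequisites listed above, the sketch below exercises SELECT, JOIN, and GROUP BY across a primary key / foreign key pair. It is only an illustration: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in (the guide itself recommends a local PostgreSQL instance), and the `users` and `orders` tables and their rows are hypothetical. If you can predict the output before running it, you likely meet the SQL prerequisite.

```python
import sqlite3

# Assumption: SQLite in memory is used purely as a convenient stand-in;
# the guide's own examples target PostgreSQL.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: users.id is a primary key,
# orders.user_id is a foreign key referencing it.
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        amount  INTEGER NOT NULL
    );
    INSERT INTO users  VALUES (1, 'ada'), (2, 'lin');
    INSERT INTO orders VALUES (1, 1, 30), (2, 1, 20), (3, 2, 5);
""")

# SELECT + JOIN + GROUP BY: order count and total amount per user.
rows = conn.execute("""
    SELECT u.name, COUNT(*) AS order_count, SUM(o.amount) AS total
    FROM users AS u
    JOIN orders AS o ON o.user_id = u.id
    GROUP BY u.name
    ORDER BY u.name
""").fetchall()

print(rows)  # [('ada', 2, 50), ('lin', 1, 5)]
```

The same query text should run unchanged on PostgreSQL; only the connection setup differs.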


How to Use This Guide

  1. Read Part 1 completely — indexes and transactions are referenced in every subsequent chapter
  2. Run the SQL examples — theory without execution is incomplete; use a local PostgreSQL instance
  3. Draw the diagrams — B-tree traversal, MVCC tuple chains, and replication topologies are learned by sketching
  4. Attempt the practice questions — each chapter has three difficulty tiers
  5. Return to case studies — Part 4 is most valuable after completing Parts 1–3

Total estimated time: ~11 hours across 16 chapters

$ cat handbook --sections
  database/
    part-1-foundations/    4 chapters  (~2.5 hrs)
    part-2-engines/        4 chapters  (~2.8 hrs)
    part-3-operations/     4 chapters  (~3.3 hrs)
    part-4-real-world/     4 chapters  (~2.1 hrs)

