Part 2: Engine Deep Dives

"Every database hides a machine underneath. The engineers who win are the ones who know the machine."

Overview

Four chapters that go inside the most important engines in production today:

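| #  | Chapter                  | Core Concept                                | Difficulty   | Read Time |
|----|--------------------------|---------------------------------------------|--------------|-----------|
| 05 | PostgreSQL in Production | WAL, VACUUM, connection pooling, extensions | Intermediate | 45 min    |
| 06 | MySQL & Distributed SQL  | InnoDB, Vitess, NewSQL trade-offs           | Intermediate | 40 min    |
| 07 | NoSQL at Scale           | DynamoDB, Cassandra, MongoDB, Redis patterns | Advanced    | 50 min    |
| 08 | Specialized Databases    | Time-series, search, graph, vector engines  | Advanced     | 45 min    |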

Key Themes

  • Internals matter for tuning — you cannot size shared_buffers correctly without knowing how PostgreSQL uses it
  • Schema design is engine-specific — a DynamoDB single-table design would be catastrophic in PostgreSQL, and a normalized PostgreSQL schema would fare just as badly in DynamoDB
  • NewSQL is not a free upgrade — distributed SQL adds a consensus round trip to every write, typically 10–100 ms when replicas span regions, and no amount of tuning removes it
  • Specialization wins at the extremes — beyond a certain scale, a general-purpose database is the wrong tool
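
The consensus-latency theme above can be made concrete with a little arithmetic. The sketch below assumes a leader-based majority-quorum protocol (Raft-style), in which a write commits once the leader and enough followers to form a majority have acknowledged it. The function name and the round-trip figures are illustrative assumptions, not measurements from any particular system.

```python
def quorum_commit_latency_ms(follower_rtts_ms):
    """Estimate commit latency for a leader-based majority quorum.

    follower_rtts_ms: round-trip times from the leader to each follower.
    The leader counts as one vote, so it needs acks from
    (replicas // 2) followers, where replicas = followers + 1.
    """
    replicas = len(follower_rtts_ms) + 1
    acks_needed = replicas // 2  # majority minus the leader's own vote
    if acks_needed == 0:
        return 0.0  # single node: no network round trip at all
    # The commit waits for the acks_needed-th fastest follower.
    return float(sorted(follower_rtts_ms)[acks_needed - 1])


# Hypothetical topologies; the RTTs are made-up round numbers.
same_region = quorum_commit_latency_ms([1.0, 1.5])             # 3 replicas, one region
cross_region = quorum_commit_latency_ms([35.0, 70.0])          # 3 replicas, three regions
five_way = quorum_commit_latency_ms([2.0, 35.0, 70.0, 140.0])  # 5 replicas

print(same_region, cross_region, five_way)
```

Under these assumptions the floor is set by the round trip to the nearest majority, so faster disks or better query plans do not lower it; only replica placement does.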

What You'll Be Able to Do

After completing Part 2 you will be able to:

  • Tune a PostgreSQL configuration for a 64GB production server with annotated reasoning
  • Explain why PgBouncer transaction mode breaks LISTEN/NOTIFY and session-scoped prepared statements
  • Design a DynamoDB single-table schema for an e-commerce access pattern
  • Choose between ClickHouse, TimescaleDB, and InfluxDB for a given ingestion rate
  • Explain when Neo4j traverses relationships faster than a SQL self-join
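
The PgBouncer item in the list above comes down to where session state lives. What follows is a toy model, not PgBouncer's code or API: the class names `ServerConnection` and `TransactionPool` and the round-robin assignment are hypothetical, chosen only to show that state created on one server connection is invisible when the next transaction lands on another.

```python
from itertools import cycle


class ServerConnection:
    """Stand-in for one PostgreSQL backend; prepared statements live here."""

    def __init__(self, name):
        self.name = name
        self.prepared = {}  # statement name -> SQL text (session state)

    def prepare(self, stmt_name, sql):
        self.prepared[stmt_name] = sql

    def execute_prepared(self, stmt_name):
        if stmt_name not in self.prepared:
            raise LookupError(
                f'prepared statement "{stmt_name}" does not exist on {self.name}'
            )
        return f"{self.name} ran: {self.prepared[stmt_name]}"


class TransactionPool:
    """Toy transaction-mode pooler: a server connection is lent out per
    transaction, so consecutive transactions may hit different backends.
    Round-robin is an assumption made to keep the outcome deterministic."""

    def __init__(self, size):
        self._servers = cycle(
            [ServerConnection(f"backend-{i}") for i in range(size)]
        )

    def run_transaction(self, work):
        server = next(self._servers)  # borrowed only for this transaction
        return work(server)


pool = TransactionPool(size=2)

# Transaction 1 prepares a statement on backend-0.
pool.run_transaction(
    lambda s: s.prepare("get_user", "SELECT * FROM users WHERE id = $1")
)

# Transaction 2 lands on backend-1, which never saw the PREPARE.
try:
    pool.run_transaction(lambda s: s.execute_prepared("get_user"))
    outcome = "ok"
except LookupError as exc:
    outcome = str(exc)

print(outcome)
```

LISTEN has the same shape: the subscription is registered on whichever backend ran it, and the client is no longer attached to that backend once the transaction ends. Newer PgBouncer releases can track protocol-level prepared statements in transaction mode, which this toy model does not attempt to represent.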

Prerequisites

Complete Part 1 — Foundations before starting here. The WAL mechanics from Ch01, the indexing concepts from Ch03, and the MVCC model from Ch04 are all assumed knowledge.

Continue to Part 3

After finishing these four chapters, continue to Part 3 — Operations for replication, sharding, and distributed transactions.
