Skip to contentSkip to content
0/47 chapters completed (0%)

Chapter 8: Content Delivery Networks

Chapter banner

Mind Map

Overview

A Content Delivery Network (CDN) is a geographically distributed network of proxy servers — called edge servers — that cache and serve content from locations physically close to end users. When a user in Tokyo requests an image hosted on a server in Virginia, a CDN can serve that image from an edge node in Tokyo, cutting the round-trip distance from 14,000 km to a few kilometers.

CDNs are the outermost caching layer in the full caching hierarchy covered in Chapter 7. Where application caches (Redis, Memcached) reduce load on databases, CDNs reduce load on origin servers and dramatically improve perceived latency for geographically dispersed users. DNS routing (covered in Chapter 5) determines which CDN edge node a user reaches — latency-based or geolocation DNS routing sends users to the nearest Point of Presence (PoP).


How CDNs Work

Every CDN request begins with a DNS lookup. The CDN provider uses anycast routing or latency-based DNS to direct the user to the geographically closest edge server. From there, the edge either serves from its cache or fetches from origin.

Edge Servers and Points of Presence

A Point of Presence (PoP) is a physical data center location in a CDN's network. Each PoP contains multiple edge servers — the actual machines that cache and serve content. Major CDN providers operate hundreds of PoPs globally:

  • Cloudflare: 300+ PoPs in 100+ countries
  • Akamai: 4,000+ PoPs — largest edge network
  • AWS CloudFront: 600+ edge locations and 13 regional edge caches
  • Fastly: 90+ PoPs with emphasis on programmable edge

The CDN uses anycast routing: multiple PoPs share the same IP address, and network routing automatically delivers traffic to the topologically nearest one. This is the same technique that makes Cloudflare's 1.1.1.1 DNS resolver consistently the fastest globally — requests are handled by the nearest datacenter without client-side logic.


Push vs Pull CDN

The two fundamental CDN delivery models differ in how content gets onto edge servers — whether you push it proactively or the CDN pulls it on demand.

Pull CDN — On-Demand Population

In a Pull CDN, content is fetched from origin the first time a user requests it. The edge caches the response and serves all subsequent requests from cache until the TTL expires. Setup is minimal — point the CDN at your origin and set Cache-Control headers.

Trade-off: The very first user to request any file pays the full origin latency. After that, all users at the same PoP get the cached version.

Push CDN — Pre-populated Cache

In a Push CDN, you upload content to the CDN explicitly — typically via API or storage bucket. The CDN replicates it globally. Every user at every PoP gets immediate cache hits from the first request.

Trade-off: You pay storage costs at every PoP. You must manage content lifecycle (when to delete/update). Not suitable for user-generated or highly dynamic content.

Push vs Pull Comparison

DimensionPull CDNPush CDN
Setup complexityLow — just point to originHigh — requires upload pipeline
Cold start latencyFirst request hits originZero — content pre-cached
Storage costOnly caches what's requestedAll content at all PoPs
Origin loadLow after warm-upZero (no origin fetches)
Content freshnessControlled by Cache-Control TTLControlled by manual push/delete
Best forWeb assets, APIs, general contentLarge files: video, software, game assets
Real-world exampleCloudflare with website assetsNetflix Open Connect video files

Rule of thumb: Pull CDN for most web workloads. Push CDN when files are large, static, and access patterns are predictable.


Origin Shielding

In a standard Pull CDN, a cache miss at any PoP triggers an origin request. For a global CDN with 300+ PoPs, a cache expiry or purge event can cause hundreds of simultaneous origin requests for the same file — a CDN-scale thundering herd problem.

Origin shielding inserts a designated regional hub server between the edge PoPs and the origin. Edge nodes that miss their local cache query the shield first; only the shield makes requests to origin.

Benefits of origin shielding:

  • Reduces origin requests by 90–99% during cache misses and purge events
  • Provides a second cache layer — if the shield has a cached copy, the edge gets it without hitting origin
  • Protects origin from traffic spikes during viral content or deployments
  • Concentrates origin access to known shield IPs, simplifying firewall rules

Cache Invalidation at Edge

CDN edge caches respect Cache-Control headers from origin, but there are scenarios requiring immediate invalidation before TTL expiry — bug fixes, sensitive data removal, content corrections.

Invalidation Methods

1. TTL Expiry (passive)

The standard approach: set Cache-Control: max-age=N on origin responses. Edge nodes serve cached content until TTL expires, then re-fetch from origin. Requires no CDN API calls but means stale content persists for up to TTL seconds after a change.

Cache-Control: max-age=3600         # 1 hour TTL
Cache-Control: max-age=86400        # 24 hour TTL
Cache-Control: s-maxage=600         # CDN TTL = 10 min, browser TTL separate

The s-maxage directive applies specifically to shared caches (CDNs, proxies), allowing you to set different TTLs for edge vs browser.

2. Manual Purge (active)

Issue a purge API call to the CDN to immediately invalidate a URL or path across all PoPs. Most CDNs propagate purges globally within seconds.

bash
# Cloudflare purge single URL
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {token}" \
  -d '{"files": ["https://example.com/images/logo.png"]}'

# Cloudflare purge entire cache
curl -X POST ... -d '{"purge_everything": true}'

Limitation: Purging everything on every deploy is expensive and causes a thundering herd as all edges simultaneously re-fetch from origin. Prefer targeted purges.

3. Versioned URLs (cache-busting)

Embed a content hash or version number in the file path or query string. When content changes, the URL changes — old URL and new URL are entirely independent cache entries. Old URL automatically ages out by TTL.

# Old: served from edge cache
/static/app.abc123.js

# New build with changed code: different hash = different URL = cache miss + new cache entry
/static/app.def456.js

This is the preferred strategy for static assets (JS, CSS, images) in production. Combined with long TTLs (1 year), it gives maximum caching while guaranteeing users always get the latest build.

4. Surrogate Keys / Cache Tags

Advanced feature: tag cached responses with logical keys, then purge all content matching a tag in one API call. Supported by Fastly, Cloudflare Enterprise, Varnish.

# Origin response includes tag header
Surrogate-Key: product-123 category-shoes

# CDN API: purge all pages tagged product-123
POST /purge {"tags": ["product-123"]}
# Invalidates: product page, category page, search results — anything tagged

Static vs Dynamic Content

CDNs were originally designed for static content but have expanded to handle dynamic, personalized, and even API responses.

Static Content

Files that do not change per user or per request: JavaScript bundles, CSS, images, fonts, video files, software downloads.

Best practices:

  • Use content-hashed filenames for fingerprinting (app.a3f9c2.js)
  • Set very long TTLs: Cache-Control: max-age=31536000, immutable (1 year)
  • The immutable directive tells browsers the file will never change and to skip revalidation
  • Serve from CDN with Vary: Accept-Encoding to cache both gzip and brotli variants
  • Use Push CDN for large media (video, game assets)

Dynamic Content

Responses that vary per user, session, or request state: HTML pages, API responses, search results, personalized recommendations.

Strategies for CDN-accelerating dynamic content:

StrategyMechanismTTLUse Case
Short TTL cachingCache for 1–30 secondsVery shortHigh-traffic landing pages with acceptable staleness
Stale-while-revalidateServe stale + refresh asyncFlexibleAPI responses where slight staleness is acceptable
Edge Side Includes (ESI)Cache page fragments separatelyPer-fragmentHeader/footer static, body dynamic
Vary headerSeparate cache per header valueNormalVary: Accept-Language for localized pages
Bypass for authenticatedSkip cache for cookie/auth headersN/ALogged-in user pages never served from cache
# Stale-while-revalidate: serve stale for up to 60s while refreshing in background
Cache-Control: max-age=30, stale-while-revalidate=60

# Bypass CDN cache for authenticated requests
Cache-Control: private, no-store

Static vs Dynamic Comparison

DimensionStatic ContentDynamic Content
TTLHours to 1 yearSeconds to minutes (or no-cache)
Cache keyURL onlyURL + headers (Accept, Cookie, Language)
CDN modelPush or PullPull only
PersonalizationNot applicableMust bypass or use Vary/ESI
InvalidationVersioned URLs (preferred)Purge or short TTL
Cache hit ratio95–99%+ achievable30–70% typical
ExamplesJS, CSS, images, videoAPI JSON, HTML pages, search results

Multi-CDN Strategies

Relying on a single CDN provider creates a vendor SPOF. In 2021, a Fastly outage took down large portions of the internet including Reddit, GitHub, The Guardian, and Amazon for ~1 hour. Multi-CDN architectures provide resilience and performance optimization.

Failover Multi-CDN

The simplest approach: use CDN A as primary, CDN B as backup. DNS health checks detect CDN A degradation and automatically switch to CDN B.

Key consideration: Both CDNs must have warm caches (or be able to fill quickly) for failover to be seamless. A cold CDN B serves every request from origin until its cache warms, potentially overloading origin during the worst moment — an outage event.

Performance Routing Multi-CDN

Use Real User Monitoring (RUM) data to continuously measure which CDN performs best for each geographic region and route traffic accordingly.

Benefits: Automatically adapts to CDN performance variations (peering issues, capacity), can achieve 10–20% TTFB improvement vs single CDN.


CDN Performance Metrics

Understanding CDN performance requires tracking the right metrics. An interview answer that includes these demonstrates operational depth.

Cache Hit Ratio

The percentage of requests served from edge cache without touching origin.

Cache Hit Ratio = (Cache Hits) / (Total Requests) × 100%

Example:
- 1,000,000 total requests
- 950,000 served from cache
- 50,000 fetched from origin
= 95% cache hit ratio

Target: 90%+ for static-heavy workloads. Below 80% suggests misconfigured Cache-Control headers, excessive cache variation (too many Vary header values), or high cache churn from low TTLs.

Factors that hurt cache hit ratio:

  • Query string parameters treated as unique cache keys (?v=1, ?v=2)
  • Cookie-based variation caching different responses per session
  • Low TTLs causing frequent cache expiry
  • Long-tail URLs that are rarely re-requested

Time to First Byte (TTFB)

The time elapsed between a user making a request and receiving the first byte of the response.

ScenarioTypical TTFB
Edge cache hit (same continent)5–20ms
Edge cache hit (same country)1–10ms
CDN miss → origin (intercontinental)200–400ms
Direct to origin (US to Asia)150–300ms
Direct to origin (same region)20–80ms

TTFB target: Under 200ms for cache hits, under 600ms for cache misses with origin shielding.

Origin Offload

The percentage of requests handled by the CDN without reaching origin. Similar to cache hit ratio but often measured in bandwidth terms (bytes delivered from edge vs bytes fetched from origin).

Origin Offload = 1 - (Origin Requests / Total CDN Requests)

A 99% origin offload means origin handles only 1% of total traffic — the CDN absorbs 99x the traffic the origin would see without a CDN.


Real-World Examples

Netflix Open Connect

Netflix operates its own CDN — Open Connect — rather than relying entirely on third-party CDN providers. Key design decisions:

  • ISP partnerships: Netflix places Open Connect Appliances (OCAs) — purpose-built storage servers — directly inside ISP data centers and internet exchange points, free of charge. The ISP benefits from reduced bandwidth costs; Netflix benefits from extremely low-latency delivery to that ISP's subscribers.
  • Push model: OCAs are pre-populated each night during off-peak hours with the movies and shows predicted to be popular the next day, based on viewing history data. This is a pure Push CDN at the edge.
  • Scale: Open Connect delivers 100% of Netflix video traffic, with ~90% served via direct ISP connections. At peak, Netflix accounts for ~15% of all downstream internet traffic in North America.
  • Format strategy: Files are transcoded into dozens of formats and bitrates, each stored as separate cached objects. Adaptive bitrate streaming means the CDN serves small video segments (~2-10 seconds each), not entire files.

Cloudflare Network

Cloudflare operates one of the most unique CDN architectures:

  • Anycast everywhere: All 300+ PoPs share the same IP addresses. BGP routing automatically directs users to the nearest PoP.
  • Argo Smart Routing: Cloudflare's paid tier routes origin requests through Cloudflare's own private backbone (avoiding public internet) to reduce latency and packet loss.
  • Workers at the edge: Cloudflare Workers allows JavaScript/WebAssembly execution at the edge — enabling dynamic content generation, A/B testing, and authentication without reaching origin. This blurs the boundary between CDN and serverless compute.
  • Cache everything: Unlike traditional CDNs focused on static assets, Cloudflare can cache HTML and API responses with fine-grained rules, effectively CDN-accelerating full-stack dynamic applications.

Key Takeaway

A CDN is a geographically distributed cache that trades storage cost at the edge for dramatic reductions in latency, origin load, and bandwidth. The core design decisions are: Push vs Pull (pre-populate vs on-demand), TTL vs purge (passive expiry vs active invalidation), and static vs dynamic (long cache vs bypass or micro-TTL). Origin shielding concentrates the thundering herd; versioned URLs eliminate cache invalidation complexity for static assets. At scale, a well-configured CDN can absorb 95–99% of all traffic before it ever reaches your origin.


CDN Comparison Tables

CDN Provider Comparison (as of 2026)

ProviderPoP CountPricing ModelDDoS ProtectionEdge ComputeBest For
Cloudflare300+ cities, 100+ countriesFlat-rate plans (free tier available); bandwidth-based enterpriseIndustry-leading — absorbs multi-Tbps attacks via anycast absorptionCloudflare Workers (JS/WASM at edge, ~0ms cold start)Security-first deployments, full-stack edge apps, cost-sensitive teams
AWS CloudFront600+ edge locations + 13 regional edge cachesPay-per-GB + per-10K requestsAWS Shield Standard (free); Shield Advanced ($3K+/mo)Lambda@Edge (Node.js/Python); CloudFront Functions (JS, sub-ms)AWS-native stacks; S3/ALB integration; enterprise compliance requirements
Akamai4,000+ PoPs — largest edge networkEnterprise contracts; volume-basedProlexic dedicated DDoS mitigation platformEdgeWorkers (JS)Largest enterprises; media delivery; regulated industries
Fastly90+ PoPsPay-per-GB; real-time billingStandard DDoS included; Signal Sciences WAFCompute@Edge (Rust/WASM, ultra-low cold start)Programmable edge; real-time purge (150ms propagation); API-first teams
Google Cloud CDN100+ PoPs (Google backbone)Pay-per-GB outbound; cache fill chargedCloud Armor WAF + DDoS integrationCloud Run at edge (preview)GCP-native workloads; YouTube-proven infrastructure; global anycast

Rule of thumb: Cloudflare for cost + security; CloudFront for AWS-native; Akamai for largest enterprise SLAs; Fastly for programmable edge + fast purge; Google CDN for GCP-native.


Push vs Pull CDN Detailed Comparison

DimensionPull CDNPush CDN
Content uploadAutomatic — CDN fetches from origin on first cache miss per PoPManual — developer/pipeline uploads to CDN storage or API
Storage costLow — only caches requested contentHigh — content stored at every PoP regardless of access frequency
Cache miss behaviorFirst request per PoP pays full origin round-trip latencyZero misses — content pre-cached before any user request
TTL managementSet via Cache-Control / s-maxage headers on origin responsesManaged explicitly; delete/replace API calls required
Origin dependencyOrigin must be available for cache fillsOrigin not needed after upload; fully decoupled
Ideal content typeWeb assets, APIs, HTML pages, user-generated thumbnailsLarge static files: video segments, software downloads, game assets
Setup complexityLow — point DNS at CDN, set Cache-Control headersHigh — requires upload pipeline + content lifecycle management
Real-world exampleCloudflare for website assets; CloudFront for REST APIsNetflix Open Connect for video; Steam for game downloads

ChapterRelevance
Ch05 — DNSDNS-based CDN routing and anycast selection
Ch06 — Load BalancingCDN sits in front of LB; origin shielding reduces LB load
Ch07 — CachingCDN is a specialized cache; shares TTL and invalidation concepts

Practice Questions

Beginner

  1. Push vs Pull CDN: Design a CDN caching strategy for a video streaming platform serving 100M DAU across 6 continents, with a library of 50M videos (most rarely watched). How do you decide between Push and Pull CDN for this workload? How do you handle cache invalidation when a video is removed for rights violations within 1 hour?

    Hint Pull CDN suits large catalogs with cold content (only cache what's requested); Push is better for small, hot content sets — rights removal requires an API-triggered purge, not waiting for TTL expiry.

Intermediate

  1. Low Cache Hit Ratio: A news website achieves only a 45% CDN cache hit ratio despite serving mostly static HTML pages. Identify three likely causes and explain how you would diagnose and fix each one.

    Hint Common culprits: `Cache-Control: no-store` headers set by the app framework, query string variations treated as unique cache keys, and personalized response headers (cookies) bypassing the cache.
  2. Origin Shielding: Explain origin shielding. Describe the request flow with and without shielding when a cache purge event fires across a CDN with 200 PoPs globally. How many origin requests does shielding prevent during the purge fill?

    Hint Without shielding, each PoP independently misses on the first request after purge (up to 200 origin hits per object); with shielding, all 200 PoPs pull from one shield PoP, reducing origin load to 1 hit per object.
  3. Partial Personalization: An e-commerce platform has product pages that include personalized pricing for logged-in members. How would you structure CDN caching to serve the base page quickly while keeping personalized sections accurate? Consider edge-side includes (ESI) and client-side hydration as options.

    Hint Cache the shell HTML at the CDN; deliver personalized data via a separate authenticated API call from the browser — or use ESI to stitch cached and non-cached fragments at the edge.

Advanced

  1. Cache-Busting Strategies: Compare versioned URL cache-busting (app.abc123.js) versus manual CDN purge on deploy. When would you use each, and what operational failures does each strategy introduce at scale (1,000 deploys/day, 500 PoPs)?

    Hint Versioned URLs never need purging (immutable by design) but require build tooling to inject hashes; manual purge is simpler to implement but introduces a propagation delay window where stale assets serve alongside new HTML.

References & Further Reading

  • Cloudflare documentation — CDN concepts
  • "High Performance Browser Networking" — Ilya Grigorik
  • AWS CloudFront Developer Guide
  • Fastly documentation — Edge Computing

Comments powered by Giscus. Enable GitHub Discussions on the repo to activate.

Built with VitePress + Dracula Theme