In every system design interview, you will be asked to choose a database. "It depends" is the correct philosophical answer — but interviewers want you to know exactly what it depends on. This guide gives you a concrete mental model for choosing the right database every time.
We'll cover the five major database types, their real-world representatives, and a decision framework that works for any system design problem.
Tables, rows, joins. ACID transactions. Schema-first. The default for most systems.
JSON documents. Flexible schema. Nested data. Good for 1:many in the same entity.
Rows of column families. Massive scale writes. Designed for time-series & IoT.
Key → value. Blazing fast O(1) lookups. No query language. Caching + sessions.
Nodes + edges. Relationship-first. Fraud detection, social graphs, recommendations.
# ACID guarantees (why SQL is trusted for money): # A — Atomicity: Transaction is all-or-nothing. "Transfer $100" = both debit AND credit succeed. # C — Consistency: DB is always in a valid state. Constraints enforced (FK, UNIQUE, NOT NULL). # I — Isolation: Concurrent transactions don't interfere. Read Committed, Repeatable Read, Serializable. # D — Durability: Committed data survives crashes. WAL (Write-Ahead Log) written to disk first. # Example: Bank transfer in PostgreSQL — ACID at work BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = 'user_a'; -- debit UPDATE accounts SET balance = balance + 100 WHERE id = 'user_b'; -- credit COMMIT; -- both succeed or both roll back (atomicity) # If server crashes between the two UPDATEs → rollback on restart (durability + atomicity) # If balance would go negative → check constraint fires → COMMIT fails → no partial state (consistency)
# Use PostgreSQL/MySQL/Aurora when: ✓ Data has clear relationships (users → orders → line_items → products) ✓ Correctness is paramount (financial, medical, legal) ✓ You need ad-hoc reporting and analytics queries ✓ Team knows SQL well (0 ramp-up cost) ✓ Reasonable scale (tens of millions of rows per table, <10K writes/sec) # Real examples: - Banking / fintech (correctness critical) - E-commerce (orders, inventory, payments) - SaaS products (user accounts, subscriptions, billing) - Healthcare (patient records, prescriptions) - Most CRUD applications built by teams <20 engineers
# MongoDB document — JSON-like BSON:
{
"_id": ObjectId("6657e2f0..."),
"user_id": "u_1234",
"name": "Arjun Sharma",
"email": "arjun@example.com",
"addresses": [ # ← array of sub-documents
{"type": "home", "city": "Bengaluru", "pincode": "560001"},
{"type": "work", "city": "Mysuru", "pincode": "570001"}
],
"preferences": {
"newsletter": true,
"theme": "dark"
}
}
# This eliminates a JOIN that SQL would need (users + addresses tables)
# Read user with all addresses → single document fetch
# Tradeoff: if many users share an address (office building) → data is duplicated in every doc
# Cassandra table design — think "what queries will I run?" FIRST
# Design tables around query patterns, not entities
# Query: "Give me all messages in conversation X, newest first"
CREATE TABLE messages (
conversation_id UUID,
message_id TIMEUUID, -- time-sortable UUID (newest first)
sender_id UUID,
content TEXT,
PRIMARY KEY (conversation_id, message_id) -- partition by conversation
) WITH CLUSTERING ORDER BY (message_id DESC);
# Query executes as single partition lookup → O(1) → handles 1M msgs/sec
# Cassandra's write path (why it's fast):
# 1. Write to commit log (sequential disk write) — crash durability
# 2. Write to memtable (in-memory structure) — fast acknowledgement
# 3. memtable → SSTable flush periodically (background)
# Reads are slower: must check memtable + SSTables + bloom filters + compaction
# Tradeoff: Cassandra is "write-optimized" — writes faster than reads
# Replication factor = 3, consistency = QUORUM:
# Write: wait for 2/3 replicas to confirm → W=2
# Read: query 2/3 replicas and return newest → R=2
# W + R > RF → strong consistency
# Tunable: use ONE for max throughput, QUORUM for balanced, ALL for maximum safety
# Choose Cassandra / HBase when: ✓ Write throughput > 50K/sec consistently ✓ Data naturally fits time-series (IoT sensor readings, event logs, metrics) ✓ Need geo-distributed multi-active writes (Cassandra has no single master) ✓ Data access pattern is simple (no complex JOINs, always query by partition key) ✓ Storing billions of rows across many nodes (petabytes of data) # Real examples: - Netflix: stores streaming playback events (1B+ writes/day) - Uber: stores real-time trip/driver location data - Discord: chat message storage (switched FROM Cassandra due to hot partition issues) - Apple: stores 1B+ devices' iCloud backup metadata - Time-series: Metrics, IoT, recommendation scores with TTL expiry
# Redis — the Swiss Army knife of key-value stores
# Not just a key-value store: supports Strings, Hashes, Lists, Sets, Sorted Sets, Streams
# String (simple cache):
redis.setex("user:1234:session", 3600, json.dumps(session_data)) # TTL = 1 hour
# Hash (structured object):
redis.hset("user:1234", mapping={"name": "Anjali", "email": "anjali@example.com", "login_count": "42"})
redis.hincrby("user:1234", "login_count", 1) # atomic increment
# Sorted Set (leaderboard):
redis.zadd("leaderboard:2026-06", {"player_1": 9500, "player_2": 8800})
redis.zrevrange("leaderboard:2026-06", 0, 9, withscores=True) # top 10
# List (message queue, recent activity):
redis.lpush("feed:user_456", tweet_id) # prepend
redis.ltrim("feed:user_456", 0, 799) # keep last 800 items only
# Redis use cases:
# 1. Session store → SETEX session_id payload EX 3600
# 2. Rate limiting → INCR + EXPIRE (sliding window counter)
# 3. Cache → GET/SET/SETEX with eviction policy (LRU)
# 4. Pub/Sub → PUBLISH channel message; SUBSCRIBE channel
# 5. Distributed locks → SET key value NX EX 10
# 6. Leaderboards → Sorted Sets (ZADD, ZRANK, ZREVRANGE)
# 7. Queue/dequeue → RPUSH/BLPOP (blocking pop for workers)
# DynamoDB vs Redis:
# Redis: in-memory, nanosecond latency, limited by RAM, no complex queries
# DynamoDB: persistent, millisecond latency, unlimited storage, secondary indexes
| Feature | SQL (PostgreSQL) | Document (MongoDB) | Column-Family (Cassandra) | Key-Value (Redis) | Graph (Neo4j) |
|---|---|---|---|---|---|
| ACID transactions | ✅ Full | ⚠️ Limited | ❌ Tunable | ❌ Single key only | ✅ Full |
| Schema | Strict | Flexible | Semi-flexible | None | Flexible |
| JOIN support | ✅ Native | ❌ App-level | ❌ Design-time | ❌ None | ✅ Native |
| Write scalability | ⚠️ Vertical | ✅ Horizontal | ✅✅ Extreme | ✅ In-memory | ⚠️ Limited |
| Read scalability | ✅ Read replicas | ✅ Sharding | ⚠️ Slower reads | ✅✅ Sub-ms | ⚠️ Complex |
| Query expressiveness | ✅✅ SQL | ✅ MQL | ⚠️ CQL (limited) | ❌ Key-only | ✅ Cypher |
| Data model | Tables | JSON docs | Wide rows | Key→Value | Nodes+Edges |
# Q1: Do you need complex queries or ad-hoc reporting? # → YES: SQL (PostgreSQL/MySQL) # Q2: Is consistency / ACID critical? (money, health records, legal) # → YES: SQL # Q3: Is write throughput > 50K/sec or data > 10TB? # → YES: Cassandra or DynamoDB (depending on query patterns) # Q4: Is your data highly nested / schema frequently changes? # → YES: MongoDB / Firestore # Q5: Is this session management, caching, leaderboards, or pub/sub? # → YES: Redis # Q6: Is your primary use case finding relationships between entities? # → YES: Neo4j (fraud detection, recommendations, social graph traversal)
# Mistake 1: "We need scale, so use NoSQL" # Reality: PostgreSQL on a high-end instance handles 10M+ users. # Instagram ran on PostgreSQL for years with 400M users. Scale requires proof first. # Mistake 2: "MongoDB has no schema, so it's more flexible" # Reality: Schema-less = application enforces schema. # If your app code expects "email" to be a string, any other type causes bugs. # Schema is always there — it just lives in the code, not the DB. # Mistake 3: "Cassandra gives us eventual consistency, that's fine for user data" # Reality: Eventual consistency on user profile data means: # - User changes password → old password may be readable for seconds # - Dangerous for auth-related data — use Quorum reads or use SQL instead # Mistake 4: Choosing a DB based on the company's tech stack # "Netflix uses Cassandra, so we should too" # Netflix has 250M subscribers, IoT-scale event streams, and 50+ engineers on Cassandra. # Your startup with 10K users doesn't share the same requirements. # Right approach: explain tradeoffs # "I'd start with PostgreSQL. At our current scale (1M users, <5K writes/sec), # it handles everything with read replicas. If user events grow to 50K/sec # I'd move that specific table to Cassandra while keeping user profiles in Postgres."