API Design — REST vs GraphQL vs gRPC

When to use each protocol, REST best practices, versioning, idempotency, and pagination patterns

June 2026 · 18 min read · All Levels

APIs are the contracts between systems. A well-designed API is backward-compatible, predictable, and efficient. A poorly designed API breaks clients, leaks internals, and becomes impossible to evolve without a major version bump every six months.

This guide covers the three dominant API paradigms — REST, GraphQL, and gRPC — with concrete guidance on when to use each, and deep coverage of REST best practices since it remains the default for public-facing APIs.

1 REST — Representational State Transfer

REST Principles

# REST is an architectural style, not a protocol
# 6 constraints (only the first 4 matter in practice):
# 1. Client-Server: UI and backend are independent
# 2. Stateless: each request contains all info needed (no server-side session)
# 3. Cacheable: responses declare whether they're cacheable (Cache-Control header)
# 4. Uniform Interface: consistent resource naming, standard HTTP methods
# 5. Layered System: client doesn't know if it talks to origin or load balancer
# 6. Code on Demand (optional): server can send executable code (rarely used)

# REST uses HTTP methods as verbs:
# GET    → read (safe, idempotent — can cache)
# POST   → create (not idempotent — multiple calls = multiple resources)
# PUT    → replace entire resource (idempotent)
# PATCH  → partial update (not strictly idempotent, but usually designed to be)
# DELETE → remove (idempotent)

Resource Naming — The Most Common Mistakes

# ✗ BAD: verb in URL (RPC-style)
GET  /getUser?id=123
POST /createOrder
POST /deleteUser/123
POST /updateUserEmail

# ✓ GOOD: nouns, plural, hierarchical
GET    /users/123              # get user
GET    /users                  # list users
POST   /users                  # create user
PUT    /users/123              # replace user
PATCH  /users/123              # partial update user
DELETE /users/123              # delete user

GET    /users/123/orders       # orders for user 123
GET    /users/123/orders/456   # specific order for user 123
POST   /users/123/orders       # create order for user 123

# Actions that don't fit CRUD → use sub-resources or POST with action
POST /users/123/activate       # activate account
POST /payments/456/refund      # refund a payment
POST /auth/token               # generate token (not a resource creation)

# Plural vs singular:
# Always use plural: /users, /orders, /products (not /user, /order)
# Consistent naming reduces cognitive overhead for API consumers

HTTP Status Codes — Use Them Correctly

# 2xx — Success
200 OK              # GET, PUT, PATCH success — returns body
201 Created         # POST success — include Location header pointing to new resource
204 No Content      # DELETE success — no body needed

# 3xx — Redirection
301 Moved Permanently   # resource has a new permanent URL (URL shortener redirect)
302 Found               # temporary redirect
304 Not Modified        # client cache is fresh (used with ETags/If-None-Match)

# 4xx — Client Error (client is doing something wrong)
400 Bad Request         # malformed JSON, missing required field, invalid format
401 Unauthorized        # not authenticated (no/invalid token) — name is misleading
403 Forbidden           # authenticated but lacks permission to this resource
404 Not Found           # resource doesn't exist
409 Conflict            # duplicate (unique constraint: username already taken)
410 Gone                # resource permanently deleted (vs 404 = maybe existed)
422 Unprocessable Content   # semantically invalid (valid JSON but business rule violated)
429 Too Many Requests   # rate limited — include Retry-After header

# 5xx — Server Error (server failed, not client's fault)
500 Internal Server Error   # unexpected server exception — never expose stack traces
503 Service Unavailable     # downstream dependency down, maintenance
504 Gateway Timeout         # upstream service took too long

# ✗ Avoid the "200 with error body" anti-pattern:
# HTTP/1.1 200 OK
# {"status": "error", "message": "user not found"}
# → breaks HTTP clients, caches, monitoring tools
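
One way to keep status codes consistent is to map errors to codes in a single place. The following is a minimal sketch, not a definitive implementation: the exception classes and `to_http_response` are hypothetical names invented for this illustration.

```python
import json


# Hypothetical domain exceptions, invented for this illustration.
class ApiError(Exception):
    status = 500
    code = "INTERNAL_ERROR"


class ValidationError(ApiError):
    status = 400
    code = "VALIDATION_FAILED"


class NotFoundError(ApiError):
    status = 404
    code = "NOT_FOUND"


class ConflictError(ApiError):
    status = 409
    code = "CONFLICT"


def to_http_response(exc):
    """Return (status_code, json_body) for an exception.

    Known ApiError subclasses keep their own status. Anything else becomes a
    generic 500 so internal messages and stack traces never reach the client.
    """
    if isinstance(exc, ApiError):
        status, code, message = exc.status, exc.code, str(exc)
    else:
        status, code, message = 500, "INTERNAL_ERROR", "Internal server error"
    return status, json.dumps({"error": {"code": code, "message": message}})


status, body = to_http_response(NotFoundError("user not found"))
print(status, body)
```

The "user not found" case now travels as a real 404, so HTTP clients, caches, and monitoring see the failure without parsing the body.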

Idempotency — Critical for Safety

# Idempotent: calling N times = same result as calling once
# Safe: read-only, does not change server state (can retry freely)

# Method safety table:
# GET     → safe=YES, idempotent=YES
# HEAD    → safe=YES, idempotent=YES
# DELETE  → safe=NO,  idempotent=YES (deleting a deleted resource = still deleted)
# PUT     → safe=NO,  idempotent=YES (replacing with same data = same state)
# POST    → safe=NO,  idempotent=NO  (double-click → two orders created!)
# PATCH   → safe=NO,  idempotent=DEPENDS (set email = idempotent; increment count = not)
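
The PATCH row can be checked by applying the same update twice. A small sketch using plain dicts as a stand-in for a stored resource (the function names are hypothetical):

```python
def patch_set_email(user, email):
    """Idempotent update: sets a field to an absolute value."""
    return {**user, "email": email}


def patch_increment_logins(user):
    """Non-idempotent update: the result depends on the current state."""
    return {**user, "logins": user["logins"] + 1}


user = {"email": "old@example.com", "logins": 0}

once = patch_set_email(user, "new@example.com")
twice = patch_set_email(once, "new@example.com")
print(once == twice)  # True: a retry leaves the same state

once = patch_increment_logins(user)
twice = patch_increment_logins(once)
print(once == twice)  # False: a retry changes the state again
```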

# Making POST idempotent — Idempotency Key:
# Client generates UUID per logical operation
# Sends it as a header: Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
# Server: check if key seen before → return cached result instead of processing again

# Server-side implementation:
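# Assumes `redis` is a connected redis-py client and `db` is the application's data layer.
import json

def create_order(request):
    key = request.headers.get("Idempotency-Key")
    if key:
        cached = redis.get(f"idem:{key}")
        if cached:
            return json.loads(cached)  # replay the stored response

    order = db.create_order(request.data)
    response = {"order_id": order.id, "status": "created"}

    if key:
        # Keep the response for 24h so retries inside that window are replayed
        redis.setex(f"idem:{key}", 86400, json.dumps(response))
    return response

# Caveat: the check-then-create sequence is not atomic. Two concurrent requests with
# the same key can both miss the cache and create two orders. Reserving the key first
# (e.g. redis.set(..., nx=True)) or a unique DB constraint on the key closes that gap.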
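
The replay behaviour can be tried without Redis. Below is a minimal in-memory sketch: `IdempotencyStore` and `run_once` are hypothetical names, and the single global lock is a simplification that serializes all operations, which a real service would replace with per-key reservation.

```python
import threading


class IdempotencyStore:
    """In-memory stand-in for the Redis cache (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._responses = {}

    def run_once(self, key, operation):
        """Run operation() at most once per key; replay its result afterwards."""
        with self._lock:  # check and store happen atomically
            if key in self._responses:
                return self._responses[key]
            result = operation()
            self._responses[key] = result
            return result


orders = []


def create_order():
    orders.append({"order_id": len(orders) + 1})
    return {"order_id": len(orders), "status": "created"}


store = IdempotencyStore()
key = "550e8400-e29b-41d4-a716-446655440000"
first = store.run_once(key, create_order)
retry = store.run_once(key, create_order)  # e.g. a double-click or a network retry
print(first == retry, len(orders))  # True 1
```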

2 Pagination Strategies

# Strategy 1: Offset Pagination (simplest, most common)
GET /orders?offset=0&limit=20    # page 1
GET /orders?offset=20&limit=20   # page 2

# Response:
{
  "data": [...],
  "pagination": {
    "offset": 20,
    "limit": 20,
    "total": 483,   # total count (expensive query — skip for large datasets)
    "has_next": true
  }
}

# Problems:
# 1. Slow for deep pages: SELECT * FROM orders ORDER BY id DESC LIMIT 20 OFFSET 10000 → the database reads and discards 10,000 rows before returning 20
# 2. Data drift: if new orders inserted at page 1, old page 2 shows shifted items (duplicates/gaps)
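
The drift problem is easy to reproduce with a plain list standing in for the table (newest order first; `offset_page` is a hypothetical helper):

```python
def offset_page(rows, offset, limit):
    """Equivalent of LIMIT <limit> OFFSET <offset> on an already sorted list."""
    return rows[offset:offset + limit]


orders = [105, 104, 103, 102, 101, 100]  # order ids, newest first

page1 = offset_page(orders, 0, 3)
orders.insert(0, 106)                    # a new order arrives between requests
page2 = offset_page(orders, 3, 3)

print(page1)  # [105, 104, 103]
print(page2)  # [103, 102, 101]  (103 appears on both pages)
```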

# Strategy 2: Cursor Pagination (recommended for production)
GET /orders?cursor=eyJpZCI6MTIzfQ==&limit=20

# cursor = base64({"id":123}): the sort key of the last row on the previous page (add a "ts" field when ordering by timestamp)
# Query: SELECT * FROM orders WHERE id < 123 ORDER BY id DESC LIMIT 20

# Response:
{
  "data": [...],
  "pagination": {
    "next_cursor": "eyJpZCI6MTAzfQ==",
    "has_next": true
    # no total count — stable and fast
  }
}

# Benefits:
# → O(log N) with index, not O(N) offset scan
# → No drift: cursor is tied to position in index, not row count
# → Works for infinite scroll (Twitter, Instagram feed)
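
A minimal sketch of the cursor mechanics, assuming the cursor carries only the last seen `id` and using an in-memory list in place of the indexed table. The helper names (`encode_cursor`, `decode_cursor`, `list_orders`) are hypothetical.

```python
import base64
import json


def encode_cursor(last_id):
    """Serialize the last seen id into an opaque, URL-safe cursor string."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor))["id"]


def list_orders(orders, cursor=None, limit=20):
    """`orders` must be sorted by id descending, like the ORDER BY id DESC query."""
    if cursor is not None:
        last_id = decode_cursor(cursor)
        candidates = [o for o in orders if o["id"] < last_id]  # WHERE id < :last_id
    else:
        candidates = orders
    page = candidates[:limit]
    has_next = len(candidates) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_next else None
    return {"data": page, "pagination": {"next_cursor": next_cursor, "has_next": has_next}}


orders = [{"id": i} for i in range(125, 0, -1)]
response = list_orders(orders, cursor=encode_cursor(123), limit=20)
print(encode_cursor(123))                         # eyJpZCI6MTIzfQ==
print(response["pagination"]["next_cursor"])      # eyJpZCI6MTAzfQ==
```

Because the filter is on `id` rather than on a row count, rows inserted at the head of the list between requests do not shift the next page.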

# Strategy 3: Page-Based (human-friendly)
GET /articles?page=1&per_page=20
# Internally translates to offset=(page-1)*per_page
# Good for numbered pages in admin dashboards; same problem as offset at scale

# Which to use:
# User-facing feeds, infinite scroll → Cursor pagination
# Admin dashboards, small datasets → Offset/Page
# Avoid deep offsets on large tables (> 1M rows): query cost grows with the offset

3 API Versioning

# Three approaches:

# 1. URL versioning (most common — most discoverable)
GET /v1/users/123
GET /v2/users/123  # v2 changed response structure

# Pros: immediately obvious in logs, browser, curl
# Cons: URL "should" represent a resource, not a version — philosophical issue

# 2. Header versioning
GET /users/123
Accept: application/vnd.techoral.v2+json
# or: API-Version: 2

# Pros: clean URLs
# Cons: invisible in browser bar, harder to test, harder to cache
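
On the server, header versioning comes down to parsing the version out of the request headers. A sketch under stated assumptions: `resolve_version` is a hypothetical helper, the fallback order (Accept, then API-Version, then a default) is a choice made for this example, and `headers` is a plain dict with exact-case keys (real frameworks usually offer case-insensitive lookups).

```python
import re

# Matches the vendor media type used in the example above.
_VENDOR_TYPE = re.compile(r"application/vnd\.techoral\.v(\d+)\+json")


def resolve_version(headers, default=1):
    """Pick the API version from the Accept media type, then API-Version, else default."""
    match = _VENDOR_TYPE.search(headers.get("Accept", ""))
    if match:
        return int(match.group(1))
    explicit = headers.get("API-Version")
    if explicit is not None and explicit.isdigit():
        return int(explicit)
    return default


print(resolve_version({"Accept": "application/vnd.techoral.v2+json"}))  # 2
print(resolve_version({"API-Version": "3"}))                            # 3
print(resolve_version({"Accept": "application/json"}))                  # 1
```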

# 3. Query param versioning
GET /users/123?version=2

# Cons: pollutes query params, inconsistent with resource model

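# ✓ Use URL versioning: pragmatic and widely adopted (Stripe uses /v1/, Twilio a dated path such as /2010-04-01/; GitHub's REST API instead selects the version with the X-GitHub-Api-Version header)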

# How to version gracefully:
# Rule 1: Never break v1 when releasing v2 — run both in parallel
# Rule 2: Additive changes DON'T need a new version:
#   - Add a new optional field to response ✓
#   - Add a new optional query param ✓
#   - Add a new endpoint ✓
# Rule 3: Breaking changes REQUIRE a new version:
#   - Remove a field from response ✗
#   - Rename a field ✗
#   - Change a field's type (string → int) ✗
#   - Change behavior of existing endpoint ✗
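
Rules 2 and 3 can be turned into a rough automated check. The sketch below compares two response schemas expressed as `field name -> type name` dicts; `classify_change` is a hypothetical helper, and it only sees shape changes, not behaviour changes. A rename shows up as a removal plus an addition, so it is reported as breaking.

```python
def classify_change(old_schema, new_schema):
    """Return ("breaking" | "additive" | "none", reasons) for a response schema change."""
    reasons = []
    for field, old_type in old_schema.items():
        if field not in new_schema:
            reasons.append(f"removed field: {field}")
        elif new_schema[field] != old_type:
            reasons.append(f"type changed: {field} {old_type} -> {new_schema[field]}")
    if reasons:
        return "breaking", reasons
    added = [field for field in new_schema if field not in old_schema]
    if added:
        return "additive", [f"added field: {field}" for field in added]
    return "none", []


v1 = {"id": "string", "name": "string"}

print(classify_change(v1, {"id": "string", "name": "string", "avatar_url": "string"}))
# ('additive', ['added field: avatar_url'])
print(classify_change(v1, {"id": "int", "name": "string"}))
# ('breaking', ['type changed: id string -> int'])
```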

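# Stripe's approach: each account is pinned to the API version it first integrated against
# The pinned version is an account setting; a single request can override it with the Stripe-Version header
# Older versions keep working for the accounts pinned to them
# Breaking changes ship only in new, dated versions, so existing integrations are not affected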

4 GraphQL

The Core Problem GraphQL Solves

# REST problem: over-fetching and under-fetching
# Over-fetching: GET /users/123 returns 40 fields but mobile only needs name + avatar
# Under-fetching: mobile dashboard needs user + last 5 orders + unread notifications
#                 → 3 REST calls = 3 round trips

# GraphQL solution: client specifies exactly what it needs
query GetDashboard {
  user(id: "123") {           # ← client picks fields
    name
    avatarUrl
    recentOrders(last: 5) {
      id
      status
      total
    }
    unreadNotificationCount    # ← count only, not the full notifications
  }
}

# One request → one response → exactly the data needed
# No over-fetching (mobile doesn't get desktop fields)
# No under-fetching (one query traverses multiple types)
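
The response-shaping idea can be sketched in a few lines. This is not how a GraphQL server is implemented (real servers run per-field resolvers against a typed schema); `select` is a hypothetical helper that only shows how a client-supplied selection trims the data.

```python
def select(data, selection):
    """Project `data` onto `selection`.

    `selection` maps a field name to None (leaf field) or to a nested selection.
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        result[field] = value if sub_selection is None else select(value, sub_selection)
    return result


user = {
    "name": "Ada",
    "avatarUrl": "https://example.com/ada.png",
    "email": "ada@example.com",
    "recentOrders": [
        {"id": 1, "status": "shipped", "total": 40, "items": ["book"]},
        {"id": 2, "status": "pending", "total": 15, "items": ["pen"]},
    ],
}

dashboard = select(user, {
    "name": None,
    "avatarUrl": None,
    "recentOrders": {"id": None, "status": None, "total": None},
})
print(dashboard)  # no email, no order items: only the selected fields
```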

Mutations and Subscriptions

# Mutations (writes):
mutation CreateOrder($input: CreateOrderInput!) {
  createOrder(input: $input) {
    id
    status
    estimatedDelivery
  }
}

# Subscriptions (real-time):
subscription OrderStatusUpdates($orderId: ID!) {
  orderStatusChanged(orderId: $orderId) {
    status
    updatedAt
  }
}
# → WebSocket maintained, server pushes updates

GraphQL Pitfalls:

N+1 problem: query for 100 users each fetching their orders → 101 DB queries. Solve with DataLoader (batching).
Caching is hard: POST requests don't cache at CDN/browser level (all queries go to /graphql endpoint). Need persisted queries or GET-based queries for caching.
Security: deeply nested queries can be DoS vectors. Enforce query depth limits and complexity limits.
Schema versioning: no formal versioning — deprecate fields instead of removing them.
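
The N+1 pitfall and its batching fix can be illustrated with a fake database that counts queries. This is a sketch of the batching idea only: `FakeDb` and the resolver functions are hypothetical, and a real DataLoader additionally collects keys requested during one execution tick and caches them per request.

```python
class FakeDb:
    """Counts queries so the two resolver strategies can be compared."""

    def __init__(self, orders_by_user):
        self.orders_by_user = orders_by_user
        self.query_count = 0

    def orders_for_user(self, user_id):
        self.query_count += 1  # SELECT ... WHERE user_id = ?
        return self.orders_by_user.get(user_id, [])

    def orders_for_users(self, user_ids):
        self.query_count += 1  # SELECT ... WHERE user_id IN (...)
        return {uid: self.orders_by_user.get(uid, []) for uid in user_ids}


def resolve_naive(db, user_ids):
    """One query per user: the N in N+1."""
    return {uid: db.orders_for_user(uid) for uid in user_ids}


def resolve_batched(db, user_ids):
    """Collect the keys first, then fetch them in a single query."""
    unique_ids = list(dict.fromkeys(user_ids))
    return db.orders_for_users(unique_ids)


user_ids = list(range(1, 101))
data = {uid: [f"order-{uid}"] for uid in user_ids}

naive_db, batched_db = FakeDb(data), FakeDb(data)
naive = resolve_naive(naive_db, user_ids)
batched = resolve_batched(batched_db, user_ids)

# Add 1 to each for the initial "list 100 users" query.
print(naive_db.query_count + 1, batched_db.query_count + 1)  # 101 2
```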

Use GraphQL when:

Multiple clients (web, iOS, Android, TV) need different field sets from the same data.
You are building an internal BFF (Backend For Frontend).
The product iterates rapidly and the API shape changes frequently.

GitHub API v4, the Shopify Storefront API, and Meta use GraphQL.

5 gRPC

Why gRPC Exists

# REST/JSON trade-offs:
# - JSON is human-readable but verbose: {"user_id":12345} = 17 bytes even without whitespace
# - HTTP/1.1: one in-flight request per connection at a time → head-of-line blocking
# - Loose typing: client and server can drift silently

# gRPC solves these by:
# - Protocol Buffers (Protobuf): binary serialization → often several times smaller than JSON (depends on the data)
# - HTTP/2: multiplexed streams and header compression over a single connection
# - Strong contract: .proto schema shared between client and server
# - Code generation: stubs auto-generated for Java, Go, Python, Node, etc.
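
The size difference can be worked out by hand for the `user_id` example. The sketch below encodes a single integer field using the Protobuf wire format rules (a one-byte tag followed by a base-128 varint). The helper functions are written for this illustration, handle only non-negative integers, and are not part of any Protobuf library.

```python
import json


def encode_varint(value):
    """Encode a non-negative int as a base-128 varint (7 bits per byte, low bits first)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number, value):
    """Encode one integer field: tag = (field_number << 3) | wire type 0 (varint)."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


proto_bytes = encode_int_field(1, 12345)  # GetUserRequest { user_id = 12345 }
json_bytes = json.dumps({"user_id": 12345}, separators=(",", ":")).encode()

print(proto_bytes.hex(), len(proto_bytes))  # 08b960 3
print(len(json_bytes))                      # 17
```

The ratio here (3 bytes vs 17) is favourable because the message is a single small integer and the field name is never sent. Messages dominated by long strings shrink far less.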

# Define service in .proto:
syntax = "proto3";

service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User);      // server streaming
  rpc CreateUser (stream CreateUserRequest) returns (User);   // client streaming
  rpc Chat (stream ChatMessage) returns (stream ChatMessage); // bidirectional
}

message GetUserRequest {
  int64 user_id = 1;
}

message User {
  int64 id = 1;
  string name = 2;
  string email = 3;
  int64 created_at = 4;
}

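# Python client using the stubs generated by protoc
# (module names assume the schema file is user.proto, compiled with grpcio-tools):
import grpc
from user_pb2 import GetUserRequest
from user_pb2_grpc import UserServiceStub

channel = grpc.insecure_channel('user-service:50051')  # plaintext; use grpc.secure_channel for TLS
stub = UserServiceStub(channel)
user = stub.GetUser(GetUserRequest(user_id=123))
print(user.name)  # typed! IDE knows this is a User with .name, .email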

gRPC vs REST Performance

# Approximate, illustrative figures (vary by payload and network):
# JSON/REST  → user list (~100KB as JSON): 1.2ms serialization, 450ms transmission
# gRPC/Proto → same user list as Protobuf: 0.3ms serialization, 80ms transmission
# Difference: ~4× faster serialization, ~5-6× less transmission time

# gRPC streaming modes:
# Unary (standard):   client sends request → server sends one response
# Server streaming:   client sends request → server streams multiple responses (progress, live data)
# Client streaming:   client streams data → server sends one response (file upload, bulk insert)
# Bidirectional:      both stream simultaneously (chat, gaming, live collaboration)
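
The four modes differ only in whether each side sends one message or a stream. The sketch below models them with plain Python functions and generators, no gRPC involved. The shapes loosely mirror how Python servicer methods are written (a streaming response is a generator, a streaming request arrives as an iterator), but the dict messages and function names are stand-ins for generated types.

```python
def get_user(request):
    """Unary: one request in, one response out."""
    return {"id": request["user_id"], "name": "Ada"}


def list_users(request):
    """Server streaming: one request in, a stream of responses out."""
    for user_id in range(1, request["limit"] + 1):
        yield {"id": user_id}


def create_users(request_iterator):
    """Client streaming: a stream of requests in, one summary response out."""
    created = 0
    for _ in request_iterator:
        created += 1
    return {"created": created}


def chat(request_iterator):
    """Bidirectional: answer each incoming message as it arrives."""
    for message in request_iterator:
        yield {"echo": message["text"]}


print(get_user({"user_id": 123}))
print(list(list_users({"limit": 3})))
print(create_users(iter([{"name": "a"}, {"name": "b"}])))
print(list(chat(iter([{"text": "hi"}, {"text": "bye"}]))))
```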

# When to use gRPC:
✓ Internal microservice communication (never expose gRPC directly to browsers*)
✓ High-throughput APIs where bandwidth matters (ML inference, streaming pipelines)
✓ Bidirectional streaming (chat, real-time gaming)
✓ Polyglot systems — generated stubs ensure consistency across Java/Go/Python services
* gRPC-Web exists but requires a proxy; most teams use REST or GraphQL for browsers

6 Full Comparison & Decision Guide

Dimension	REST	GraphQL	gRPC
Protocol	HTTP/1.1 or HTTP/2	HTTP/1.1 or HTTP/2	HTTP/2 only
Payload format	JSON (text)	JSON (text)	Protobuf (binary)
Schema / typing	Optional (OpenAPI)	Required (GraphQL schema)	Required (.proto)
Client fetching	Fixed response shape	Client-specified fields	Fixed (generated)
Browser support	Native	Native	Needs gRPC-Web proxy
Streaming	SSE (one-way) / WebSocket	Subscriptions (WebSocket)	Native 4 modes
Caching	Easy (HTTP cache, CDN)	Complex (POST by default)	No HTTP caching
Versioning	URL/header versioning	Schema deprecation	.proto evolution rules
Performance	Moderate	Moderate	High (binary, HTTP/2)
Learning curve	Low	Medium	High (protoc toolchain)
Best for	Public APIs, CRUD	Multi-client apps, BFFs	Internal services, ML

Architecture Decision in Interviews

# Pattern: use multiple API styles in one system (common at large scale)

External (browser/mobile) ← REST JSON (easy to consume, CDN-cacheable)
                          ← GraphQL (if multiple clients with different needs)
      ↓
API Gateway
      ↓
Internal services ← gRPC (low latency, strong typing, streaming support)
      ↓
Analytics/reporting ← REST (simple, ad-hoc queries from BI tools)

# Commonly cited examples (simplified; internal architectures are only partly public and change over time):
# Uber: REST for driver/rider apps, gRPC between internal services (Dispatch, Maps)
# Netflix: REST for client apps, gRPC for internal streaming microservices
# GitHub: REST v3 (public), GraphQL v4 (rich client querying), gRPC internally
# Shopify: REST Admin API, GraphQL Storefront API (per-client flexibility)

7 REST API Design Checklist

# ✓ RESOURCES
# □ Plural nouns: /users, /orders (not /user, /getOrders)
# □ Hierarchical: /users/123/orders/456 (not /getUserOrder?userId=123&orderId=456)
# □ Lowercase, hyphenated: /blog-posts (not /BlogPosts or /blogPosts)

# ✓ HTTP METHODS
# □ GET for reads, POST for create, PUT/PATCH for update, DELETE for delete
# □ No verbs in URLs — use sub-resources for actions (/users/123/activate)

# ✓ STATUS CODES
# □ 201 Created with Location header after POST
# □ 204 No Content for successful DELETE
# □ 400 for validation errors, or 422 when the payload is well-formed but breaks a business rule (with field-level error details)
# □ 401 for missing auth, 403 for insufficient permissions
# □ 429 for rate limiting with Retry-After header

# ✓ ERRORS
# □ Consistent error format across all endpoints:
{
  "error": {
    "code": "VALIDATION_FAILED",
    "message": "Request validation failed",
    "details": [
      {"field": "email", "message": "Must be a valid email address"},
      {"field": "age", "message": "Must be 18 or older"}
    ],
    "request_id": "req_8f2e9a1b"  # ← for support/debugging
  }
}
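
A sketch of producing that envelope from a validation step. `validate_signup` and `error_response` are hypothetical helpers, the email pattern is deliberately loose, and the default status of 400 is a choice for this example (422 is equally defensible for rule violations).

```python
import re
import uuid


def validate_signup(payload):
    """Collect every field-level problem instead of stopping at the first one."""
    details = []
    email = str(payload.get("email", ""))
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        details.append({"field": "email", "message": "Must be a valid email address"})
    age = payload.get("age")
    if not isinstance(age, int) or age < 18:
        details.append({"field": "age", "message": "Must be 18 or older"})
    return details


def error_response(details, status=400, request_id=None):
    """Build (status, body) in the consistent error format shown above."""
    body = {
        "error": {
            "code": "VALIDATION_FAILED",
            "message": "Request validation failed",
            "details": details,
            "request_id": request_id or f"req_{uuid.uuid4().hex[:8]}",
        }
    }
    return status, body


problems = validate_signup({"email": "not-an-email", "age": 16})
status, body = error_response(problems, request_id="req_8f2e9a1b")
print(status, [d["field"] for d in body["error"]["details"]])  # 400 ['email', 'age']
```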

# ✓ PAGINATION
# □ Use cursor-based for feeds/large datasets
# □ Always include has_next (never force client to check empty array)

# ✓ SECURITY
# □ HTTPS only — no HTTP
# □ Auth in headers (Authorization: Bearer <token>), never in URL
# □ Validate all inputs at API layer (don't trust client)
# □ Rate limit all endpoints (especially auth endpoints)

# ✓ DOCUMENTATION
# □ OpenAPI spec (generates Swagger UI, client SDKs)
# □ Include example requests and responses
# □ Document all error codes

What to Study Next

Rate Limiter Design — implementing the 429 rate limiting mentioned above
Microservices vs Monolith — API design in the context of service architecture
Load Balancing Algorithms — routing API requests across service instances
Payment System Design — idempotency keys applied to real payment APIs