AWS DynamoDB: NoSQL Database Design and Best Practices (2026)

DynamoDB is AWS's fully managed, serverless NoSQL database. It scales to millions of requests per second with single-digit millisecond latency — but only if your data model is right. Unlike relational databases, DynamoDB forces you to think about your access patterns upfront. Get the key design right and DynamoDB is phenomenally fast and cheap. Get it wrong and you'll fight it every step of the way. This guide covers everything from key design fundamentals to advanced patterns like single-table design and DAX caching.

Partition Key and Sort Key Design

Every DynamoDB item is identified by a primary key. The primary key is either a single partition key (PK) or a composite key of partition key + sort key (SK). DynamoDB hashes the partition key to determine which storage partition holds the item. Items with the same PK but different SKs are stored together and can be range-queried efficiently.

The cardinal rule: your partition key must have high cardinality. A single partition supports roughly 3,000 RCUs and 1,000 WCUs per second, so a hot partition (one key receiving most of the traffic) is throttled at those limits regardless of your table's total capacity.

| Bad PK Choice | Why It's Bad | Better Choice |
|---|---|---|
| status (e.g., "active") | Low cardinality, hot partition | userId |
| date (e.g., "2026-06-05") | All writes on the same day go to one partition | userId#date composite |
| country | Uneven distribution (US gets 40% of traffic) | Add a random suffix: US#3 |
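
The suffix idea in the last row can be sketched as follows. This is a minimal illustration, not an AWS SDK feature: the helper names `sharded_key` and `all_shard_keys` and the shard count of 10 are assumptions made for the example. It also swaps the random suffix for a hash of the item ID, so that a single item can be found again without querying every shard.

```python
import hashlib

SHARD_COUNT = 10  # assumption: tune to the write volume of the hottest key


def sharded_key(base_key: str, item_id: str, shards: int = SHARD_COUNT) -> str:
    """Return base_key plus a deterministic shard suffix derived from item_id."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    return f"{base_key}#{int(digest, 16) % shards}"


def all_shard_keys(base_key: str, shards: int = SHARD_COUNT) -> list[str]:
    """Every partition key a reader must query to see all items for base_key."""
    return [f"{base_key}#{n}" for n in range(shards)]
```

The trade-off is on the read side: writes spread across the shards, but a reader that wants everything for "US" has to query each key returned by `all_shard_keys("US")` and merge the results.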

Creating a table with a composite key using the AWS SDK (Python):

import boto3

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')

table = dynamodb.create_table(
    TableName='Orders',
    KeySchema=[
        {'AttributeName': 'customerId', 'KeyType': 'HASH'},   # Partition key
        {'AttributeName': 'orderId',    'KeyType': 'RANGE'},  # Sort key
    ],
    AttributeDefinitions=[
        {'AttributeName': 'customerId', 'AttributeType': 'S'},
        {'AttributeName': 'orderId',    'AttributeType': 'S'},
    ],
    BillingMode='PAY_PER_REQUEST',
)
table.wait_until_exists()
table.reload()  # refresh cached attributes; the create response still reports CREATING
print("Table created:", table.table_status)

Single-Table Design Pattern

Single-table design stores multiple entity types in one table. Items use generic PK/SK attribute names (often literally PK and SK) with a naming convention that encodes the entity type and ID. This eliminates cross-table joins and enables fetching related entities in a single query.

Example: an e-commerce app stores Customers, Orders, and OrderItems in one table:

# Customer item
PK = "CUSTOMER#cust-123"
SK = "METADATA"
email = "alice@example.com"
name = "Alice"

# Order item
PK = "CUSTOMER#cust-123"
SK = "ORDER#2026-06-05#ord-456"
total = 99.99
status = "shipped"

# OrderItem
PK = "ORDER#ord-456"
SK = "ITEM#sku-789"
quantity = 2
price = 49.99

Now you can fetch all orders for a customer with a single Query:

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('EcommerceApp')

# Get all orders for customer (sorted by date because SK starts with ORDER#date)
response = table.query(
    KeyConditionExpression=Key('PK').eq('CUSTOMER#cust-123') &
                           Key('SK').begins_with('ORDER#')
)
orders = response['Items']
Tip: Add a GSI1PK / GSI1SK attribute to items when you need a different access pattern. This "overloaded GSI" approach is the standard way to support multiple query dimensions without multiple tables.
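
One way to keep these key conventions consistent is to build items through small helper functions. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the GSI1PK / GSI1SK values are chosen for one example access pattern (looking up an order by its ID without knowing the customer).

```python
from decimal import Decimal


def customer_item(customer_id: str, email: str, name: str) -> dict:
    """Build a Customer item using the PK/SK convention shown above."""
    return {
        "PK": f"CUSTOMER#{customer_id}",
        "SK": "METADATA",
        "email": email,
        "name": name,
    }


def order_item(customer_id: str, order_id: str, order_date: str,
               status: str, total: Decimal) -> dict:
    """Build an Order item that lives in the customer's partition."""
    return {
        "PK": f"CUSTOMER#{customer_id}",
        "SK": f"ORDER#{order_date}#{order_id}",
        # Hypothetical overloaded-GSI attributes: find an order by ID alone
        "GSI1PK": f"ORDER#{order_id}",
        "GSI1SK": f"CUSTOMER#{customer_id}",
        "status": status,
        "total": total,  # boto3's resource API expects Decimal, not float
    }
```

Items built this way can be passed to `table.put_item(Item=...)`. Keeping the string formats in one place makes it harder for a typo in a prefix to silently create an unreachable item.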

GSI and LSI

A Local Secondary Index (LSI) uses the same partition key as the base table but a different sort key. It must be defined at table creation and shares the base table's capacity. Use it when you need to sort or filter the same partition by a different attribute.

A Global Secondary Index (GSI) can use any attributes as PK and SK, spans all partitions, and has its own capacity. It can be created or deleted after table creation.

# Add a GSI to query orders by status (e.g., find all "pending" orders globally)
aws dynamodb update-table \
  --table-name Orders \
  --attribute-definitions \
    AttributeName=status,AttributeType=S \
    AttributeName=createdAt,AttributeType=S \
  --global-secondary-index-updates '[{
    "Create": {
      "IndexName": "StatusIndex",
      "KeySchema": [
        {"AttributeName": "status", "KeyType": "HASH"},
        {"AttributeName": "createdAt", "KeyType": "RANGE"}
      ],
      "Projection": {"ProjectionType": "ALL"}
    }
  }]'
GSI limitations: GSIs are eventually consistent — there's a brief replication lag from the base table. If you need strongly consistent reads, query the base table. Also, GSIs with low-cardinality partition keys suffer the same hot-partition problem as base tables.
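
Querying the index looks much like querying the base table, with an IndexName added. The sketch below only builds the request arguments for the low-level client's `query` call; the function name is hypothetical, and it assumes `createdAt` holds ISO-8601 strings so that string comparison matches chronological order.

```python
def status_index_query(status: str, created_after: str) -> dict:
    """Build keyword arguments for a low-level client.query() call on StatusIndex."""
    return {
        "TableName": "Orders",
        "IndexName": "StatusIndex",
        # "status" is a DynamoDB reserved word, so it is aliased as #s
        "KeyConditionExpression": "#s = :s AND createdAt >= :since",
        "ExpressionAttributeNames": {"#s": "status"},
        "ExpressionAttributeValues": {
            ":s": {"S": status},
            ":since": {"S": created_after},
        },
        # GSIs only support eventually consistent reads
        "ConsistentRead": False,
    }
```

A call would then look like `boto3.client("dynamodb").query(**status_index_query("pending", "2026-06-01"))`. Because `status` has only a handful of values, this particular index is exposed to the hot-partition problem described above.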

Read/Write Capacity: Provisioned vs On-Demand

DynamoDB offers two billing modes:

| Mode | How It Works | Best For | Cost (us-east-1, approximate) |
|---|---|---|---|
| On-Demand | Pay per request, scales automatically | Unpredictable or spiky traffic | ~$0.625 per million WRU, ~$0.125 per million RRU |
| Provisioned + Auto Scaling | Set RCU/WCU targets, auto scaling adjusts within limits | Steady, predictable traffic | ~$0.00065 per WCU-hour, ~$0.00013 per RCU-hour (reserved capacity saves up to ~75%) |

For new applications, start with on-demand. After 4-6 weeks of production data, analyze your CloudWatch metrics and switch to provisioned with auto scaling if traffic is steady — you'll typically save 40-60%.
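
The comparison behind that advice is simple arithmetic, sketched below. The function names are hypothetical, prices are passed in as parameters rather than hard-coded, and the model assumes every request consumes exactly one unit (small items) and a 730-hour month. The numbers in the usage note are placeholders, not current AWS prices.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def on_demand_monthly(writes: int, reads: int,
                      price_per_million_wru: float,
                      price_per_million_rru: float) -> float:
    """Monthly on-demand cost, assuming one request unit per request."""
    return (writes / 1_000_000 * price_per_million_wru
            + reads / 1_000_000 * price_per_million_rru)


def provisioned_monthly(wcu: int, rcu: int,
                        price_per_wcu_hour: float,
                        price_per_rcu_hour: float) -> float:
    """Monthly provisioned cost for capacity held constant all month."""
    return HOURS_PER_MONTH * (wcu * price_per_wcu_hour + rcu * price_per_rcu_hour)
```

For example, with placeholder prices, `on_demand_monthly(100_000_000, 500_000_000, 1.00, 0.20)` and `provisioned_monthly(50, 200, 0.0005, 0.0001)` can be compared directly. Provisioned capacity has to cover the peak, not the average, so the real saving depends on how flat the traffic is.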

DynamoDB Streams

DynamoDB Streams captures a time-ordered sequence of item-level changes (INSERT, MODIFY, REMOVE) in a table. Records are available for 24 hours. The primary use case is triggering Lambda functions for event-driven architectures.

# Enable streams on an existing table
aws dynamodb update-table \
  --table-name Orders \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES

# Lambda trigger (Python handler)
def handler(event, context):
    for record in event['Records']:
        if record['eventName'] == 'INSERT':
            new_item = record['dynamodb']['NewImage']
            order_id = new_item['orderId']['S']
            print(f"New order: {order_id}")
            # Send confirmation email, update analytics, etc.
        elif record['eventName'] == 'MODIFY':
            old_status = record['dynamodb']['OldImage']['status']['S']
            new_status = record['dynamodb']['NewImage']['status']['S']
            if old_status != new_status:
                print(f"Status changed: {old_status} -> {new_status}")

StreamViewType options: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES. Use NEW_AND_OLD_IMAGES when you need to detect what changed.
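
With NEW_AND_OLD_IMAGES, detecting what changed comes down to comparing the two images. The helper below is a small sketch with a hypothetical name; it works on the stream record as delivered to Lambda and leaves values in DynamoDB's typed format.

```python
def changed_attributes(record: dict) -> dict:
    """Map attribute name -> (old, new) for a stream record.

    Values stay in DynamoDB's typed format, e.g. {"S": "shipped"}.
    Attributes missing on one side are reported as None.
    """
    images = record.get("dynamodb", {})
    old = images.get("OldImage", {})
    new = images.get("NewImage", {})
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }
```

Using `.get` throughout means the function also tolerates INSERT and REMOVE records, where one of the images is absent.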

TTL (Time to Live)

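TTL automatically deletes items once the current time passes the Unix epoch timestamp (in seconds) stored in a designated attribute. Deletion is free (it doesn't consume WCUs) but not immediate: it typically happens within a few days of expiry, so expired items can still appear in reads until then. Filter on the TTL attribute in your queries if that matters.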

import time
import boto3

table = boto3.resource('dynamodb', region_name='us-east-1').Table('Sessions')

# Store a session that expires in 24 hours
table.put_item(Item={
    'sessionId': 'sess-abc123',
    'userId': 'user-456',
    'data': {'cart': ['item1', 'item2']},
    'expiresAt': int(time.time()) + 86400  # 24 hours from now
})

# Enable TTL on the table (one-time setup)
aws dynamodb update-time-to-live \
  --table-name Sessions \
  --time-to-live-specification "Enabled=true,AttributeName=expiresAt"
Tip: TTL deletions appear in DynamoDB Streams with a userIdentity.type of Service and principalId of dynamodb.amazonaws.com. Filter them out in your stream processors to avoid re-processing expired items.
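
A stream processor can apply that filter with a small predicate. This is a sketch with a hypothetical function name, based on the userIdentity fields described in the tip.

```python
def is_ttl_delete(record: dict) -> bool:
    """True when a stream record is a REMOVE performed by the TTL service."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )
```

In a handler, `records = [r for r in event["Records"] if not is_ttl_delete(r)]` drops TTL expirations before any business logic runs.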

Transactions

DynamoDB transactions let you perform multiple reads or writes atomically across up to 100 items in one or more tables. They use two-phase commit under the hood and cost 2x the normal RCU/WCU.

import boto3

dynamodb = boto3.client('dynamodb', region_name='us-east-1')

# Transfer credits between two accounts atomically
dynamodb.transact_write_items(
    TransactItems=[
        {
            'Update': {
                'TableName': 'Accounts',
                'Key': {'accountId': {'S': 'acc-A'}},
                'UpdateExpression': 'SET balance = balance - :amount',
                'ConditionExpression': 'balance >= :amount',
                'ExpressionAttributeValues': {':amount': {'N': '100'}}
            }
        },
        {
            'Update': {
                'TableName': 'Accounts',
                'Key': {'accountId': {'S': 'acc-B'}},
                'UpdateExpression': 'SET balance = balance + :amount',
                'ExpressionAttributeValues': {':amount': {'N': '100'}}
            }
        }
    ]
)

If the ConditionExpression on account A fails (insufficient balance), the entire transaction is rolled back. This is exactly the ACID guarantee you need for financial operations.
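
When a transaction is cancelled, the caller usually wants to know which action failed. The sketch below assumes the error response carries a CancellationReasons list in the same order as TransactItems, with a Code of "None" for actions that did not fail; the function name is hypothetical, and the input is the response dictionary attached to the exception raised by boto3.

```python
def failed_actions(error_response: dict) -> list[tuple[int, str]]:
    """Return (index, code) for each transaction action that caused cancellation."""
    reasons = error_response.get("CancellationReasons", [])
    return [
        (index, reason.get("Code", "None"))
        for index, reason in enumerate(reasons)
        if reason.get("Code", "None") != "None"
    ]
```

For the transfer above, a result of `[(0, "ConditionalCheckFailed")]` would indicate that the balance check on the first account failed, which an application can report as insufficient funds rather than as a generic error.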

DAX for Caching

DynamoDB Accelerator (DAX) is an in-memory cache for DynamoDB with microsecond read latency. It's API-compatible — you replace the DynamoDB client with the DAX client and the code is otherwise identical.

import amazondax
import boto3

# Replace boto3 DynamoDB client with DAX client
dax = amazondax.AmazonDaxClient(
    endpoints=['my-dax-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111'],
    region_name='us-east-1'
)

# Usage is identical to boto3
response = dax.get_item(
    TableName='Products',
    Key={'productId': {'S': 'prod-123'}}
)
# First call hits DynamoDB, subsequent calls hit DAX cache (microseconds)

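DAX is ideal for read-heavy workloads where the same items are read repeatedly. It maintains two separate caches: the item cache holds GetItem and BatchGetItem results, and the query cache holds Query and Scan results keyed by their request parameters. The two are independent, so writing an item through DAX updates the item cache but does not invalidate cached query results; those remain until their TTL expires. Only eventually consistent reads are served from the cache; strongly consistent reads pass through to DynamoDB.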
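
The behaviour in the comment above (first read goes to the database, later reads are served from memory until the entry expires) is the classic read-through cache. The class below is a conceptual model only, not DAX's implementation or API: the class name, the loader callback, and the injectable clock are all assumptions made so the idea can be run locally. The 300-second default mirrors DAX's five-minute default item TTL.

```python
import time
from typing import Any, Callable


class ReadThroughCache:
    """Toy read-through cache with a per-entry time to live."""

    def __init__(self, loader: Callable[[str], Any],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # called on a miss, e.g. a DynamoDB GetItem
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # hit: no call to the backing store
        value = self._loader(key)      # miss or expired: read through
        self._entries[key] = (now + self._ttl, value)
        return value
```

The model also shows the main cost of caching: for up to one TTL period, a reader can see a value that has since changed in the table.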

PartiQL

PartiQL is a SQL-compatible query language for DynamoDB. It lets you use familiar SELECT/INSERT/UPDATE/DELETE syntax without learning the DynamoDB expression language. However, it still performs the same underlying operations — a SELECT without a WHERE on the partition key will still do a full table scan.

# Query via PartiQL in AWS CLI
aws dynamodb execute-statement \
  --statement "SELECT * FROM Orders WHERE customerId = 'cust-123' AND begins_with(orderId, 'ORD-2026')"

# Batch statement for multiple operations
aws dynamodb batch-execute-statement \
  --statements '[
    {"Statement": "UPDATE Orders SET status = '\''shipped'\'' WHERE customerId = '\''cust-123'\'' AND orderId = '\''ORD-001'\''"},
    {"Statement": "UPDATE Orders SET status = '\''shipped'\'' WHERE customerId = '\''cust-456'\'' AND orderId = '\''ORD-002'\''"}
  ]'
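
When the values come from user input, it is safer to pass them as parameters than to splice them into the statement text. The sketch below builds the arguments for the low-level client's `execute_statement` call using `?` placeholders; the function name is hypothetical.

```python
def orders_for_customer_statement(customer_id: str, order_prefix: str) -> dict:
    """Build keyword arguments for client.execute_statement() with bound parameters."""
    return {
        "Statement": (
            'SELECT * FROM "Orders" '
            "WHERE customerId = ? AND begins_with(orderId, ?)"
        ),
        # Parameters are bound to the ? placeholders in order
        "Parameters": [{"S": customer_id}, {"S": order_prefix}],
    }
```

A call would look like `boto3.client("dynamodb").execute_statement(**orders_for_customer_statement("cust-123", "ORD-2026"))`. This also avoids the nested shell quoting seen in the batch example.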

Backup and Point-in-Time Recovery

Enable Point-in-Time Recovery (PITR) on all production tables. It continuously backs up your table and lets you restore to any second within the last 35 days. For long-term retention, take on-demand backups: they are full snapshots that are kept until you delete them.

# Enable PITR
aws dynamodb update-continuous-backups \
  --table-name Orders \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true

# Restore to a specific time (creates a new table)
aws dynamodb restore-table-to-point-in-time \
  --source-table-name Orders \
  --target-table-name Orders-Restored-2026-06-05 \
  --restore-date-time 2026-06-05T12:00:00Z

Frequently Asked Questions

Q: When should I NOT use DynamoDB?

Avoid DynamoDB when: you have complex relational queries with multiple joins, your access patterns are unknown at design time, you need ad-hoc analytics (use Athena + S3 instead), or your team is more comfortable with SQL. DynamoDB shines for high-scale operational workloads with well-defined access patterns.

Q: What's the item size limit in DynamoDB?

Each item, counting both attribute names and attribute values, cannot exceed 400 KB. For larger objects, store the payload in S3 and keep a reference (the S3 bucket and key) in DynamoDB.
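
The S3 pointer approach can be sketched as a small planning function. Everything named here is an assumption for illustration: the function name, the attribute names, the `documents/` key prefix, and the 350 KB inline threshold, which simply leaves headroom below the 400 KB limit for the other attributes.

```python
from typing import Optional

MAX_INLINE_BYTES = 350 * 1024  # assumption: headroom below the 400 KB item limit


def plan_storage(doc_id: str, payload: bytes,
                 bucket: str) -> tuple[dict, Optional[dict]]:
    """Return (item, s3_upload); s3_upload is None when the payload fits inline."""
    if len(payload) <= MAX_INLINE_BYTES:
        return {"docId": doc_id, "body": payload}, None
    s3_key = f"documents/{doc_id}"
    item = {
        "docId": doc_id,
        "s3Bucket": bucket,
        "s3Key": s3_key,
        "sizeBytes": len(payload),
    }
    # Keyword arguments for an S3 put_object call
    return item, {"Bucket": bucket, "Key": s3_key, "Body": payload}
```

Uploading to S3 before writing the DynamoDB item keeps a failed upload from leaving a dangling reference in the table.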

Q: How do I model many-to-many relationships?

Use an adjacency list pattern: store both entities and their relationships in the same table. For example, for Users and Groups, store User items (PK=USER#id, SK=METADATA), Group items (PK=GROUP#id, SK=METADATA), and one membership item per pair (PK=USER#id, SK=GROUP#id). Querying the base table by PK=USER#id lists a user's groups. For the reverse lookup, either add a GSI that swaps the keys (index PK = SK, index SK = PK) or write a mirrored item (PK=GROUP#id, SK=USER#id).
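
The two lookup directions can be modelled in memory to check the key design before creating a table. The sketch below assumes one membership item per user/group pair and an inverted index that swaps PK and SK; the function names are hypothetical, and the list comprehensions stand in for DynamoDB Query calls.

```python
def membership_item(user_id: str, group_id: str) -> dict:
    """One item per (user, group) pair."""
    return {"PK": f"USER#{user_id}", "SK": f"GROUP#{group_id}"}


def groups_of(items: list[dict], user_id: str) -> list[str]:
    """Base-table query: PK = USER#id and SK begins_with GROUP#."""
    pk = f"USER#{user_id}"
    return sorted(i["SK"] for i in items
                  if i["PK"] == pk and i["SK"].startswith("GROUP#"))


def members_of(items: list[dict], group_id: str) -> list[str]:
    """Inverted-index query: index PK is the base table's SK."""
    sk = f"GROUP#{group_id}"
    return sorted(i["PK"] for i in items
                  if i["SK"] == sk and i["PK"].startswith("USER#"))
```

Both directions are answered from the same membership items, which is what lets a single write keep the two views in step.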

Q: How do I handle DynamoDB throttling?

Use exponential backoff with jitter in your retry logic (the AWS SDKs do this automatically). For provisioned tables, enable auto scaling or switch to on-demand mode. For persistent throttling, analyze whether you have a hot partition key and consider adding a write sharding suffix.
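
For code paths that retry outside the SDK (for example, re-submitting unprocessed items from a batch write), the delay schedule can be sketched as below. This is capped exponential backoff with full jitter; the function name and the default base and cap values are assumptions, and the random source is injectable so the schedule can be tested.

```python
import random
from typing import Callable


def backoff_delays(attempts: int, base: float = 0.05, cap: float = 5.0,
                   rng: Callable[[], float] = random.random) -> list[float]:
    """Sleep durations in seconds, one per retry.

    Each delay is a random fraction of min(cap, base * 2**attempt).
    """
    return [rng() * min(cap, base * 2 ** attempt) for attempt in range(attempts)]
```

A retry loop would call `time.sleep(delay)` for each value in turn and give up once the list is exhausted. The jitter matters because many clients throttled at the same moment would otherwise all retry at the same moment too.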