AWS Redshift: Data Warehouse Architecture and Query Optimization
Amazon Redshift is AWS's fully managed petabyte-scale data warehouse — the engine behind some of the largest analytics workloads on the planet. It combines a massively parallel processing (MPP) architecture, columnar storage, and tight S3 integration to deliver sub-second query performance on hundreds of terabytes of structured data. Whether you are migrating an on-premises Oracle data warehouse, building a new analytics platform from scratch, or extending an existing S3 data lake with SQL query power, Redshift is almost certainly in the conversation. This guide covers everything you need to understand, deploy, and operate a production Redshift cluster in 2026: node types, distribution strategies, sort keys, data loading patterns, WLM tuning, Spectrum, Concurrency Scaling, data sharing, security, and maintenance — all with real SQL and Terraform examples.
Table of Contents
- Redshift vs Athena vs BigQuery vs Snowflake
- Architecture: Leader Node, Compute Nodes, RA3 vs DC2, Serverless
- Table Design: Distribution Styles, Sort Keys, Column Encoding
- Loading Data: COPY from S3, Firehose, Glue
- Query Optimization: EXPLAIN, WLM, Materialized Views, Result Cache
- Redshift Spectrum: Querying the S3 Data Lake
- Concurrency Scaling
- Data Sharing: Live Access Across Clusters and Accounts
- Security: Encryption, VPC, IAM, Column and Row Security
- Maintenance: VACUUM, ANALYZE, System Views, CloudWatch
- Terraform: Provision a Redshift Cluster
- Frequently Asked Questions
Redshift vs Athena vs BigQuery vs Snowflake
Choosing the right analytical database for your workload is a strategic decision. The four dominant options in 2026 — Redshift, Athena, BigQuery, and Snowflake — each have a different strength. Understanding the trade-offs prevents expensive migrations later.
| Dimension | Redshift | Athena | BigQuery | Snowflake |
|---|---|---|---|---|
| Pricing model | Per node-hour (provisioned) or per RPU-second (Serverless) | $5/TB scanned | $6.25/TB scanned (on-demand) or flat-rate slots | Per virtual warehouse credit-hour |
| Query latency | Sub-second on warm data; excellent for BI concurrency | Seconds to minutes; ad hoc only | 1–30 s for most queries | Sub-second to seconds depending on warehouse size |
| Data location | Redshift-managed storage + S3 via Spectrum | S3 only (no local storage) | BigQuery managed storage | Snowflake managed storage |
| Concurrency | Excellent with WLM + Concurrency Scaling | Limited (20 concurrent DML by default) | Excellent (slot-based) | Excellent (separate warehouses) |
| ETL ecosystem | Native COPY, Glue, Firehose, dbt | Glue, Spark, CTAS | Dataflow, BigQuery DTS, dbt | Snowpipe, Streams, Tasks, dbt |
| AWS integration | Native (IAM, S3, KMS, VPC, CloudWatch, Lake Formation) | Native | Requires cross-cloud setup | Available on AWS but not native |
| Best for | High-concurrency BI on structured data, AWS-native shops | Ad hoc S3 exploration, infrequent queries | Google Cloud shops, serverless massive scale | Multi-cloud, separation of storage/compute |
Architecture: Leader Node, Compute Nodes, RA3 vs DC2, Serverless
Redshift's MPP architecture splits work across a leader node and one or more compute nodes. Understanding how these pieces interact is essential for sizing your cluster correctly and troubleshooting slow queries.
Leader Node
The leader node is the entry point for every SQL query. It parses and compiles the query, generates an execution plan, distributes work to the compute nodes, collects partial results, and assembles the final result set for the client. The leader node does not store user data — it only holds cluster metadata, the system catalog, and query state. Because all client connections go through the leader node, it can become a bottleneck if you run many simultaneous small queries without WLM (Workload Management) configuration.
Compute Nodes and Slices
Compute nodes store the actual data in columnar format on local SSDs (DC2) or S3-backed managed storage (RA3). Each compute node is subdivided into slices — independent processing units that work in parallel. A 4-node ra3.4xlarge cluster has 4 nodes × 4 slices = 16 slices. Data is distributed across slices according to the distribution style you choose for each table; this is the single biggest tuning decision in Redshift design.
RA3 vs DC2 Node Types
| Property | RA3 (ra3.xlplus / ra3.4xlarge / ra3.16xlarge) | DC2 (dc2.large / dc2.8xlarge) |
|---|---|---|
| Storage | Redshift Managed Storage (RMS) — S3-backed, decoupled from compute | Local NVMe SSD, fixed to node count |
| Scaling storage | Independent of compute — add TB without adding nodes | Must add nodes to add storage |
| Best for | Datasets over 1 TB, variable query patterns, growth workloads | Sub-1 TB datasets with consistent high-performance needs |
| Spectrum | Supported | Supported |
| Cross-AZ restore | Fast (data in S3) | Slower (EBS snapshot) |
| 2026 recommendation | Default choice for new clusters | Legacy; prefer RA3 for new deployments |
Redshift Serverless
Redshift Serverless eliminates cluster management entirely. You define a namespace (logical grouping for databases, schemas, users) and a workgroup (compute configuration: base RPUs, max RPUs, VPC settings). Redshift Serverless scales RPUs (Redshift Processing Units) automatically in response to query demand and charges per RPU-second — with a minimum of 8 RPUs (base capacity). It is the right choice for variable or intermittent workloads, development environments, and situations where you want zero operations overhead. For sustained, predictable high-concurrency workloads, provisioned RA3 clusters with reserved pricing still offer better cost efficiency.
# Create a Redshift Serverless namespace and workgroup via AWS CLI
aws redshift-serverless create-namespace \
--namespace-name prod-analytics \
--admin-username admin \
--admin-user-password 'Ch@ngeMe2026!' \
--db-name analytics \
--default-iam-role-arn arn:aws:iam::123456789012:role/RedshiftS3Role \
--iam-roles arn:aws:iam::123456789012:role/RedshiftS3Role \
--log-exports '["userlog","connectionlog","useractivitylog"]'
aws redshift-serverless create-workgroup \
--workgroup-name prod-wg \
--namespace-name prod-analytics \
--base-capacity 32 \
--max-capacity 512 \
--subnet-ids subnet-aaa subnet-bbb subnet-ccc \
--security-group-ids sg-xxxxxxxx \
--publicly-accessible false
Table Design: Distribution Styles, Sort Keys, Column Encoding
Table design is where Redshift performance is won or lost. Getting distribution keys and sort keys right at table creation time saves enormous query optimization effort later. Changing them after the fact requires a full table rebuild.
Distribution Styles
Redshift distributes each row of data to a specific slice based on the table's distribution style. The goal is to place rows that will be joined together on the same slice, eliminating expensive data redistribution (network shuffles) at query time.
| Style | How it distributes | Use when |
|---|---|---|
DISTKEY(col) | Hash of the distribution column value | Large fact tables; join partner has same DISTKEY |
ALL | Full copy on every node | Small dimension tables (<5M rows) joined to large facts |
EVEN | Round-robin across slices | No clear join key; no filtering benefit needed |
AUTO | Redshift chooses; changes over time | New tables where you are unsure; let Redshift learn |
Sort Keys
Sort keys control the physical order of rows on disk. When you filter on a sort key column, Redshift can skip entire 1 MB blocks using zone maps (min/max metadata per block) without scanning them — the Redshift equivalent of partition pruning.
- Compound sort key: Most selective column first. Efficient for queries that filter on the leading columns in order. Example:
SORTKEY(event_date, user_id)is excellent forWHERE event_date = '2026-06-01'but less useful forWHERE user_id = 42alone. - Interleaved sort key: All columns equally weighted. Better when queries filter on any single column in the key — but VACUUM and ANALYZE are slower and more expensive. Avoid on very large tables. AWS recommends compound sort keys for most workloads as of 2026.
Column Encoding
Redshift stores data in columnar format and applies compression encoding per column. Use ANALYZE COMPRESSION on a loaded table to get Redshift's encoding recommendations.
-- Example: well-designed orders fact table
CREATE TABLE orders (
order_id BIGINT NOT NULL ENCODE DELTA,
customer_id BIGINT NOT NULL ENCODE LZO,
product_id INTEGER NOT NULL ENCODE LZO,
order_date DATE NOT NULL ENCODE DELTA32K,
amount_usd DECIMAL(12,2) NOT NULL ENCODE BYTEDICT,
status VARCHAR(20) NOT NULL ENCODE BYTEDICT,
region VARCHAR(10) NOT NULL ENCODE BYTEDICT,
created_at TIMESTAMP NOT NULL ENCODE DELTA32K
)
DISTSTYLE KEY
DISTKEY (customer_id) -- distributes on join key
COMPOUND SORTKEY (order_date, customer_id) -- most common filter first
;
-- Small dimension table: replicated to every node for co-located JOINs
CREATE TABLE customers (
customer_id BIGINT NOT NULL ENCODE DELTA,
name VARCHAR(200) NOT NULL ENCODE LZO,
email VARCHAR(200) NOT NULL ENCODE LZO,
country VARCHAR(50) NOT NULL ENCODE BYTEDICT,
tier VARCHAR(20) NOT NULL ENCODE BYTEDICT,
created_date DATE NOT NULL ENCODE DELTA32K
)
DISTSTYLE ALL
SORTKEY (customer_id)
;
-- After loading data, get encoding recommendations
ANALYZE COMPRESSION orders;
ENCODE AUTO (the default for new tables), Redshift automatically selects and adjusts encodings as data is loaded. This is a safe default — it learns from your actual data distribution. Override with explicit encodings only when you have profiling data showing a specific encoding is significantly better.Loading Data: COPY from S3, Firehose, AWS Glue
The COPY command is the fastest and most flexible way to load data into Redshift. It reads in parallel across all compute node slices, supports multiple file formats, and integrates natively with S3, DynamoDB, EMR, and SSH. Never use row-by-row INSERTs for bulk loading — they bypass MPP parallelism and are 10–100x slower.
COPY from S3 — Core Patterns
-- 1. Basic COPY from CSV files in S3
COPY orders
FROM 's3://my-data-lake/raw/orders/2026/06/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftS3Role'
FORMAT AS CSV
IGNOREHEADER 1
DATEFORMAT 'YYYY-MM-DD'
TIMEFORMAT 'auto'
MAXERROR 100 -- tolerate up to 100 parse errors before aborting
COMPUPDATE ON -- analyze compression after load
STATUPDATE ON; -- refresh table statistics after load
-- 2. COPY from Parquet (preferred format — no type inference needed)
COPY orders
FROM 's3://my-data-lake/processed/orders/2026/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftS3Role'
FORMAT AS PARQUET;
-- 3. COPY with a manifest file — precise control over which files to load
-- Create manifest at s3://my-data-lake/manifests/orders_june.json
COPY orders
FROM 's3://my-data-lake/manifests/orders_june.json'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftS3Role'
FORMAT AS PARQUET
MANIFEST;
The manifest file is a JSON document listing the exact S3 objects to load — useful when you want to load a specific set of files rather than an entire prefix:
{
"entries": [
{ "url": "s3://my-data-lake/processed/orders/2026/06/part-000.parquet", "mandatory": true },
{ "url": "s3://my-data-lake/processed/orders/2026/06/part-001.parquet", "mandatory": true },
{ "url": "s3://my-data-lake/processed/orders/2026/06/part-002.parquet", "mandatory": true }
]
}
Kinesis Firehose to Redshift
Kinesis Data Firehose can deliver streaming records directly to Redshift via an intermediate S3 landing zone. Firehose buffers records (by size up to 128 MB or time up to 900 seconds), writes Parquet/JSON/CSV to S3, and issues a COPY command to load them into Redshift. This pattern is ideal for near-real-time operational data — clickstreams, application events, IoT telemetry — that need to be queryable in Redshift within minutes of being produced.
# Create a Firehose delivery stream targeting Redshift
aws firehose create-delivery-stream \
--delivery-stream-name orders-to-redshift \
--delivery-stream-type DirectPut \
--redshift-destination-configuration '{
"RoleARN": "arn:aws:iam::123456789012:role/FirehoseRedshiftRole",
"ClusterJDBCURL": "jdbc:redshift://mycluster.xxxx.us-east-1.redshift.amazonaws.com:5439/analytics",
"CopyCommand": {
"DataTableName": "orders_staging",
"DataTableColumns": "order_id,customer_id,product_id,amount_usd,status,created_at",
"CopyOptions": "FORMAT AS JSON '\''auto'\'' TIMEFORMAT '\''auto'\''"
},
"Username": "firehose_loader",
"Password": "{{resolve:secretsmanager:redshift/firehose:SecretString:password}}",
"S3Configuration": {
"RoleARN": "arn:aws:iam::123456789012:role/FirehoseRedshiftRole",
"BucketARN": "arn:aws:s3:::my-firehose-staging",
"Prefix": "redshift/orders/",
"BufferingHints": { "SizeInMBs": 64, "IntervalInSeconds": 300 },
"CompressionFormat": "SNAPPY"
}
}'
AWS Glue to Redshift
AWS Glue ETL jobs can write directly to Redshift using the Glue JDBC connector or the native Redshift connector (which internally uses the COPY command via S3 for performance). The Glue Redshift connector is the recommended approach for large-scale ETL loads because it avoids row-by-row JDBC inserts.
# AWS Glue PySpark job — transform CSV to Redshift via S3 COPY
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
# Read raw CSV from S3 via Glue Data Catalog
raw = glueContext.create_dynamic_frame.from_catalog(
database='ecommerce_raw',
table_name='orders_csv'
)
# Apply transformations
from awsglue.transforms import ApplyMapping
mapped = ApplyMapping.apply(frame=raw, mappings=[
('order_id', 'string', 'order_id', 'long'),
('customer_id', 'string', 'customer_id', 'long'),
('product_id', 'string', 'product_id', 'int'),
('amount', 'string', 'amount_usd', 'decimal'),
('status', 'string', 'status', 'string'),
('created_at', 'string', 'created_at', 'timestamp'),
])
# Write to Redshift using the native connector (internally uses COPY)
glueContext.write_dynamic_frame.from_jdbc_conf(
frame=mapped,
catalog_connection='redshift-prod',
connection_options={
'dbtable': 'orders',
'database': 'analytics',
'aws_iam_role': 'arn:aws:iam::123456789012:role/RedshiftS3Role',
'extracopyoptions': 'TRUNCATECOLUMNS COMPUPDATE ON STATUPDATE ON'
},
redshift_tmp_dir='s3://my-glue-temp/redshift-staging/'
)
job.commit()
Query Optimization: EXPLAIN, WLM, Materialized Views, Result Cache
Once your cluster is running and data is loaded, query performance tuning is an iterative discipline. Redshift provides rich tools: EXPLAIN plans, WLM queues, query monitoring rules, materialized views, and result caching. Use them in combination.
Reading the EXPLAIN Plan
-- Prefix any query with EXPLAIN to get the execution plan without running it
EXPLAIN
SELECT
c.country,
SUM(o.amount_usd) AS total_revenue,
COUNT(DISTINCT o.customer_id) AS unique_customers
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date BETWEEN '2026-01-01' AND '2026-06-30'
AND o.status = 'COMPLETED'
GROUP BY c.country
ORDER BY total_revenue DESC;
Key plan nodes to look for:
- DS_DIST_NONE — no data redistribution needed (collocated JOIN — best case)
- DS_DIST_ALL — broadcast the smaller table to all nodes (acceptable for small dimensions)
- DS_DIST_INNER — redistribute the inner table by join key (expensive on large tables — fix with DISTKEY)
- DS_BCAST_INNER — broadcast inner table to all nodes (expensive — indicates wrong distribution design)
- SORTKEY FILTER — zone map pruning applied (good — sort key is being used)
- rows=N — estimated row count; large overestimates suggest stale statistics (run ANALYZE)
Workload Management (WLM)
WLM controls how Redshift allocates memory and concurrency slots across competing queries. Without WLM tuning, a single runaway analytical query can consume all available memory and starve short dashboard queries. Configure WLM via the parameter group.
[
{
"name": "Short queries",
"query_group": ["dashboard", "short"],
"query_group_wild_card": 0,
"user_group": [],
"user_group_wild_card": 0,
"concurrency_scaling": "auto",
"concurrency": 15,
"target_query_execution_time": 30,
"memory_percent_to_use": 40
},
{
"name": "ETL / bulk loads",
"query_group": ["etl", "load"],
"user_group": ["etl_users"],
"user_group_wild_card": 0,
"concurrency": 3,
"memory_percent_to_use": 35
},
{
"name": "Default",
"concurrency_scaling": "off",
"concurrency": 5,
"memory_percent_to_use": 25
}
]
Materialized Views
-- Create a materialized view for a common heavy aggregation
CREATE MATERIALIZED VIEW mv_daily_revenue AS
SELECT
o.order_date,
c.country,
c.tier,
SUM(o.amount_usd) AS revenue,
COUNT(*) AS order_count,
COUNT(DISTINCT o.customer_id) AS unique_customers
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.status = 'COMPLETED'
GROUP BY o.order_date, c.country, c.tier;
-- Refresh manually or on a schedule
REFRESH MATERIALIZED VIEW mv_daily_revenue;
-- Query the materialized view (much faster than the base query)
SELECT country, SUM(revenue) AS total
FROM mv_daily_revenue
WHERE order_date >= '2026-01-01'
GROUP BY country
ORDER BY total DESC;
Redshift also supports automatic query rewriting: if a query matches the definition of an existing materialized view, Redshift transparently rewrites the query to use the MV even if you didn't reference it explicitly. Enable this with SET mv_enable_aqmv_for_session = TRUE or at the cluster level via the parameter group.
Result Caching
Redshift caches the result of a query. If the exact same query is resubmitted by any user and the underlying tables have not changed, the result is returned from cache with sub-millisecond latency. Result caching is enabled by default and requires no configuration. It is particularly valuable for BI dashboards that run the same aggregation queries every time a user loads a report page. Monitor cache hits via the SVL_QLOG system view: SELECT * FROM SVL_QLOG WHERE from_sp_call IS NULL AND query_type = 'cache' LIMIT 20;
Redshift Spectrum: Querying the S3 Data Lake
Redshift Spectrum extends Redshift's SQL engine to query data sitting directly in S3 without loading it into Redshift-managed storage. The data stays in S3 in open formats (Parquet, ORC, CSV, JSON); Spectrum spins up a fleet of thousands of Spectrum nodes managed by AWS to handle the compute. You pay $5 per TB scanned — the same pricing as Athena — in addition to your Redshift node charges.
This makes Spectrum ideal for the "hot/cold" data pattern: recent, frequently queried data lives in Redshift tables for speed; historical data lives in S3 and is queried via Spectrum when needed. A single JOIN across both is transparent to the BI tool.
Setting Up External Tables
-- Step 1: Create an external schema pointing at the Glue Data Catalog
CREATE EXTERNAL SCHEMA spectrum_ecommerce
FROM DATA CATALOG
DATABASE 'ecommerce_lake'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;
-- Step 2: Create an external table for historical orders in S3
CREATE EXTERNAL TABLE spectrum_ecommerce.orders_historical (
order_id BIGINT,
customer_id BIGINT,
product_id INTEGER,
order_date DATE,
amount_usd DECIMAL(12,2),
status VARCHAR(20),
region VARCHAR(10)
)
PARTITIONED BY (year INT, month INT)
STORED AS PARQUET
LOCATION 's3://my-data-lake/historical/orders/'
TABLE PROPERTIES ('numRows'='5000000000');
-- Step 3: Register partitions
ALTER TABLE spectrum_ecommerce.orders_historical
ADD PARTITION (year=2024, month=1)
LOCATION 's3://my-data-lake/historical/orders/year=2024/month=1/';
-- Step 4: JOIN historical S3 data with current Redshift data in a single query
SELECT
c.country,
SUM(CASE WHEN o.order_date >= '2026-01-01' THEN o.amount_usd ELSE 0 END) AS ytd_revenue,
SUM(CASE WHEN h.order_date >= '2024-01-01' THEN h.amount_usd ELSE 0 END) AS two_yr_revenue
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id -- Redshift local table
LEFT JOIN spectrum_ecommerce.orders_historical h
ON c.customer_id = h.customer_id -- S3 via Spectrum
GROUP BY c.country
ORDER BY two_yr_revenue DESC;
TABLE PROPERTIES ('numRows'='N') so the Redshift query planner has accurate cardinality estimates for optimal join ordering.Concurrency Scaling
Concurrency Scaling automatically adds transient read-capacity clusters when queue wait times exceed a threshold. The primary cluster handles writes; Concurrency Scaling clusters handle read queries from the WLM queue overflow. This lets you handle unpredictable peaks — end-of-month reporting bursts, BI dashboard refreshes during business hours — without provisioning for peak capacity permanently.
Pricing: the first one hour of Concurrency Scaling cluster usage per day is free. Beyond that, you pay per RPU-second consumed by Concurrency Scaling clusters — typically far cheaper than permanently adding nodes.
-- Enable Concurrency Scaling for a WLM queue via the AWS CLI parameter group
-- (The WLM config is JSON; set concurrency_scaling: "auto" for the target queue)
-- Monitor Concurrency Scaling activity
SELECT
query,
service_class,
concurrency_scaling_status,
starttime,
endtime,
DATEDIFF(seconds, starttime, endtime) AS duration_sec
FROM SVL_QUERY_REPORT
WHERE concurrency_scaling_status IN (1, 2) -- 1=sent to CS, 2=returned from CS
AND starttime > GETDATE() - INTERVAL '1 day'
ORDER BY starttime DESC
LIMIT 50;
-- See daily Concurrency Scaling usage to predict costs
SELECT
TRUNC(starttime) AS query_date,
COUNT(*) AS cs_queries,
SUM(DATEDIFF(seconds, starttime, endtime)) AS total_cs_seconds
FROM STL_QUERY
WHERE concurrency_scaling_status = 1
AND starttime > GETDATE() - INTERVAL '30 days'
GROUP BY 1
ORDER BY 1;
Data Sharing: Live Access Across Clusters and Accounts
Redshift Data Sharing lets you share live, read-only access to data between Redshift clusters — within the same AWS account or across different accounts — without copying or moving the data. The producer cluster grants access to specific databases, schemas, or tables; the consumer cluster queries them as if they were local tables with sub-second latency.
This is powerful for the data mesh pattern: each domain team owns its Redshift cluster (producer), and other teams (consumers) can query authoritative domain data directly without waiting for ETL pipelines to replicate it.
-- On the PRODUCER cluster:
-- Step 1: Create a datashare
CREATE DATASHARE analytics_share;
-- Step 2: Add objects to the share
ALTER DATASHARE analytics_share ADD SCHEMA public;
ALTER DATASHARE analytics_share ADD ALL TABLES IN SCHEMA public;
-- Or add specific tables:
-- ALTER DATASHARE analytics_share ADD TABLE public.orders;
-- ALTER DATASHARE analytics_share ADD TABLE public.customers;
-- Step 3: Grant access to the consumer account (cross-account)
GRANT USAGE ON DATASHARE analytics_share
TO ACCOUNT '987654321098' VIA DATA CATALOG;
-- On the CONSUMER cluster (in the receiving AWS account):
-- After accepting the share in the console or via CLI:
-- Step 4: Create a local database from the share
CREATE DATABASE prod_analytics_shared
FROM DATASHARE analytics_share
OF ACCOUNT '123456789012' NAMESPACE 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx';
-- Step 5: Query shared data as if it were local — no data movement
SELECT c.country, SUM(o.amount_usd) AS revenue
FROM prod_analytics_shared.public.orders o
JOIN prod_analytics_shared.public.customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2026-01-01'
GROUP BY c.country
ORDER BY revenue DESC;
Security: Encryption, VPC, IAM, Column and Row Security
Redshift supports multiple layers of security that can — and should — be deployed together in production.
Encryption
Enable encryption at cluster creation using AWS KMS (recommended) or HSM. KMS-encrypted clusters use AES-256 encryption for all data at rest including snapshots and audit logs. Encryption cannot be enabled on an existing unencrypted cluster — you must create a new encrypted cluster and migrate data. Always enable encryption for new clusters.
VPC Isolation
Launch Redshift clusters in a private VPC subnet with no public IP. Use VPC security groups to allow inbound traffic only on port 5439 from specific CIDR ranges (BI tools, ETL servers, bastion hosts). Enable the enhanced VPC routing option to force all COPY and UNLOAD traffic through your VPC instead of over the public internet — this ensures S3 traffic stays on the AWS backbone and can be controlled by VPC flow logs.
IAM Roles for COPY and UNLOAD
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "RedshiftCopyFromS3",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::my-data-lake",
"arn:aws:s3:::my-data-lake/*"
]
},
{
"Sid": "RedshiftUnloadToS3",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:DeleteObject"
],
"Resource": "arn:aws:s3:::my-redshift-unload/*"
},
{
"Sid": "GlueCatalogAccess",
"Effect": "Allow",
"Action": [
"glue:GetDatabase",
"glue:GetTable",
"glue:GetPartitions"
],
"Resource": "*"
}
]
}
Column-Level Security
-- Grant access to specific columns only (deny salary and ssn to analyst role)
GRANT SELECT (order_id, customer_id, order_date, amount_usd, status)
ON TABLE orders
TO GROUP analysts;
-- Alternatively, use a view to expose only allowed columns
CREATE VIEW orders_public AS
SELECT order_id, customer_id, order_date, amount_usd, status, region
FROM orders;
GRANT SELECT ON orders_public TO GROUP analysts;
REVOKE SELECT ON TABLE orders FROM GROUP analysts;
Row-Level Security
-- Create a row-level security policy: each regional manager sees only their region
CREATE RLS POLICY regional_filter
WITH (region VARCHAR(10))
USING (region = current_setting('app.user_region'));
-- Attach the policy to the orders table for the regional_managers group
ATTACH RLS POLICY regional_filter
ON orders
TO GROUP regional_managers;
-- Enable RLS on the table
ALTER TABLE orders ROW LEVEL SECURITY ON;
-- Users in regional_managers group set their region context at session start:
-- SET app.user_region = 'US-WEST';
-- SELECT * FROM orders; -- automatically filtered to US-WEST rows only
Maintenance: VACUUM, ANALYZE, System Views, CloudWatch
Redshift requires periodic maintenance to keep performance optimal. Two operations matter most: VACUUM (reclaims disk space from deleted/updated rows and re-sorts out-of-order data) and ANALYZE (updates table statistics used by the query planner).
VACUUM
-- Full VACUUM: re-sort + reclaim space (most thorough, most I/O)
VACUUM FULL orders;
-- Sort-only VACUUM: re-sort rows without reclaiming space (faster)
VACUUM SORT ONLY orders;
-- Delete-only VACUUM: reclaim space without re-sorting (good for append-only tables)
VACUUM DELETE ONLY orders;
-- Reindex: rebuild interleaved sort key index (only for interleaved sort keys)
VACUUM REINDEX orders;
-- Check which tables need VACUUM (unsorted rows > 5%)
SELECT
schema_name,
table_name,
pct_unsorted,
size AS size_mb,
skew_sortkey1
FROM svv_table_info
WHERE pct_unsorted > 5
ORDER BY pct_unsorted DESC;
SVV_TABLE_INFO.pct_unsorted; if it consistently stays above 20% on heavily queried tables, schedule a manual VACUUM SORT ONLY during off-peak hours.ANALYZE
-- Update statistics for the whole database (run after large loads)
ANALYZE;
-- Update statistics for a specific table only
ANALYZE orders;
-- Update statistics for specific columns (faster, targets planner-critical columns)
ANALYZE orders(customer_id, order_date, status);
-- Check table statistics staleness
SELECT
schema_name,
table_name,
stats_off AS stats_pct_stale,
pct_unsorted,
size AS size_mb
FROM svv_table_info
WHERE stats_off > 10
ORDER BY stats_off DESC;
Key System Views for Monitoring
-- Find the slowest recent queries
SELECT
q.query,
TRIM(q.querytxt) AS sql_text,
q.starttime,
DATEDIFF(seconds, q.starttime, q.endtime) AS duration_sec,
q.aborted
FROM STL_QUERY q
WHERE q.starttime > GETDATE() - INTERVAL '1 day'
AND q.userid > 1 -- exclude system queries
ORDER BY duration_sec DESC
LIMIT 20;
-- Find queries with most data scanned (candidates for sort key optimization)
SELECT
query,
SUM(rows_pre_filter) AS rows_scanned,
SUM(rows) AS rows_returned,
ROUND(100.0 * SUM(rows) / NULLIF(SUM(rows_pre_filter), 0), 2) AS selectivity_pct
FROM SVL_QUERY_REPORT
WHERE query IN (
SELECT query FROM STL_QUERY
WHERE starttime > GETDATE() - INTERVAL '1 day' AND userid > 1
)
GROUP BY query
HAVING SUM(rows_pre_filter) > 1000000
ORDER BY rows_scanned DESC
LIMIT 20;
-- Check queue wait times by WLM service class
SELECT
service_class,
COUNT(*) AS queries,
AVG(total_queue_time / 1000000.0) AS avg_queue_sec,
MAX(total_queue_time / 1000000.0) AS max_queue_sec,
AVG(total_exec_time / 1000000.0) AS avg_exec_sec
FROM STL_WLM_QUERY
WHERE service_class > 4
AND start_time > GETDATE() - INTERVAL '1 day'
GROUP BY service_class
ORDER BY service_class;
For CloudWatch monitoring, the key metrics to alarm on are: DatabaseConnections (connection pool exhaustion), CPUUtilization (sustained > 80% indicates need to scale), ReadLatency / WriteLatency (disk I/O health), and QueryDuration (P99 query time). Set CloudWatch alarms on all four for proactive alerting.
Terraform: Provision a Redshift Cluster
provider "aws" {
region = "us-east-1"
}
# --- KMS key for Redshift encryption ---
resource "aws_kms_key" "redshift" {
description = "Redshift encryption key"
deletion_window_in_days = 30
enable_key_rotation = true
}
# --- Subnet group ---
resource "aws_redshift_subnet_group" "main" {
name = "prod-redshift-subnet-group"
subnet_ids = var.private_subnet_ids
}
# --- Parameter group with WLM config ---
resource "aws_redshift_parameter_group" "main" {
name = "prod-redshift-params"
family = "redshift-1.0"
parameter {
name = "wlm_json_configuration"
value = jsonencode([
{
name = "Dashboard"
concurrency = 15
concurrency_scaling = "auto"
memory_percent_to_use = 40
query_group = ["dashboard"]
target_query_execution_time = 30
},
{
name = "ETL"
concurrency = 3
memory_percent_to_use = 35
user_group = ["etl_users"]
},
{
name = "Default"
concurrency = 5
memory_percent_to_use = 25
}
])
}
parameter {
name = "enable_user_activity_logging"
value = "true"
}
parameter {
name = "require_ssl"
value = "true"
}
}
# --- IAM role for COPY/UNLOAD/Spectrum ---
resource "aws_iam_role" "redshift_s3" {
name = "RedshiftS3Role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Principal = { Service = "redshift.amazonaws.com" }
Action = "sts:AssumeRole"
}]
})
}
resource "aws_iam_role_policy_attachment" "redshift_s3_full" {
role = aws_iam_role.redshift_s3.name
policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}
resource "aws_iam_role_policy_attachment" "redshift_glue" {
role = aws_iam_role.redshift_s3.name
policy_arn = "arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess"
}
# --- Security group ---
resource "aws_security_group" "redshift" {
name = "redshift-sg"
vpc_id = var.vpc_id
ingress {
from_port = 5439
to_port = 5439
protocol = "tcp"
cidr_blocks = var.allowed_cidr_blocks
description = "Redshift JDBC port"
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
# --- Redshift cluster ---
resource "aws_redshift_cluster" "prod" {
cluster_identifier = "prod-analytics"
node_type = "ra3.4xlarge"
number_of_nodes = 4
database_name = "analytics"
master_username = "admin"
master_password = var.redshift_master_password # from Secrets Manager in practice
cluster_subnet_group_name = aws_redshift_subnet_group.main.name
vpc_security_group_ids = [aws_security_group.redshift.id]
cluster_parameter_group_name = aws_redshift_parameter_group.main.name
iam_roles = [aws_iam_role.redshift_s3.arn]
encrypted = true
kms_key_id = aws_kms_key.redshift.arn
enhanced_vpc_routing = true
publicly_accessible = false
skip_final_snapshot = false
final_snapshot_identifier = "prod-analytics-final"
automated_snapshot_retention_period = 7
logging {
enable = true
bucket_name = var.audit_log_bucket
s3_key_prefix = "redshift-logs/"
}
tags = {
Environment = "production"
ManagedBy = "terraform"
}
}
output "jdbc_url" {
value = "jdbc:redshift://${aws_redshift_cluster.prod.endpoint.address}:5439/analytics"
}
Frequently Asked Questions
How many nodes should I start with for a new Redshift cluster?
Start with 2 nodes of ra3.4xlarge for datasets under 5 TB and add nodes as query latency or CPU exceeds targets. RA3's decoupled storage means you can grow storage without adding compute nodes. Use the Redshift Advisor in the console — it analyzes your actual query patterns and recommends node type changes, DISTKEY additions, and compression improvements with estimated impact. Redshift Serverless is an even better starting point if your workload is new and unpredictable; switch to provisioned RA3 when you can characterize your peak concurrency needs.
What is the difference between a DISTKEY and a SORTKEY?
A DISTKEY controls where each row is stored — which slice across the cluster receives a given row, based on a hash of the distribution column. A SORTKEY controls in what order rows are stored within each slice on disk. They solve different problems: DISTKEY eliminates inter-node data shuffles during JOINs (by co-locating joined rows on the same slice); SORTKEY enables zone-map pruning to skip disk blocks during range filters. You typically need both: DISTKEY on your most common JOIN key, SORTKEY on your most common WHERE filter column (usually a date).
When should I use Redshift Spectrum instead of loading data into Redshift?
Use Spectrum for historical data that is rarely queried (cold tier), data that is shared with other systems (Athena, EMR) and copying it into Redshift would create duplication, or data that arrives in open S3 formats from third parties. Load data directly into Redshift when you need the fastest possible query performance (Spectrum has higher per-query overhead), when you need to JOIN heavily across the cold data with warm data, or when the data is exclusively used by Redshift and freshness requirements make S3-only latency acceptable. The common production pattern is: load 90-day rolling data into Redshift; keep everything older in S3 accessible via Spectrum.
How do I handle concurrent write conflicts in Redshift?
Redshift uses MVCC (multi-version concurrency control) with serializable isolation. Concurrent writers to the same table will queue — Redshift does not support row-level locking. Avoid many small INSERT/UPDATE/DELETE transactions; instead, batch changes and use the standard Redshift pattern: load new data into a staging table, then merge into the target with a DELETE + INSERT or using the MERGE statement (available since Redshift engine 1.0.47772). Monitor lock contention with SELECT * FROM SVV_TRANSACTIONS WHERE lockable_object_type = 'relation'.
Is Redshift Serverless suitable for production BI?
Yes, Redshift Serverless is production-ready as of 2026 for most BI workloads. The key consideration is cold-start latency: if the workgroup has been idle for an extended period, the first query may take 10–30 seconds to warm up. Enable the base-capacity setting (minimum 8 RPUs) to keep a warm footprint. For 24/7 BI tools with consistent concurrency above 20–30 users, provisioned RA3 with reserved pricing will be more cost-effective. For batch analytical workloads, sporadic dashboards, and development environments, Serverless is the clear winner on operational simplicity and cost.