AWS Redshift: Data Warehouse Architecture and Query Optimization

AWS Redshift Data Warehouse

Amazon Redshift is AWS's fully managed petabyte-scale data warehouse — the engine behind some of the largest analytics workloads on the planet. It combines a massively parallel processing (MPP) architecture, columnar storage, and tight S3 integration to deliver sub-second query performance on hundreds of terabytes of structured data. Whether you are migrating an on-premises Oracle data warehouse, building a new analytics platform from scratch, or extending an existing S3 data lake with SQL query power, Redshift is almost certainly in the conversation. This guide covers everything you need to understand, deploy, and operate a production Redshift cluster in 2026: node types, distribution strategies, sort keys, data loading patterns, WLM tuning, Spectrum, Concurrency Scaling, data sharing, security, and maintenance — all with real SQL and Terraform examples.

Redshift vs Athena vs BigQuery vs Snowflake

Choosing the right analytical database for your workload is a strategic decision. The four dominant options in 2026 — Redshift, Athena, BigQuery, and Snowflake — each have a different strength. Understanding the trade-offs prevents expensive migrations later.

DimensionRedshiftAthenaBigQuerySnowflake
Pricing modelPer node-hour (provisioned) or per RPU-second (Serverless)$5/TB scanned$6.25/TB scanned (on-demand) or flat-rate slotsPer virtual warehouse credit-hour
Query latencySub-second on warm data; excellent for BI concurrencySeconds to minutes; ad hoc only1–30 s for most queriesSub-second to seconds depending on warehouse size
Data locationRedshift-managed storage + S3 via SpectrumS3 only (no local storage)BigQuery managed storageSnowflake managed storage
ConcurrencyExcellent with WLM + Concurrency ScalingLimited (20 concurrent DML by default)Excellent (slot-based)Excellent (separate warehouses)
ETL ecosystemNative COPY, Glue, Firehose, dbtGlue, Spark, CTASDataflow, BigQuery DTS, dbtSnowpipe, Streams, Tasks, dbt
AWS integrationNative (IAM, S3, KMS, VPC, CloudWatch, Lake Formation)NativeRequires cross-cloud setupAvailable on AWS but not native
Best forHigh-concurrency BI on structured data, AWS-native shopsAd hoc S3 exploration, infrequent queriesGoogle Cloud shops, serverless massive scaleMulti-cloud, separation of storage/compute
When Redshift wins: Choose Redshift when (1) you need consistent sub-second query latency for 50–500 concurrent BI users, (2) your data is structured and loads into Redshift-managed storage, (3) you are already on AWS and want native IAM/VPC/KMS integration, or (4) you need to query your S3 data lake alongside structured warehouse data with Spectrum — a single query engine for both. For purely ad hoc, infrequent queries on S3 files, Athena is cheaper. For greenfield multi-cloud, evaluate Snowflake.

Architecture: Leader Node, Compute Nodes, RA3 vs DC2, Serverless

Redshift's MPP architecture splits work across a leader node and one or more compute nodes. Understanding how these pieces interact is essential for sizing your cluster correctly and troubleshooting slow queries.

Leader Node

The leader node is the entry point for every SQL query. It parses and compiles the query, generates an execution plan, distributes work to the compute nodes, collects partial results, and assembles the final result set for the client. The leader node does not store user data — it only holds cluster metadata, the system catalog, and query state. Because all client connections go through the leader node, it can become a bottleneck if you run many simultaneous small queries without WLM (Workload Management) configuration.

Compute Nodes and Slices

Compute nodes store the actual data in columnar format on local SSDs (DC2) or S3-backed managed storage (RA3). Each compute node is subdivided into slices — independent processing units that work in parallel. A 4-node ra3.4xlarge cluster has 4 nodes × 4 slices = 16 slices. Data is distributed across slices according to the distribution style you choose for each table; this is the single biggest tuning decision in Redshift design.

RA3 vs DC2 Node Types

PropertyRA3 (ra3.xlplus / ra3.4xlarge / ra3.16xlarge)DC2 (dc2.large / dc2.8xlarge)
StorageRedshift Managed Storage (RMS) — S3-backed, decoupled from computeLocal NVMe SSD, fixed to node count
Scaling storageIndependent of compute — add TB without adding nodesMust add nodes to add storage
Best forDatasets over 1 TB, variable query patterns, growth workloadsSub-1 TB datasets with consistent high-performance needs
SpectrumSupportedSupported
Cross-AZ restoreFast (data in S3)Slower (EBS snapshot)
2026 recommendationDefault choice for new clustersLegacy; prefer RA3 for new deployments

Redshift Serverless

Redshift Serverless eliminates cluster management entirely. You define a namespace (logical grouping for databases, schemas, users) and a workgroup (compute configuration: base RPUs, max RPUs, VPC settings). Redshift Serverless scales RPUs (Redshift Processing Units) automatically in response to query demand and charges per RPU-second — with a minimum of 8 RPUs (base capacity). It is the right choice for variable or intermittent workloads, development environments, and situations where you want zero operations overhead. For sustained, predictable high-concurrency workloads, provisioned RA3 clusters with reserved pricing still offer better cost efficiency.

# Create a Redshift Serverless namespace and workgroup via AWS CLI
aws redshift-serverless create-namespace \
  --namespace-name prod-analytics \
  --admin-username admin \
  --admin-user-password 'Ch@ngeMe2026!' \
  --db-name analytics \
  --default-iam-role-arn arn:aws:iam::123456789012:role/RedshiftS3Role \
  --iam-roles arn:aws:iam::123456789012:role/RedshiftS3Role \
  --log-exports '["userlog","connectionlog","useractivitylog"]'

aws redshift-serverless create-workgroup \
  --workgroup-name prod-wg \
  --namespace-name prod-analytics \
  --base-capacity 32 \
  --max-capacity 512 \
  --subnet-ids subnet-aaa subnet-bbb subnet-ccc \
  --security-group-ids sg-xxxxxxxx \
  --publicly-accessible false

Table Design: Distribution Styles, Sort Keys, Column Encoding

Table design is where Redshift performance is won or lost. Getting distribution keys and sort keys right at table creation time saves enormous query optimization effort later. Changing them after the fact requires a full table rebuild.

Distribution Styles

Redshift distributes each row of data to a specific slice based on the table's distribution style. The goal is to place rows that will be joined together on the same slice, eliminating expensive data redistribution (network shuffles) at query time.

StyleHow it distributesUse when
DISTKEY(col)Hash of the distribution column valueLarge fact tables; join partner has same DISTKEY
ALLFull copy on every nodeSmall dimension tables (<5M rows) joined to large facts
EVENRound-robin across slicesNo clear join key; no filtering benefit needed
AUTORedshift chooses; changes over timeNew tables where you are unsure; let Redshift learn

Sort Keys

Sort keys control the physical order of rows on disk. When you filter on a sort key column, Redshift can skip entire 1 MB blocks using zone maps (min/max metadata per block) without scanning them — the Redshift equivalent of partition pruning.

  • Compound sort key: Most selective column first. Efficient for queries that filter on the leading columns in order. Example: SORTKEY(event_date, user_id) is excellent for WHERE event_date = '2026-06-01' but less useful for WHERE user_id = 42 alone.
  • Interleaved sort key: All columns equally weighted. Better when queries filter on any single column in the key — but VACUUM and ANALYZE are slower and more expensive. Avoid on very large tables. AWS recommends compound sort keys for most workloads as of 2026.

Column Encoding

Redshift stores data in columnar format and applies compression encoding per column. Use ANALYZE COMPRESSION on a loaded table to get Redshift's encoding recommendations.

-- Example: well-designed orders fact table
CREATE TABLE orders (
  order_id      BIGINT        NOT NULL ENCODE DELTA,
  customer_id   BIGINT        NOT NULL ENCODE LZO,
  product_id    INTEGER       NOT NULL ENCODE LZO,
  order_date    DATE          NOT NULL ENCODE DELTA32K,
  amount_usd    DECIMAL(12,2) NOT NULL ENCODE BYTEDICT,
  status        VARCHAR(20)   NOT NULL ENCODE BYTEDICT,
  region        VARCHAR(10)   NOT NULL ENCODE BYTEDICT,
  created_at    TIMESTAMP     NOT NULL ENCODE DELTA32K
)
DISTSTYLE KEY
DISTKEY (customer_id)          -- distributes on join key
COMPOUND SORTKEY (order_date, customer_id)  -- most common filter first
;

-- Small dimension table: replicated to every node for co-located JOINs
CREATE TABLE customers (
  customer_id   BIGINT        NOT NULL ENCODE DELTA,
  name          VARCHAR(200)  NOT NULL ENCODE LZO,
  email         VARCHAR(200)  NOT NULL ENCODE LZO,
  country       VARCHAR(50)   NOT NULL ENCODE BYTEDICT,
  tier          VARCHAR(20)   NOT NULL ENCODE BYTEDICT,
  created_date  DATE          NOT NULL ENCODE DELTA32K
)
DISTSTYLE ALL
SORTKEY (customer_id)
;

-- After loading data, get encoding recommendations
ANALYZE COMPRESSION orders;
AUTO encoding: If you specify ENCODE AUTO (the default for new tables), Redshift automatically selects and adjusts encodings as data is loaded. This is a safe default — it learns from your actual data distribution. Override with explicit encodings only when you have profiling data showing a specific encoding is significantly better.

Loading Data: COPY from S3, Firehose, AWS Glue

The COPY command is the fastest and most flexible way to load data into Redshift. It reads in parallel across all compute node slices, supports multiple file formats, and integrates natively with S3, DynamoDB, EMR, and SSH. Never use row-by-row INSERTs for bulk loading — they bypass MPP parallelism and are 10–100x slower.

COPY from S3 — Core Patterns

-- 1. Basic COPY from CSV files in S3
COPY orders
FROM 's3://my-data-lake/raw/orders/2026/06/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftS3Role'
FORMAT AS CSV
IGNOREHEADER 1
DATEFORMAT 'YYYY-MM-DD'
TIMEFORMAT 'auto'
MAXERROR 100          -- tolerate up to 100 parse errors before aborting
COMPUPDATE ON         -- analyze compression after load
STATUPDATE ON;        -- refresh table statistics after load

-- 2. COPY from Parquet (preferred format — no type inference needed)
COPY orders
FROM 's3://my-data-lake/processed/orders/2026/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftS3Role'
FORMAT AS PARQUET;

-- 3. COPY with a manifest file — precise control over which files to load
--    Create manifest at s3://my-data-lake/manifests/orders_june.json
COPY orders
FROM 's3://my-data-lake/manifests/orders_june.json'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftS3Role'
FORMAT AS PARQUET
MANIFEST;

The manifest file is a JSON document listing the exact S3 objects to load — useful when you want to load a specific set of files rather than an entire prefix:

{
  "entries": [
    { "url": "s3://my-data-lake/processed/orders/2026/06/part-000.parquet", "mandatory": true },
    { "url": "s3://my-data-lake/processed/orders/2026/06/part-001.parquet", "mandatory": true },
    { "url": "s3://my-data-lake/processed/orders/2026/06/part-002.parquet", "mandatory": true }
  ]
}

Kinesis Firehose to Redshift

Kinesis Data Firehose can deliver streaming records directly to Redshift via an intermediate S3 landing zone. Firehose buffers records (by size up to 128 MB or time up to 900 seconds), writes Parquet/JSON/CSV to S3, and issues a COPY command to load them into Redshift. This pattern is ideal for near-real-time operational data — clickstreams, application events, IoT telemetry — that need to be queryable in Redshift within minutes of being produced.

# Create a Firehose delivery stream targeting Redshift
aws firehose create-delivery-stream \
  --delivery-stream-name orders-to-redshift \
  --delivery-stream-type DirectPut \
  --redshift-destination-configuration '{
    "RoleARN": "arn:aws:iam::123456789012:role/FirehoseRedshiftRole",
    "ClusterJDBCURL": "jdbc:redshift://mycluster.xxxx.us-east-1.redshift.amazonaws.com:5439/analytics",
    "CopyCommand": {
      "DataTableName": "orders_staging",
      "DataTableColumns": "order_id,customer_id,product_id,amount_usd,status,created_at",
      "CopyOptions": "FORMAT AS JSON '\''auto'\'' TIMEFORMAT '\''auto'\''"
    },
    "Username": "firehose_loader",
    "Password": "{{resolve:secretsmanager:redshift/firehose:SecretString:password}}",
    "S3Configuration": {
      "RoleARN": "arn:aws:iam::123456789012:role/FirehoseRedshiftRole",
      "BucketARN": "arn:aws:s3:::my-firehose-staging",
      "Prefix": "redshift/orders/",
      "BufferingHints": { "SizeInMBs": 64, "IntervalInSeconds": 300 },
      "CompressionFormat": "SNAPPY"
    }
  }'

AWS Glue to Redshift

AWS Glue ETL jobs can write directly to Redshift using the Glue JDBC connector or the native Redshift connector (which internally uses the COPY command via S3 for performance). The Glue Redshift connector is the recommended approach for large-scale ETL loads because it avoids row-by-row JDBC inserts.

# AWS Glue PySpark job — transform CSV to Redshift via S3 COPY
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Read raw CSV from S3 via Glue Data Catalog
raw = glueContext.create_dynamic_frame.from_catalog(
    database='ecommerce_raw',
    table_name='orders_csv'
)

# Apply transformations
from awsglue.transforms import ApplyMapping
mapped = ApplyMapping.apply(frame=raw, mappings=[
    ('order_id',    'string', 'order_id',    'long'),
    ('customer_id', 'string', 'customer_id', 'long'),
    ('product_id',  'string', 'product_id',  'int'),
    ('amount',      'string', 'amount_usd',  'decimal'),
    ('status',      'string', 'status',      'string'),
    ('created_at',  'string', 'created_at',  'timestamp'),
])

# Write to Redshift using the native connector (internally uses COPY)
glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=mapped,
    catalog_connection='redshift-prod',
    connection_options={
        'dbtable': 'orders',
        'database': 'analytics',
        'aws_iam_role': 'arn:aws:iam::123456789012:role/RedshiftS3Role',
        'extracopyoptions': 'TRUNCATECOLUMNS COMPUPDATE ON STATUPDATE ON'
    },
    redshift_tmp_dir='s3://my-glue-temp/redshift-staging/'
)

job.commit()

Query Optimization: EXPLAIN, WLM, Materialized Views, Result Cache

Once your cluster is running and data is loaded, query performance tuning is an iterative discipline. Redshift provides rich tools: EXPLAIN plans, WLM queues, query monitoring rules, materialized views, and result caching. Use them in combination.

Reading the EXPLAIN Plan

-- Prefix any query with EXPLAIN to get the execution plan without running it
EXPLAIN
SELECT
  c.country,
  SUM(o.amount_usd) AS total_revenue,
  COUNT(DISTINCT o.customer_id) AS unique_customers
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date BETWEEN '2026-01-01' AND '2026-06-30'
  AND o.status = 'COMPLETED'
GROUP BY c.country
ORDER BY total_revenue DESC;

Key plan nodes to look for:

  • DS_DIST_NONE — no data redistribution needed (collocated JOIN — best case)
  • DS_DIST_ALL — broadcast the smaller table to all nodes (acceptable for small dimensions)
  • DS_DIST_INNER — redistribute the inner table by join key (expensive on large tables — fix with DISTKEY)
  • DS_BCAST_INNER — broadcast inner table to all nodes (expensive — indicates wrong distribution design)
  • SORTKEY FILTER — zone map pruning applied (good — sort key is being used)
  • rows=N — estimated row count; large overestimates suggest stale statistics (run ANALYZE)

Workload Management (WLM)

WLM controls how Redshift allocates memory and concurrency slots across competing queries. Without WLM tuning, a single runaway analytical query can consume all available memory and starve short dashboard queries. Configure WLM via the parameter group.

[
  {
    "name": "Short queries",
    "query_group": ["dashboard", "short"],
    "query_group_wild_card": 0,
    "user_group": [],
    "user_group_wild_card": 0,
    "concurrency_scaling": "auto",
    "concurrency": 15,
    "target_query_execution_time": 30,
    "memory_percent_to_use": 40
  },
  {
    "name": "ETL / bulk loads",
    "query_group": ["etl", "load"],
    "user_group": ["etl_users"],
    "user_group_wild_card": 0,
    "concurrency": 3,
    "memory_percent_to_use": 35
  },
  {
    "name": "Default",
    "concurrency_scaling": "off",
    "concurrency": 5,
    "memory_percent_to_use": 25
  }
]
Short-query acceleration (SQA): Enable SQA on your cluster to automatically detect and route queries predicted to complete in under 30 seconds to a dedicated fast lane, bypassing WLM queue waits. SQA is enabled by default on new clusters and can significantly improve BI tool responsiveness for exploratory queries without any manual query routing.

Materialized Views

-- Create a materialized view for a common heavy aggregation
CREATE MATERIALIZED VIEW mv_daily_revenue AS
SELECT
  o.order_date,
  c.country,
  c.tier,
  SUM(o.amount_usd)        AS revenue,
  COUNT(*)                 AS order_count,
  COUNT(DISTINCT o.customer_id) AS unique_customers
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.status = 'COMPLETED'
GROUP BY o.order_date, c.country, c.tier;

-- Refresh manually or on a schedule
REFRESH MATERIALIZED VIEW mv_daily_revenue;

-- Query the materialized view (much faster than the base query)
SELECT country, SUM(revenue) AS total
FROM mv_daily_revenue
WHERE order_date >= '2026-01-01'
GROUP BY country
ORDER BY total DESC;

Redshift also supports automatic query rewriting: if a query matches the definition of an existing materialized view, Redshift transparently rewrites the query to use the MV even if you didn't reference it explicitly. Enable this with SET mv_enable_aqmv_for_session = TRUE or at the cluster level via the parameter group.

Result Caching

Redshift caches the result of a query. If the exact same query is resubmitted by any user and the underlying tables have not changed, the result is returned from cache with sub-millisecond latency. Result caching is enabled by default and requires no configuration. It is particularly valuable for BI dashboards that run the same aggregation queries every time a user loads a report page. Monitor cache hits via the SVL_QLOG system view: SELECT * FROM SVL_QLOG WHERE from_sp_call IS NULL AND query_type = 'cache' LIMIT 20;

Redshift Spectrum: Querying the S3 Data Lake

Redshift Spectrum extends Redshift's SQL engine to query data sitting directly in S3 without loading it into Redshift-managed storage. The data stays in S3 in open formats (Parquet, ORC, CSV, JSON); Spectrum spins up a fleet of thousands of Spectrum nodes managed by AWS to handle the compute. You pay $5 per TB scanned — the same pricing as Athena — in addition to your Redshift node charges.

This makes Spectrum ideal for the "hot/cold" data pattern: recent, frequently queried data lives in Redshift tables for speed; historical data lives in S3 and is queried via Spectrum when needed. A single JOIN across both is transparent to the BI tool.

Setting Up External Tables

-- Step 1: Create an external schema pointing at the Glue Data Catalog
CREATE EXTERNAL SCHEMA spectrum_ecommerce
FROM DATA CATALOG
DATABASE 'ecommerce_lake'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;

-- Step 2: Create an external table for historical orders in S3
CREATE EXTERNAL TABLE spectrum_ecommerce.orders_historical (
  order_id      BIGINT,
  customer_id   BIGINT,
  product_id    INTEGER,
  order_date    DATE,
  amount_usd    DECIMAL(12,2),
  status        VARCHAR(20),
  region        VARCHAR(10)
)
PARTITIONED BY (year INT, month INT)
STORED AS PARQUET
LOCATION 's3://my-data-lake/historical/orders/'
TABLE PROPERTIES ('numRows'='5000000000');

-- Step 3: Register partitions
ALTER TABLE spectrum_ecommerce.orders_historical
ADD PARTITION (year=2024, month=1)
LOCATION 's3://my-data-lake/historical/orders/year=2024/month=1/';

-- Step 4: JOIN historical S3 data with current Redshift data in a single query
SELECT
  c.country,
  SUM(CASE WHEN o.order_date >= '2026-01-01' THEN o.amount_usd ELSE 0 END) AS ytd_revenue,
  SUM(CASE WHEN h.order_date >= '2024-01-01' THEN h.amount_usd ELSE 0 END) AS two_yr_revenue
FROM customers c
LEFT JOIN orders o             ON c.customer_id = o.customer_id      -- Redshift local table
LEFT JOIN spectrum_ecommerce.orders_historical h
                               ON c.customer_id = h.customer_id      -- S3 via Spectrum
GROUP BY c.country
ORDER BY two_yr_revenue DESC;
Spectrum performance tips: (1) Use Parquet or ORC — Spectrum only reads the columns your query needs, giving 60–90% scan reduction vs CSV. (2) Partition your S3 data by the columns you filter on most (usually date). (3) Push predicates into the external query — Spectrum applies WHERE filters before sending data to Redshift compute nodes. (4) Target 512 MB–1 GB Parquet files to maximize Spectrum parallelism. (5) Use TABLE PROPERTIES ('numRows'='N') so the Redshift query planner has accurate cardinality estimates for optimal join ordering.

Concurrency Scaling

Concurrency Scaling automatically adds transient read-capacity clusters when queue wait times exceed a threshold. The primary cluster handles writes; Concurrency Scaling clusters handle read queries from the WLM queue overflow. This lets you handle unpredictable peaks — end-of-month reporting bursts, BI dashboard refreshes during business hours — without provisioning for peak capacity permanently.

Pricing: the first one hour of Concurrency Scaling cluster usage per day is free. Beyond that, you pay per RPU-second consumed by Concurrency Scaling clusters — typically far cheaper than permanently adding nodes.

-- Enable Concurrency Scaling for a WLM queue via the AWS CLI parameter group
-- (The WLM config is JSON; set concurrency_scaling: "auto" for the target queue)

-- Monitor Concurrency Scaling activity
SELECT
  query,
  service_class,
  concurrency_scaling_status,
  starttime,
  endtime,
  DATEDIFF(seconds, starttime, endtime) AS duration_sec
FROM SVL_QUERY_REPORT
WHERE concurrency_scaling_status IN (1, 2)  -- 1=sent to CS, 2=returned from CS
  AND starttime > GETDATE() - INTERVAL '1 day'
ORDER BY starttime DESC
LIMIT 50;

-- See daily Concurrency Scaling usage to predict costs
SELECT
  TRUNC(starttime) AS query_date,
  COUNT(*) AS cs_queries,
  SUM(DATEDIFF(seconds, starttime, endtime)) AS total_cs_seconds
FROM STL_QUERY
WHERE concurrency_scaling_status = 1
  AND starttime > GETDATE() - INTERVAL '30 days'
GROUP BY 1
ORDER BY 1;

Data Sharing: Live Access Across Clusters and Accounts

Redshift Data Sharing lets you share live, read-only access to data between Redshift clusters — within the same AWS account or across different accounts — without copying or moving the data. The producer cluster grants access to specific databases, schemas, or tables; the consumer cluster queries them as if they were local tables with sub-second latency.

This is powerful for the data mesh pattern: each domain team owns its Redshift cluster (producer), and other teams (consumers) can query authoritative domain data directly without waiting for ETL pipelines to replicate it.

-- On the PRODUCER cluster:

-- Step 1: Create a datashare
CREATE DATASHARE analytics_share;

-- Step 2: Add objects to the share
ALTER DATASHARE analytics_share ADD SCHEMA public;
ALTER DATASHARE analytics_share ADD ALL TABLES IN SCHEMA public;
-- Or add specific tables:
-- ALTER DATASHARE analytics_share ADD TABLE public.orders;
-- ALTER DATASHARE analytics_share ADD TABLE public.customers;

-- Step 3: Grant access to the consumer account (cross-account)
GRANT USAGE ON DATASHARE analytics_share
  TO ACCOUNT '987654321098' VIA DATA CATALOG;

-- On the CONSUMER cluster (in the receiving AWS account):

-- After accepting the share in the console or via CLI:
-- Step 4: Create a local database from the share
CREATE DATABASE prod_analytics_shared
FROM DATASHARE analytics_share
OF ACCOUNT '123456789012' NAMESPACE 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx';

-- Step 5: Query shared data as if it were local — no data movement
SELECT c.country, SUM(o.amount_usd) AS revenue
FROM prod_analytics_shared.public.orders o
JOIN prod_analytics_shared.public.customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2026-01-01'
GROUP BY c.country
ORDER BY revenue DESC;
Data Sharing consistency: Data Sharing provides transactionally consistent reads. If a producer transaction commits new rows while a consumer query is running, the consumer sees a consistent snapshot as of the query start time. There is no replication lag — consumers always query live data in the producer's managed storage.

Security: Encryption, VPC, IAM, Column and Row Security

Redshift supports multiple layers of security that can — and should — be deployed together in production.

Encryption

Enable encryption at cluster creation using AWS KMS (recommended) or HSM. KMS-encrypted clusters use AES-256 encryption for all data at rest including snapshots and audit logs. Encryption cannot be enabled on an existing unencrypted cluster — you must create a new encrypted cluster and migrate data. Always enable encryption for new clusters.

VPC Isolation

Launch Redshift clusters in a private VPC subnet with no public IP. Use VPC security groups to allow inbound traffic only on port 5439 from specific CIDR ranges (BI tools, ETL servers, bastion hosts). Enable the enhanced VPC routing option to force all COPY and UNLOAD traffic through your VPC instead of over the public internet — this ensures S3 traffic stays on the AWS backbone and can be controlled by VPC flow logs.

IAM Roles for COPY and UNLOAD

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RedshiftCopyFromS3",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::my-data-lake",
        "arn:aws:s3:::my-data-lake/*"
      ]
    },
    {
      "Sid": "RedshiftUnloadToS3",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::my-redshift-unload/*"
    },
    {
      "Sid": "GlueCatalogAccess",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetTable",
        "glue:GetPartitions"
      ],
      "Resource": "*"
    }
  ]
}

Column-Level Security

-- Grant access to specific columns only (deny salary and ssn to analyst role)
GRANT SELECT (order_id, customer_id, order_date, amount_usd, status)
  ON TABLE orders
  TO GROUP analysts;

-- Alternatively, use a view to expose only allowed columns
CREATE VIEW orders_public AS
SELECT order_id, customer_id, order_date, amount_usd, status, region
FROM orders;

GRANT SELECT ON orders_public TO GROUP analysts;
REVOKE SELECT ON TABLE orders  FROM GROUP analysts;

Row-Level Security

-- Create a row-level security policy: each regional manager sees only their region
CREATE RLS POLICY regional_filter
WITH (region VARCHAR(10))
USING (region = current_setting('app.user_region'));

-- Attach the policy to the orders table for the regional_managers group
ATTACH RLS POLICY regional_filter
ON orders
TO GROUP regional_managers;

-- Enable RLS on the table
ALTER TABLE orders ROW LEVEL SECURITY ON;

-- Users in regional_managers group set their region context at session start:
-- SET app.user_region = 'US-WEST';
-- SELECT * FROM orders; -- automatically filtered to US-WEST rows only

Maintenance: VACUUM, ANALYZE, System Views, CloudWatch

Redshift requires periodic maintenance to keep performance optimal. Two operations matter most: VACUUM (reclaims disk space from deleted/updated rows and re-sorts out-of-order data) and ANALYZE (updates table statistics used by the query planner).

VACUUM

-- Full VACUUM: re-sort + reclaim space (most thorough, most I/O)
VACUUM FULL orders;

-- Sort-only VACUUM: re-sort rows without reclaiming space (faster)
VACUUM SORT ONLY orders;

-- Delete-only VACUUM: reclaim space without re-sorting (good for append-only tables)
VACUUM DELETE ONLY orders;

-- Reindex: rebuild interleaved sort key index (only for interleaved sort keys)
VACUUM REINDEX orders;

-- Check which tables need VACUUM (unsorted rows > 5%)
SELECT
  schema_name,
  table_name,
  pct_unsorted,
  size AS size_mb,
  skew_sortkey1
FROM svv_table_info
WHERE pct_unsorted > 5
ORDER BY pct_unsorted DESC;
Automatic VACUUM: Redshift runs automatic VACUUM in the background during cluster idle periods. For RA3 clusters with auto-vacuum enabled, you rarely need to run VACUUM manually. Monitor SVV_TABLE_INFO.pct_unsorted; if it consistently stays above 20% on heavily queried tables, schedule a manual VACUUM SORT ONLY during off-peak hours.

ANALYZE

-- Update statistics for the whole database (run after large loads)
ANALYZE;

-- Update statistics for a specific table only
ANALYZE orders;

-- Update statistics for specific columns (faster, targets planner-critical columns)
ANALYZE orders(customer_id, order_date, status);

-- Check table statistics staleness
SELECT
  schema_name,
  table_name,
  stats_off       AS stats_pct_stale,
  pct_unsorted,
  size            AS size_mb
FROM svv_table_info
WHERE stats_off > 10
ORDER BY stats_off DESC;

Key System Views for Monitoring

-- Find the slowest recent queries
SELECT
  q.query,
  TRIM(q.querytxt) AS sql_text,
  q.starttime,
  DATEDIFF(seconds, q.starttime, q.endtime) AS duration_sec,
  q.aborted
FROM STL_QUERY q
WHERE q.starttime > GETDATE() - INTERVAL '1 day'
  AND q.userid > 1           -- exclude system queries
ORDER BY duration_sec DESC
LIMIT 20;

-- Find queries with most data scanned (candidates for sort key optimization)
SELECT
  query,
  SUM(rows_pre_filter) AS rows_scanned,
  SUM(rows)            AS rows_returned,
  ROUND(100.0 * SUM(rows) / NULLIF(SUM(rows_pre_filter), 0), 2) AS selectivity_pct
FROM SVL_QUERY_REPORT
WHERE query IN (
  SELECT query FROM STL_QUERY
  WHERE starttime > GETDATE() - INTERVAL '1 day' AND userid > 1
)
GROUP BY query
HAVING SUM(rows_pre_filter) > 1000000
ORDER BY rows_scanned DESC
LIMIT 20;

-- Check queue wait times by WLM service class
SELECT
  service_class,
  COUNT(*) AS queries,
  AVG(total_queue_time / 1000000.0) AS avg_queue_sec,
  MAX(total_queue_time / 1000000.0) AS max_queue_sec,
  AVG(total_exec_time  / 1000000.0) AS avg_exec_sec
FROM STL_WLM_QUERY
WHERE service_class > 4
  AND start_time > GETDATE() - INTERVAL '1 day'
GROUP BY service_class
ORDER BY service_class;

For CloudWatch monitoring, the key metrics to alarm on are: DatabaseConnections (connection pool exhaustion), CPUUtilization (sustained > 80% indicates need to scale), ReadLatency / WriteLatency (disk I/O health), and QueryDuration (P99 query time). Set CloudWatch alarms on all four for proactive alerting.

Terraform: Provision a Redshift Cluster

provider "aws" {
  region = "us-east-1"
}

# --- KMS key for Redshift encryption ---
resource "aws_kms_key" "redshift" {
  description             = "Redshift encryption key"
  deletion_window_in_days = 30
  enable_key_rotation     = true
}

# --- Subnet group ---
resource "aws_redshift_subnet_group" "main" {
  name       = "prod-redshift-subnet-group"
  subnet_ids = var.private_subnet_ids
}

# --- Parameter group with WLM config ---
resource "aws_redshift_parameter_group" "main" {
  name   = "prod-redshift-params"
  family = "redshift-1.0"

  parameter {
    name  = "wlm_json_configuration"
    value = jsonencode([
      {
        name                    = "Dashboard"
        concurrency             = 15
        concurrency_scaling     = "auto"
        memory_percent_to_use   = 40
        query_group             = ["dashboard"]
        target_query_execution_time = 30
      },
      {
        name                  = "ETL"
        concurrency           = 3
        memory_percent_to_use = 35
        user_group            = ["etl_users"]
      },
      {
        name                  = "Default"
        concurrency           = 5
        memory_percent_to_use = 25
      }
    ])
  }

  parameter {
    name  = "enable_user_activity_logging"
    value = "true"
  }

  parameter {
    name  = "require_ssl"
    value = "true"
  }
}

# --- IAM role for COPY/UNLOAD/Spectrum ---
resource "aws_iam_role" "redshift_s3" {
  name = "RedshiftS3Role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "redshift.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "redshift_s3_full" {
  role       = aws_iam_role.redshift_s3.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}

resource "aws_iam_role_policy_attachment" "redshift_glue" {
  role       = aws_iam_role.redshift_s3.name
  policy_arn = "arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess"
}

# --- Security group ---
resource "aws_security_group" "redshift" {
  name   = "redshift-sg"
  vpc_id = var.vpc_id

  ingress {
    from_port   = 5439
    to_port     = 5439
    protocol    = "tcp"
    cidr_blocks = var.allowed_cidr_blocks
    description = "Redshift JDBC port"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# --- Redshift cluster ---
resource "aws_redshift_cluster" "prod" {
  cluster_identifier        = "prod-analytics"
  node_type                 = "ra3.4xlarge"
  number_of_nodes           = 4
  database_name             = "analytics"
  master_username           = "admin"
  master_password           = var.redshift_master_password  # from Secrets Manager in practice
  cluster_subnet_group_name = aws_redshift_subnet_group.main.name
  vpc_security_group_ids    = [aws_security_group.redshift.id]
  cluster_parameter_group_name = aws_redshift_parameter_group.main.name
  iam_roles                 = [aws_iam_role.redshift_s3.arn]
  encrypted                 = true
  kms_key_id                = aws_kms_key.redshift.arn
  enhanced_vpc_routing      = true
  publicly_accessible       = false
  skip_final_snapshot       = false
  final_snapshot_identifier = "prod-analytics-final"
  automated_snapshot_retention_period = 7

  logging {
    enable        = true
    bucket_name   = var.audit_log_bucket
    s3_key_prefix = "redshift-logs/"
  }

  tags = {
    Environment = "production"
    ManagedBy   = "terraform"
  }
}

output "jdbc_url" {
  value = "jdbc:redshift://${aws_redshift_cluster.prod.endpoint.address}:5439/analytics"
}

Frequently Asked Questions

How many nodes should I start with for a new Redshift cluster?

Start with 2 nodes of ra3.4xlarge for datasets under 5 TB and add nodes as query latency or CPU exceeds targets. RA3's decoupled storage means you can grow storage without adding compute nodes. Use the Redshift Advisor in the console — it analyzes your actual query patterns and recommends node type changes, DISTKEY additions, and compression improvements with estimated impact. Redshift Serverless is an even better starting point if your workload is new and unpredictable; switch to provisioned RA3 when you can characterize your peak concurrency needs.

What is the difference between a DISTKEY and a SORTKEY?

A DISTKEY controls where each row is stored — which slice across the cluster receives a given row, based on a hash of the distribution column. A SORTKEY controls in what order rows are stored within each slice on disk. They solve different problems: DISTKEY eliminates inter-node data shuffles during JOINs (by co-locating joined rows on the same slice); SORTKEY enables zone-map pruning to skip disk blocks during range filters. You typically need both: DISTKEY on your most common JOIN key, SORTKEY on your most common WHERE filter column (usually a date).

When should I use Redshift Spectrum instead of loading data into Redshift?

Use Spectrum for historical data that is rarely queried (cold tier), data that is shared with other systems (Athena, EMR) and copying it into Redshift would create duplication, or data that arrives in open S3 formats from third parties. Load data directly into Redshift when you need the fastest possible query performance (Spectrum has higher per-query overhead), when you need to JOIN heavily across the cold data with warm data, or when the data is exclusively used by Redshift and freshness requirements make S3-only latency acceptable. The common production pattern is: load 90-day rolling data into Redshift; keep everything older in S3 accessible via Spectrum.

How do I handle concurrent write conflicts in Redshift?

Redshift uses MVCC (multi-version concurrency control) with serializable isolation. Concurrent writers to the same table will queue — Redshift does not support row-level locking. Avoid many small INSERT/UPDATE/DELETE transactions; instead, batch changes and use the standard Redshift pattern: load new data into a staging table, then merge into the target with a DELETE + INSERT or using the MERGE statement (available since Redshift engine 1.0.47772). Monitor lock contention with SELECT * FROM SVV_TRANSACTIONS WHERE lockable_object_type = 'relation'.

Is Redshift Serverless suitable for production BI?

Yes, Redshift Serverless is production-ready as of 2026 for most BI workloads. The key consideration is cold-start latency: if the workgroup has been idle for an extended period, the first query may take 10–30 seconds to warm up. Enable the base-capacity setting (minimum 8 RPUs) to keep a warm footprint. For 24/7 BI tools with consistent concurrency above 20–30 users, provisioned RA3 with reserved pricing will be more cost-effective. For batch analytical workloads, sporadic dashboards, and development environments, Serverless is the clear winner on operational simplicity and cost.