SQL Server Columnstore Indexes: Delta Stores, Row Groups and Batch Mode (2026)
Columnstore indexes store data column-by-column rather than row-by-row. A query that aggregates three columns out of fifty reads only those three columns from disk — typically 10–100× less I/O than a row-store scan. Combined with batch-mode execution (processing 900 rows at a time instead of one), columnstore indexes can accelerate analytical queries by 10–100× over traditional B-tree indexes with no hardware changes.
- Row-Store vs Column-Store: The Core Difference
- Row Groups: 1,048,576 Rows per Segment
- VertiPaq Compression: Encoding Schemes
- The Delta Store: Handling DML on Columnstore
- The Tuple Mover: Delta to Compressed
- Batch Mode Execution
- Segment Elimination: Skipping Row Groups
- Clustered Columnstore Index
- Non-Clustered Columnstore Index (HTAP)
- Creating Columnstore Indexes
- Monitoring Columnstore Health
- Columnstore Maintenance: Rebuild vs Reorganize
- When to Use Columnstore Indexes
- Read Next
Row-Store vs Column-Store: The Core Difference
A traditional B-tree index stores rows: all columns of a row are physically adjacent on the same 8 KB page. To read 3 columns from 1 million rows, SQL Server must read every page containing those rows — including all the other columns you didn't ask for.
A columnstore index stores columns: all values of a single column are stored together, compressed, in a column segment. To read 3 columns from 1 million rows, SQL Server reads only the 3 column segments for those columns — skipping the other 47 columns entirely.
-- Row store layout (simplified)
Page 1: [Row1: Col1,Col2,...,Col50] [Row2: Col1,Col2,...,Col50] ...
Page 2: [Row101: ...]
-- Columnstore layout
Col1_Segment: [Row1.Col1, Row2.Col1, ..., Row1048576.Col1] ← compressed
Col2_Segment: [Row1.Col2, Row2.Col2, ..., Row1048576.Col2] ← compressed
...
Col50_Segment: [Row1.Col50, ..., Row1048576.Col50] ← compressed
Row Groups: 1,048,576 Rows per Segment
Data in a columnstore index is organised into row groups — each holding up to 1,048,576 rows (2²⁰). For each row group, there is one column segment per column.
Row group states:
- COMPRESSED — fully loaded, VertiPaq-compressed, immutable. The normal state for bulk-loaded data.
- OPEN — a delta store row group accepting new inserts, stored as a B-tree. Up to 1,048,576 rows, then closed.
- CLOSED — delta store row group full, waiting for the tuple mover to compress it.
- TOMBSTONE — a compressed row group that has been deleted and is awaiting cleanup.
Row groups with fewer than ~102,400 rows (10% of max) are called trimmed row groups — they compress poorly and slow scans. Many small row groups are a sign of row-by-row inserts rather than bulk loads.
VertiPaq Compression: Encoding Schemes
SQL Server applies one of several encoding schemes per column segment, choosing the one that produces the best compression for that column's data:
| Encoding | Best For | Example |
|---|---|---|
| Value encoding | Numeric columns with small range | Status codes (1–10) |
| Dictionary encoding | Low-cardinality strings/numerics | Country codes, product categories |
| Run-length encoding (RLE) | Sorted columns with repeated values | Date partitions, sorted status columns |
| Bit-packing | Columns that fit in few bits after range reduction | Month (1–12 fits in 4 bits) |
These encodings are applied before the segment is also compressed with LZ4 (COLUMNSTORE_ARCHIVE compression) or left with just the encoding (COLUMNSTORE). You can choose:
-- Standard columnstore compression (default)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales;
-- Archive compression (higher ratio, slower access — for cold partitions)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);
The Delta Store: Handling DML on Columnstore
Columnstore segments are immutable — you cannot insert a row into a compressed segment. Instead, inserts go to the delta store, which is a traditional B-tree rowstore structure. The delta store accumulates rows until it is full (1,048,576 rows), then the tuple mover compresses it into a new columnstore segment.
DELETE from a columnstore: the row is not physically removed from the compressed segment. Instead, a delete bitmap records the deleted rows. Queries skip rows flagged in the delete bitmap. Physical removal happens during an index rebuild.
UPDATE = DELETE (marks old row in delete bitmap) + INSERT (adds new row to delta store).
-- Check delta store and delete bitmap state
SELECT
rg.object_id,
OBJECT_NAME(rg.object_id) AS table_name,
rg.row_group_id,
rg.state_description,
rg.total_rows,
rg.deleted_rows,
rg.size_in_bytes / 1024 AS size_kb,
rg.trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats rg
WHERE rg.object_id = OBJECT_ID('dbo.Sales')
ORDER BY rg.row_group_id;
The Tuple Mover: Delta to Compressed
A background thread called the tuple mover monitors CLOSED delta store row groups and compresses them into columnstore segments. It runs automatically and requires no configuration.
Triggering conditions:
- A delta store row group reaches 1,048,576 rows (CLOSED state)
- The tuple mover background thread wakes up (runs approximately every 5 minutes)
You can force immediate compression of closed row groups with an index reorganize:
-- Force tuple mover to compress all CLOSED delta stores
ALTER INDEX CCI_Sales ON dbo.Sales REORGANIZE
WITH (COMPRESS_ALL_ROW_GROUPS = ON);
Batch Mode Execution
Batch mode is a query execution model where operators process batches of ~900 rows at a time rather than one row at a time. This dramatically reduces per-row overhead (function call, null check, type conversion) and enables SIMD CPU vectorization.
Batch mode is automatically used when:
- A query accesses a columnstore index (SQL 2012+)
- A query is eligible for batch mode on rowstore — SQL Server 2019+ with compatibility level 150 can use batch mode for hash joins and aggregates even without a columnstore index, when batch mode is beneficial
You can see batch mode in an execution plan: operators show Estimated Execution Mode: Batch in their properties. The "Columnstore Index Scan" and "Hash Match" are the most common batch-mode operators.
-- Force batch mode on a query (useful for testing)
SELECT SUM(Amount), COUNT(*), ProductID
FROM dbo.Sales
GROUP BY ProductID
OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));
-- Check if batch mode is being used
-- Look for "Actual Execution Mode = Batch" in the actual execution plan
Segment Elimination: Skipping Row Groups
Each column segment stores min/max statistics (minimum and maximum value in the segment). When a query has a predicate like WHERE SaleDate = '2026-06-11', SQL Server compares the predicate value against each row group's min/max for the SaleDate column. Row groups where the predicate value falls outside the min/max range are skipped entirely — no I/O, no decompression.
This is called segment elimination and is the second major performance advantage of columnstore (after column projection). On a 5-year fact table partitioned by month, a query for one month skips 98% of row groups.
-- Check segment elimination effectiveness (after running a query)
-- Look for "Segment Reads" vs "Segment Skips" in the execution plan XML
-- Or use Extended Events:
SELECT * FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.Sales')
ORDER BY row_group_id;
-- High deleted_rows + many OPEN/CLOSED groups = needs REBUILD
Clustered Columnstore Index
A Clustered Columnstore Index (CCI) is the primary storage structure — it replaces the heap or clustered B-tree. All rows in the table are stored in columnstore format. There is no separate copy of the data.
-- Create a new table with CCI as primary storage
CREATE TABLE dbo.SalesFact (
SaleID BIGINT NOT NULL,
ProductID INT NOT NULL,
CustomerID INT NOT NULL,
SaleDate DATE NOT NULL,
Amount DECIMAL(18,2) NOT NULL,
Quantity INT NOT NULL,
RegionCode CHAR(3) NOT NULL
);
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesFact ON dbo.SalesFact;
-- Convert existing rowstore table to CCI (drops existing clustered index)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Orders ON dbo.Orders
WITH (DROP_EXISTING = OFF, ONLINE = ON); -- ONLINE requires Enterprise
CCI is ideal for:
- Data warehouse fact tables with 10M+ rows
- Tables accessed primarily by aggregations, not single-row lookups
- Historical/archive tables with mostly read traffic
Non-Clustered Columnstore Index (HTAP)
A Non-Clustered Columnstore Index (NCCI) adds columnstore storage alongside an existing B-tree (clustered or heap). The original row-store remains for OLTP point lookups; the NCCI accelerates analytical aggregations on the same table. This is the HTAP (Hybrid Transactional/Analytical Processing) pattern.
-- Add NCCI to an existing OLTP table
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Analytics
ON dbo.Orders (OrderDate, ProductID, CustomerID, Amount, Quantity)
WHERE OrderDate >= '2024-01-01'; -- filtered: only recent data for analytics
The optimizer automatically chooses between the B-tree and the NCCI based on estimated cost. OLTP queries (seek by OrderID) use the clustered B-tree. Analytics queries (SUM(Amount) GROUP BY ProductID) use the NCCI with batch mode.
Creating Columnstore Indexes
-- Clustered columnstore (replaces existing storage)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales
WITH (MAXDOP = 8, ONLINE = ON); -- ONLINE = ON requires Enterprise
-- Non-clustered columnstore (alongside B-tree)
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON dbo.Orders (SaleDate, ProductID, Amount, Quantity, CustomerID);
-- With row group ordering hint (SQL 2022+)
-- Sorts data by SaleDate before building segments → better segment elimination
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales
ORDER (SaleDate);
-- Archive compression for cold data
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesArchive ON dbo.SalesArchive
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);
Monitoring Columnstore Health
-- Row group health overview
SELECT
OBJECT_NAME(object_id) AS table_name,
index_id,
state_description,
COUNT(*) AS row_group_count,
SUM(total_rows) AS total_rows,
SUM(deleted_rows) AS deleted_rows,
100.0 * SUM(deleted_rows)
/ NULLIF(SUM(total_rows),0) AS deleted_pct,
AVG(total_rows) AS avg_rows_per_group,
SUM(size_in_bytes)/1048576 AS total_mb
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.Sales')
GROUP BY object_id, index_id, state_description
ORDER BY state_description;
-- Identify poorly filled row groups (< 100K rows = trimmed)
SELECT
OBJECT_NAME(object_id) AS table_name,
row_group_id,
total_rows,
trim_reason_desc,
state_description
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE total_rows < 102400
AND state_description = 'COMPRESSED'
AND object_id = OBJECT_ID('dbo.Sales')
ORDER BY total_rows;
Columnstore Maintenance: Rebuild vs Reorganize
REORGANIZE — online, incremental:
- Compresses all CLOSED delta stores (runs the tuple mover)
- With
COMPRESS_ALL_ROW_GROUPS = ON: also compresses small OPEN delta stores - Merges small compressed row groups together (SQL 2016+)
- Does NOT remove deleted rows from compressed segments
REBUILD — offline (or online with Enterprise):
- Removes all deleted rows from compressed segments
- Merges small row groups into full ones
- Re-applies optimal compression to all segments
- Resets statistics
-- Reorganize (online, minimal blocking)
ALTER INDEX CCI_Sales ON dbo.Sales REORGANIZE
WITH (COMPRESS_ALL_ROW_GROUPS = ON);
-- Rebuild (offline by default; ONLINE requires Enterprise edition)
ALTER INDEX CCI_Sales ON dbo.Sales REBUILD
WITH (MAXDOP = 8, ONLINE = ON);
-- Rebuild recommendation: when deleted_pct > 20% or avg_rows_per_group < 500K
When to Use Columnstore Indexes
| Scenario | Recommendation |
|---|---|
| Data warehouse fact table (50M+ rows, aggregation-heavy) | Clustered Columnstore Index |
| OLTP table with analytics queries running alongside OLTP | Non-Clustered Columnstore (HTAP) |
| Historical/archive data rarely modified | CCI with COLUMNSTORE_ARCHIVE compression |
| Table with frequent single-row lookups and joins by PK | Keep B-tree; add NCCI only if aggregation queries are slow |
| Table with <1M rows | Usually not worth it; B-tree scan is fast enough |
| String columns with high cardinality (GUIDs, email addresses) | Compress poorly; include in CCI only if needed for queries |