SQL Server Columnstore Indexes: Delta Stores, Row Groups and Batch Mode (2026)

SQL Server Columnstore Indexes

Columnstore indexes store data column-by-column rather than row-by-row. A query that aggregates three columns out of fifty reads only those three columns from disk — typically 10–100× less I/O than a row-store scan. Combined with batch-mode execution (processing 900 rows at a time instead of one), columnstore indexes can accelerate analytical queries by 10–100× over traditional B-tree indexes with no hardware changes.

Row-Store vs Column-Store: The Core Difference

A traditional B-tree index stores rows: all columns of a row are physically adjacent on the same 8 KB page. To read 3 columns from 1 million rows, SQL Server must read every page containing those rows — including all the other columns you didn't ask for.

A columnstore index stores columns: all values of a single column are stored together, compressed, in a column segment. To read 3 columns from 1 million rows, SQL Server reads only the 3 column segments for those columns — skipping the other 47 columns entirely.

-- Row store layout (simplified)
Page 1: [Row1: Col1,Col2,...,Col50] [Row2: Col1,Col2,...,Col50] ...
Page 2: [Row101: ...]

-- Columnstore layout
Col1_Segment: [Row1.Col1, Row2.Col1, ..., Row1048576.Col1]  ← compressed
Col2_Segment: [Row1.Col2, Row2.Col2, ..., Row1048576.Col2]  ← compressed
...
Col50_Segment: [Row1.Col50, ..., Row1048576.Col50]           ← compressed
Typical compression ratio: Columnstore segments achieve 5–10× compression versus row-store pages for typical OLAP data, because consecutive values in a column are often similar or repetitive (e.g., a SalesRegion column with only 4 distinct values across 50 million rows compresses to near nothing with dictionary encoding).

Row Groups: 1,048,576 Rows per Segment

Data in a columnstore index is organised into row groups — each holding up to 1,048,576 rows (2²⁰). For each row group, there is one column segment per column.

Row group states:

  • COMPRESSED — fully loaded, VertiPaq-compressed, immutable. The normal state for bulk-loaded data.
  • OPEN — a delta store row group accepting new inserts, stored as a B-tree. Up to 1,048,576 rows, then closed.
  • CLOSED — delta store row group full, waiting for the tuple mover to compress it.
  • TOMBSTONE — a compressed row group that has been deleted and is awaiting cleanup.

Row groups with fewer than ~102,400 rows (10% of max) are called trimmed row groups — they compress poorly and slow scans. Many small row groups are a sign of row-by-row inserts rather than bulk loads.

VertiPaq Compression: Encoding Schemes

SQL Server applies one of several encoding schemes per column segment, choosing the one that produces the best compression for that column's data:

EncodingBest ForExample
Value encodingNumeric columns with small rangeStatus codes (1–10)
Dictionary encodingLow-cardinality strings/numericsCountry codes, product categories
Run-length encoding (RLE)Sorted columns with repeated valuesDate partitions, sorted status columns
Bit-packingColumns that fit in few bits after range reductionMonth (1–12 fits in 4 bits)

These encodings are applied before the segment is also compressed with LZ4 (COLUMNSTORE_ARCHIVE compression) or left with just the encoding (COLUMNSTORE). You can choose:

-- Standard columnstore compression (default)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales;

-- Archive compression (higher ratio, slower access — for cold partitions)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);

The Delta Store: Handling DML on Columnstore

Columnstore segments are immutable — you cannot insert a row into a compressed segment. Instead, inserts go to the delta store, which is a traditional B-tree rowstore structure. The delta store accumulates rows until it is full (1,048,576 rows), then the tuple mover compresses it into a new columnstore segment.

DELETE from a columnstore: the row is not physically removed from the compressed segment. Instead, a delete bitmap records the deleted rows. Queries skip rows flagged in the delete bitmap. Physical removal happens during an index rebuild.

UPDATE = DELETE (marks old row in delete bitmap) + INSERT (adds new row to delta store).

-- Check delta store and delete bitmap state
SELECT
    rg.object_id,
    OBJECT_NAME(rg.object_id)   AS table_name,
    rg.row_group_id,
    rg.state_description,
    rg.total_rows,
    rg.deleted_rows,
    rg.size_in_bytes / 1024     AS size_kb,
    rg.trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats rg
WHERE rg.object_id = OBJECT_ID('dbo.Sales')
ORDER BY rg.row_group_id;

The Tuple Mover: Delta to Compressed

A background thread called the tuple mover monitors CLOSED delta store row groups and compresses them into columnstore segments. It runs automatically and requires no configuration.

Triggering conditions:

  • A delta store row group reaches 1,048,576 rows (CLOSED state)
  • The tuple mover background thread wakes up (runs approximately every 5 minutes)

You can force immediate compression of closed row groups with an index reorganize:

-- Force tuple mover to compress all CLOSED delta stores
ALTER INDEX CCI_Sales ON dbo.Sales REORGANIZE
WITH (COMPRESS_ALL_ROW_GROUPS = ON);
Bulk load bypass: Bulk inserts of 102,400+ rows in a single batch bypass the delta store entirely and write directly to compressed segments. This is why bulk loading a staging table into a columnstore target is much faster than row-by-row inserts.

Batch Mode Execution

Batch mode is a query execution model where operators process batches of ~900 rows at a time rather than one row at a time. This dramatically reduces per-row overhead (function call, null check, type conversion) and enables SIMD CPU vectorization.

Batch mode is automatically used when:

  • A query accesses a columnstore index (SQL 2012+)
  • A query is eligible for batch mode on rowstore — SQL Server 2019+ with compatibility level 150 can use batch mode for hash joins and aggregates even without a columnstore index, when batch mode is beneficial

You can see batch mode in an execution plan: operators show Estimated Execution Mode: Batch in their properties. The "Columnstore Index Scan" and "Hash Match" are the most common batch-mode operators.

-- Force batch mode on a query (useful for testing)
SELECT SUM(Amount), COUNT(*), ProductID
FROM dbo.Sales
GROUP BY ProductID
OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));

-- Check if batch mode is being used
-- Look for "Actual Execution Mode = Batch" in the actual execution plan
Batch mode on rowstore (SQL 2019+): Even without a columnstore index, the optimizer can choose batch mode for hash joins and aggregates on large tables when it estimates the batch mode savings outweigh the overhead of switching modes. This is called batch mode on rowstore and requires compatibility level 150.

Segment Elimination: Skipping Row Groups

Each column segment stores min/max statistics (minimum and maximum value in the segment). When a query has a predicate like WHERE SaleDate = '2026-06-11', SQL Server compares the predicate value against each row group's min/max for the SaleDate column. Row groups where the predicate value falls outside the min/max range are skipped entirely — no I/O, no decompression.

This is called segment elimination and is the second major performance advantage of columnstore (after column projection). On a 5-year fact table partitioned by month, a query for one month skips 98% of row groups.

-- Check segment elimination effectiveness (after running a query)
-- Look for "Segment Reads" vs "Segment Skips" in the execution plan XML
-- Or use Extended Events:
SELECT * FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.Sales')
ORDER BY row_group_id;
-- High deleted_rows + many OPEN/CLOSED groups = needs REBUILD
Maximize segment elimination by sorting: If data is loaded in SaleDate order, each row group covers a narrow date range, maximising the number of row groups that can be eliminated for date-range queries. If data is loaded in random order, min/max spans the full date range in every row group — no elimination possible.

Clustered Columnstore Index

A Clustered Columnstore Index (CCI) is the primary storage structure — it replaces the heap or clustered B-tree. All rows in the table are stored in columnstore format. There is no separate copy of the data.

-- Create a new table with CCI as primary storage
CREATE TABLE dbo.SalesFact (
    SaleID       BIGINT NOT NULL,
    ProductID    INT NOT NULL,
    CustomerID   INT NOT NULL,
    SaleDate     DATE NOT NULL,
    Amount       DECIMAL(18,2) NOT NULL,
    Quantity     INT NOT NULL,
    RegionCode   CHAR(3) NOT NULL
);
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesFact ON dbo.SalesFact;

-- Convert existing rowstore table to CCI (drops existing clustered index)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Orders ON dbo.Orders
WITH (DROP_EXISTING = OFF, ONLINE = ON);  -- ONLINE requires Enterprise

CCI is ideal for:

  • Data warehouse fact tables with 10M+ rows
  • Tables accessed primarily by aggregations, not single-row lookups
  • Historical/archive tables with mostly read traffic
CCI and OLTP: Single-row lookups on a CCI require scanning all segments (no B-tree seek). For OLTP tables with frequent point lookups, use a non-clustered columnstore (see below) on top of a B-tree instead.

Non-Clustered Columnstore Index (HTAP)

A Non-Clustered Columnstore Index (NCCI) adds columnstore storage alongside an existing B-tree (clustered or heap). The original row-store remains for OLTP point lookups; the NCCI accelerates analytical aggregations on the same table. This is the HTAP (Hybrid Transactional/Analytical Processing) pattern.

-- Add NCCI to an existing OLTP table
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Analytics
ON dbo.Orders (OrderDate, ProductID, CustomerID, Amount, Quantity)
WHERE OrderDate >= '2024-01-01';  -- filtered: only recent data for analytics

The optimizer automatically chooses between the B-tree and the NCCI based on estimated cost. OLTP queries (seek by OrderID) use the clustered B-tree. Analytics queries (SUM(Amount) GROUP BY ProductID) use the NCCI with batch mode.

Updateable NCCI (SQL 2014+): In SQL Server 2014+, non-clustered columnstore indexes are updateable. Earlier versions required a table with no non-clustered columnstore to be read-only (set to readonly filegroup) to allow the NCCI.

Creating Columnstore Indexes

-- Clustered columnstore (replaces existing storage)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales
WITH (MAXDOP = 8, ONLINE = ON);  -- ONLINE = ON requires Enterprise

-- Non-clustered columnstore (alongside B-tree)
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON dbo.Orders (SaleDate, ProductID, Amount, Quantity, CustomerID);

-- With row group ordering hint (SQL 2022+)
-- Sorts data by SaleDate before building segments → better segment elimination
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales ON dbo.Sales
ORDER (SaleDate);

-- Archive compression for cold data
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesArchive ON dbo.SalesArchive
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);

Monitoring Columnstore Health

-- Row group health overview
SELECT
    OBJECT_NAME(object_id)      AS table_name,
    index_id,
    state_description,
    COUNT(*)                    AS row_group_count,
    SUM(total_rows)             AS total_rows,
    SUM(deleted_rows)           AS deleted_rows,
    100.0 * SUM(deleted_rows)
          / NULLIF(SUM(total_rows),0) AS deleted_pct,
    AVG(total_rows)             AS avg_rows_per_group,
    SUM(size_in_bytes)/1048576  AS total_mb
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.Sales')
GROUP BY object_id, index_id, state_description
ORDER BY state_description;

-- Identify poorly filled row groups (< 100K rows = trimmed)
SELECT
    OBJECT_NAME(object_id)  AS table_name,
    row_group_id,
    total_rows,
    trim_reason_desc,
    state_description
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE total_rows < 102400
  AND state_description = 'COMPRESSED'
  AND object_id = OBJECT_ID('dbo.Sales')
ORDER BY total_rows;

Columnstore Maintenance: Rebuild vs Reorganize

REORGANIZE — online, incremental:

  • Compresses all CLOSED delta stores (runs the tuple mover)
  • With COMPRESS_ALL_ROW_GROUPS = ON: also compresses small OPEN delta stores
  • Merges small compressed row groups together (SQL 2016+)
  • Does NOT remove deleted rows from compressed segments

REBUILD — offline (or online with Enterprise):

  • Removes all deleted rows from compressed segments
  • Merges small row groups into full ones
  • Re-applies optimal compression to all segments
  • Resets statistics
-- Reorganize (online, minimal blocking)
ALTER INDEX CCI_Sales ON dbo.Sales REORGANIZE
WITH (COMPRESS_ALL_ROW_GROUPS = ON);

-- Rebuild (offline by default; ONLINE requires Enterprise edition)
ALTER INDEX CCI_Sales ON dbo.Sales REBUILD
WITH (MAXDOP = 8, ONLINE = ON);

-- Rebuild recommendation: when deleted_pct > 20% or avg_rows_per_group < 500K

When to Use Columnstore Indexes

ScenarioRecommendation
Data warehouse fact table (50M+ rows, aggregation-heavy)Clustered Columnstore Index
OLTP table with analytics queries running alongside OLTPNon-Clustered Columnstore (HTAP)
Historical/archive data rarely modifiedCCI with COLUMNSTORE_ARCHIVE compression
Table with frequent single-row lookups and joins by PKKeep B-tree; add NCCI only if aggregation queries are slow
Table with <1M rowsUsually not worth it; B-tree scan is fast enough
String columns with high cardinality (GUIDs, email addresses)Compress poorly; include in CCI only if needed for queries

Read Next

← SQL Server Hub