SQL Server Statistics: How They Work, Auto-Update and Maintenance (2026)

SQL Server Statistics Guide

Statistics are the query optimizer's view of your data. When statistics are accurate, the optimizer picks efficient plans with minimal I/O. When they are stale, it mis-estimates row counts and chooses plans that scan millions of rows when a seek would do — a silent cause of sudden performance regressions after bulk loads.

This guide explains how SQL Server builds and stores statistics objects, what the histogram and density vector actually contain, how auto-update thresholds work (and where they break down), and how to build a maintenance routine that keeps statistics fresh without drowning your server in update overhead.

What Are Statistics Objects

A statistics object is a binary blob stored in the system catalog that describes the distribution of values in one or more columns. SQL Server creates statistics objects automatically when:

  • You create an index (statistics are created on the index key columns)
  • The query optimizer needs selectivity information for a column used in a WHERE clause or JOIN condition and AUTO_CREATE_STATISTICS is ON

You can also create them manually with CREATE STATISTICS. Each statistics object contains:

  • A header — row count, page count, sample rate, last update time
  • A density vector — selectivity for column prefixes
  • A histogram — up to 200 steps showing value distribution for the leading column

To list all statistics objects on a table:

SELECT
    s.name           AS stats_name,
    s.auto_created,
    s.user_created,
    s.has_filter,
    s.filter_definition,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.modification_counter
FROM sys.stats s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE s.object_id = OBJECT_ID('dbo.Orders')
ORDER BY sp.last_updated DESC;
Statistics names: Auto-created single-column statistics get names like _WA_Sys_00000003_0425A276 (the hex suffix encodes the object ID). Index statistics share the index name.

The Histogram: RANGE_HI_KEY and EQ_ROWS

The histogram captures value distribution for the leading column of a statistics object, divided into up to 200 steps. Each step covers a value range and contains five values:

DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate') WITH HISTOGRAM;

Sample output:

RANGE_HI_KEY            EQ_ROWS   RANGE_ROWS   DISTINCT_RANGE_ROWS   AVG_RANGE_ROWS
----------------------  --------  -----------  --------------------  ---------------
2024-01-31 00:00:00.000    1250      0.0              0                   1.0
2024-02-29 00:00:00.000    1190   38400.0            30                1280.0
2024-03-31 00:00:00.000    1310   41200.0            28                1471.4
...
  • RANGE_HI_KEY — the upper boundary value of this step
  • EQ_ROWS — estimated rows equal to RANGE_HI_KEY
  • RANGE_ROWS — estimated rows between the previous step's RANGE_HI_KEY and this one (exclusive)
  • DISTINCT_RANGE_ROWS — number of distinct values in RANGE_ROWS
  • AVG_RANGE_ROWS — RANGE_ROWS / DISTINCT_RANGE_ROWS — what the optimizer uses for equality predicates on non-boundary values

When you write WHERE OrderDate = '2024-03-15', the optimizer falls between two RANGE_HI_KEY boundaries and uses AVG_RANGE_ROWS as its row estimate. A value at a boundary uses EQ_ROWS exactly.

200-step limit: If a column has thousands of distinct values the histogram can only capture 200 of them. Values in dense ranges get averaged, which can produce poor estimates for highly skewed data. Filtered statistics (covered later) solve this.

The Density Vector: Multi-Column Selectivity

The density vector contains one row per column prefix. For a statistics object on (CustomerID, OrderDate, Status), you get three density rows:

DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_Customer_Date') WITH DENSITY_VECTOR;
All density   Average Length  Columns
------------- ---------------  ---------------------------------
0.0001        4.0              CustomerID
0.000025      12.0             CustomerID, OrderDate
0.0000083     13.0             CustomerID, OrderDate, Status

All density = 1 / (number of distinct values). Lower density means higher selectivity. Density for CustomerID alone of 0.0001 means 10,000 distinct values in the table (1/0.0001). The optimizer multiplies density by the table's row count to estimate rows for equality predicates beyond the first column.

Why it matters for JOINs: When joining two tables, the optimizer uses the density vector of the join key to estimate the result cardinality. Wrong densities → wrong join order → wrong join algorithm (hash match when nested loops would win, or vice versa).

Reading Statistics with DBCC SHOW_STATISTICS

The full syntax returns all three parts of a statistics object:

-- All three parts
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate');

-- Only the header (most useful for freshness checks)
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate') WITH STAT_HEADER;

-- Only histogram
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate') WITH HISTOGRAM;

-- Only density vector
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate') WITH DENSITY_VECTOR;

The header shows the most important freshness indicators:

Name                   Updated                   Rows    Rows Sampled  Steps  Density  ...
---------------------  ------------------------  ------- ------------  -----  -------  ---
IX_Orders_OrderDate    Jun 10 2026  3:15PM        9824033   984403       200   1.0E-7   ...
  • Rows — row count when statistics were last built
  • Rows Sampled — rows actually read (sample vs full scan)
  • Updated — timestamp of last statistics update
Rows Sampled vs Rows: When Rows Sampled is much less than Rows, the histogram was built on a sample. For skewed distributions, a small sample can produce very inaccurate step heights. Use WITH FULLSCAN for critical columns.

Auto-Create Statistics

With AUTO_CREATE_STATISTICS ON (the default), the optimizer silently creates single-column statistics on any column referenced in a WHERE or JOIN predicate when it has no existing statistics for that column. This happens during query compilation — the first execution of the query is delayed while statistics are built.

-- Check current setting
SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases
WHERE name = DB_NAME();

-- Disable (rarely advisable)
ALTER DATABASE YourDB SET AUTO_CREATE_STATISTICS OFF;

Disabling auto-create is useful only on OLAP systems where you control all statistics manually with full-scan updates. For OLTP, leave it ON.

Auto-Update Threshold: The 20% Rule and Trace Flag 2371

By default, SQL Server auto-updates statistics after approximately 20% of rows have changed (inserts, updates, deletes). For a 1-million-row table that means 200,000 modifications before statistics refresh. The exact formula (pre-SQL 2016) is:

threshold = MAX(500, 0.20 × table_row_count)

This was fine for small tables but catastrophic for large ones. A 500-million-row table requires 100 million modifications before auto-update fires — meanwhile plans degrade as data drifts from the histogram.

SQL Server 2016+ fix: A dynamic threshold was introduced (enabled by default since 2016 with compatibility level 130+):

-- Check compatibility level
SELECT compatibility_level FROM sys.databases WHERE name = DB_NAME();

-- For older compat levels, enable the dynamic threshold via trace flag
DBCC TRACEON(2371, -1);  -- server-wide
-- Or per-database:
ALTER DATABASE YourDB SET AUTO_UPDATE_STATISTICS_ASYNC ON;

The dynamic threshold formula is approximately SQRT(1000 × row_count), which scales sub-linearly. A 500-million-row table now updates after ~22,000 modifications rather than 100 million.

Trace Flag 2371 deprecation: In SQL Server 2016+ with compatibility level 130+, the dynamic threshold is built in. TF2371 is only needed for databases running an older compat level on a 2016+ instance.

Asynchronous Statistics Update

By default, statistics auto-update is synchronous: the query that triggers the update is blocked until statistics are rebuilt, then recompiles and executes. On a 500M-row table with WITH FULLSCAN this could block for seconds.

Asynchronous update lets the triggering query use the existing (stale) statistics and compile immediately. The update runs in the background:

ALTER DATABASE YourDB SET AUTO_UPDATE_STATISTICS_ASYNC ON;

Trade-off: the query that triggers the async update uses a potentially stale plan. The next query compilation after the background update completes will get the fresh plan. For OLTP workloads with frequent short queries, async update is almost always better — it eliminates the latency spike at the cost of one additional stale-plan query execution.

Diagnosing Stale Statistics

Several DMVs help identify statistics that need updating:

-- Tables with high modification counters relative to row count
SELECT
    OBJECT_NAME(s.object_id)    AS table_name,
    s.name                       AS stats_name,
    sp.last_updated,
    sp.rows,
    sp.modification_counter,
    CAST(100.0 * sp.modification_counter / NULLIF(sp.rows,0) AS DECIMAL(10,2)) AS pct_modified
FROM sys.stats s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE OBJECTPROPERTY(s.object_id,'IsUserTable') = 1
  AND sp.modification_counter > 0
ORDER BY pct_modified DESC;
-- Queries with large estimated vs actual row count differences (plan cache)
SELECT TOP 20
    qs.total_elapsed_time / qs.execution_count       AS avg_elapsed_us,
    qs.execution_count,
    SUBSTRING(qt.text, qs.statement_start_offset/2+1,
        (CASE WHEN qs.statement_end_offset = -1 THEN LEN(qt.text)*2
              ELSE qs.statement_end_offset END - qs.statement_start_offset)/2+1) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE qp.query_plan LIKE '%StatisticsInfo%'
ORDER BY avg_elapsed_us DESC;
Quick check per table: sys.dm_db_stats_properties returns modification_counter = number of leading-column modifications since the last statistics update. Any value approaching 20% of rows means auto-update is about to fire — or already missed if the table is large.

Manual UPDATE STATISTICS Options

The basic syntax updates all statistics on a table:

UPDATE STATISTICS dbo.Orders;

Key options:

-- Full scan (most accurate, most I/O)
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Specific statistics object only
UPDATE STATISTICS dbo.Orders IX_Orders_OrderDate WITH FULLSCAN;

-- Sample percentage (trades accuracy for speed)
UPDATE STATISTICS dbo.Orders WITH SAMPLE 30 PERCENT;

-- Sample row count
UPDATE STATISTICS dbo.Orders WITH SAMPLE 500000 ROWS;

-- Resample — reuse the same sample rate as the previous update
UPDATE STATISTICS dbo.Orders WITH RESAMPLE;

-- No recompute — update but suppress auto-update flag (prevents future auto-updates)
UPDATE STATISTICS dbo.Orders WITH NORECOMPUTE;  -- use with caution
NORECOMPUTE warning: Setting NORECOMPUTE disables auto-update for that statistics object permanently until you explicitly update it again without NORECOMPUTE. Use only in very controlled scenarios (e.g., you own the maintenance window entirely).

Filtered Statistics for Skewed Data

When a column has highly skewed distribution — for example, 95% of orders have Status = 'Completed' and 5% have Status = 'Pending' — a single histogram averages across the entire distribution. Queries filtering on Status = 'Pending' get poor estimates.

A filtered statistics object focuses the histogram on a specific subset:

-- Create filtered statistics on the sparse value
CREATE STATISTICS stat_Orders_Pending
ON dbo.Orders (OrderDate, CustomerID)
WHERE Status = 'Pending'
WITH FULLSCAN;

-- Filtered index (automatically creates filtered statistics)
CREATE INDEX IX_Orders_Pending_Date
ON dbo.Orders (OrderDate)
INCLUDE (CustomerID, Total)
WHERE Status = 'Pending';

With filtered statistics, a query WHERE Status = 'Pending' AND OrderDate > '2026-01-01' can use the filtered histogram and get an accurate estimate of the small result set, leading to a seek + lookup instead of a full table scan.

Multi-Column Statistics

SQL Server auto-creates only single-column statistics. If your queries filter on multiple columns with correlated values, the optimizer multiplies individual selectivities and underestimates the result set.

-- Example: Country + City are correlated — most Cities belong to one Country
-- Single-column stats will underestimate: P(Country='US') × P(City='New York')
-- Multi-column stats capture the correlation

CREATE STATISTICS stat_Customer_Country_City
ON dbo.Customers (Country, City)
WITH FULLSCAN;

SQL Server 2014+ introduced incremental cardinality estimation improvements, but multi-column stats on correlated columns remain valuable especially when the new cardinality estimator (CE 120+) is in use.

Incremental Statistics on Partitioned Tables

On large partitioned tables, updating statistics requires reading the entire table unless you enable incremental statistics (SQL Server 2014+). With incremental statistics, each partition maintains its own statistics fragment, and you can update just the modified partition:

-- Enable incremental statistics at table level
CREATE STATISTICS stat_Sales_Date
ON dbo.Sales (SaleDate)
WITH FULLSCAN, INCREMENTAL = ON;

-- Update only partition 12 (the current month)
UPDATE STATISTICS dbo.Sales stat_Sales_Date
WITH RESAMPLE ON PARTITIONS(12);

-- Check which partitions have stale stats
SELECT
    object_name(s.object_id)   AS table_name,
    s.name                      AS stats_name,
    ddsp.partition_number,
    ddsp.last_updated,
    ddsp.rows,
    ddsp.modification_counter
FROM sys.stats s
CROSS APPLY sys.dm_db_incremental_stats_properties(s.object_id, s.stats_id) ddsp
WHERE s.object_id = OBJECT_ID('dbo.Sales')
ORDER BY ddsp.partition_number;
Benefit: On a 5-year-old partitioned sales table with 60 partitions, incremental stats lets you update only the current month's partition after nightly loads — cutting statistics maintenance time from hours to minutes.

Statistics Maintenance Script

A practical maintenance routine updates statistics with FULLSCAN for objects with high modification ratios, and uses sampling for the rest:

-- Update all statistics where > 10% of rows modified (FULLSCAN)
-- and > 1% but <= 10% with 30% sample
DECLARE @sql NVARCHAR(MAX) = N'';

SELECT @sql += N'UPDATE STATISTICS ' +
    QUOTENAME(OBJECT_SCHEMA_NAME(s.object_id)) + '.' +
    QUOTENAME(OBJECT_NAME(s.object_id)) + ' ' +
    QUOTENAME(s.name) +
    CASE
        WHEN CAST(100.0 * sp.modification_counter / NULLIF(sp.rows,0) AS DECIMAL(10,2)) > 10
        THEN ' WITH FULLSCAN;' + CHAR(10)
        ELSE ' WITH SAMPLE 30 PERCENT;' + CHAR(10)
    END
FROM sys.stats s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE OBJECTPROPERTY(s.object_id,'IsUserTable') = 1
  AND sp.modification_counter > 0
  AND CAST(100.0 * sp.modification_counter / NULLIF(sp.rows,0) AS DECIMAL(10,2)) > 1
ORDER BY CAST(100.0 * sp.modification_counter / NULLIF(sp.rows,0) AS DECIMAL(10,2)) DESC;

PRINT @sql;
-- EXEC sp_executesql @sql;  -- Uncomment to execute
Alternative: Ola Hallengren's IndexOptimize handles statistics maintenance with sophisticated sampling logic and partition awareness — production-proven and free.

Read Next

← SQL Server Hub