SQL Server Memory Internals: Buffer Pool, Plan Cache and Memory Grants (2026)
SQL Server is a memory-hungry engine by design. Given enough RAM, it caches every frequently read data page and every reusable query plan — eliminating physical I/O entirely for warm workloads. Understanding how the buffer pool, plan cache, and workspace memory pools work lets you size memory correctly, diagnose pressure before it becomes a crisis, and tune the right knobs when something goes wrong.
- Memory Architecture Overview
- The Buffer Pool: Data and Index Page Cache
- Buffer Pool Internals: Lazywriter and Clock Algorithm
- NUMA-Aware Memory
- Plan Cache: Structure and Memory Clerks
- Plan Cache Pressure and Single-Use Plans
- Workspace Memory: Grants for Sort and Hash
- RESOURCE_SEMAPHORE: Memory Grant Waits
- max server memory: Sizing and Configuration
- Diagnosing Memory Pressure
- Key Memory Clerks and What They Mean
- Read Next
Memory Architecture Overview
SQL Server manages its own memory within the limits set by max server memory. The total memory pool is divided between several consumers:
Total SQL Server Memory (bounded by max server memory)
├── Buffer Pool
│ ├── Data / index page cache (largest consumer)
│ ├── Free buffer list
│ └── Stolen memory (non-page-pool allocations below)
├── Plan Cache (stolen from buffer pool)
├── Workspace Memory (sort, hash, index build grants)
├── Connection memory (per-connection overhead)
├── Lock manager memory
├── CLR / Extended Stored Procedures
└── Other clerks (XML, full-text, etc.)
The key insight is that stolen memory (everything except data pages) comes out of the same pool as the buffer pool. When plan cache or workspace memory grows, fewer data pages can be cached. SQL Server's memory broker tries to balance these consumers dynamically.
The Buffer Pool: Data and Index Page Cache
The buffer pool is a hash-table-indexed cache of 8 KB data and index pages. Every read from disk goes through the buffer pool:
- SQL Server checks the buffer pool hash table for the requested page (database_id + file_id + page_id)
- Buffer hit: page found → return immediately (no I/O)
- Buffer miss: allocate a free buffer frame, read the page from disk into it, update the hash table, return to the caller (PAGEIOLATCH wait)
The buffer hit ratio is the most important single memory metric:
SELECT
cntr_value AS buffer_cache_hit_ratio
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Buffer cache hit ratio'
AND object_name LIKE '%Buffer Manager%';
A healthy OLTP system should have a hit ratio above 99%. Values below 95% indicate that the working set does not fit in the buffer pool — add RAM or reduce the data footprint.
-- Buffer pool usage by database
SELECT
DB_NAME(database_id) AS db_name,
COUNT(*) * 8 / 1024 AS cached_mb,
SUM(CAST(is_modified AS INT)) * 8 / 1024 AS dirty_mb
FROM sys.dm_os_buffer_descriptors
WHERE database_id > 4 -- exclude system databases
GROUP BY database_id
ORDER BY cached_mb DESC;
Buffer Pool Internals: Lazywriter and Clock Algorithm
When the buffer pool is full and a new page needs to be loaded, SQL Server must evict an existing page. It uses a clock sweep (second-chance LRU) algorithm:
- Each buffer frame has a reference count that increments each time the page is accessed
- The lazywriter background thread sweeps through buffer frames, decrementing reference counts
- Frames that reach zero reference count become candidates for eviction
- Dirty frames (modified pages not yet written to disk) must be written to disk before eviction
- The lazywriter also writes dirty pages proactively to keep a pool of clean, evictable frames ready
When the lazywriter cannot keep up with eviction demand (too many dirty pages to flush), SQL Server may start issuing synchronous I/O to free frames — causing latency spikes. This shows up as elevated PAGEIOLATCH_EX and increased checkpoint pages/sec.
-- Buffer pool pressure indicators
SELECT
cntr_value AS value,
counter_name
FROM sys.dm_os_performance_counters
WHERE counter_name IN (
'Lazy writes/sec',
'Checkpoint pages/sec',
'Page life expectancy',
'Free pages'
)
AND object_name LIKE '%Buffer Manager%';
NUMA-Aware Memory
Modern multi-socket servers use Non-Uniform Memory Access (NUMA) architecture — each CPU socket has a local memory bank that it accesses faster than remote memory on another socket. SQL Server is fully NUMA-aware:
- The buffer pool is partitioned into NUMA nodes — each node has its own free list and its own lazywriter thread
- Worker threads are assigned to schedulers on the same NUMA node as the memory they access where possible
- Large memory grants are allocated from the NUMA node local to the executing thread
-- NUMA node memory breakdown
SELECT
memory_node_id,
virtual_address_space_reserved_kb / 1024 AS reserved_mb,
virtual_address_space_committed_kb / 1024 AS committed_mb,
locked_page_allocations_kb / 1024 AS locked_pages_mb,
pages_kb / 1024 AS pages_mb
FROM sys.dm_os_memory_nodes
ORDER BY memory_node_id;
Plan Cache: Structure and Memory Clerks
Compiled query plans are cached in the plan cache so subsequent executions can skip the compilation step. The plan cache is a set of hash buckets within the buffer pool's stolen memory region.
Each cached plan has a usecounts (reference count) and an objtype:
-- Top 20 plans by memory usage
SELECT TOP 20
cp.usecounts,
cp.size_in_bytes / 1024 AS size_kb,
cp.objtype,
cp.cacheobjtype,
SUBSTRING(qt.text, 1, 200) AS sql_text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) qt
ORDER BY cp.size_in_bytes DESC;
-- Plan cache summary by object type
SELECT
objtype,
COUNT(*) AS plan_count,
SUM(size_in_bytes)/1048576 AS total_mb,
AVG(usecounts) AS avg_usecounts
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY total_mb DESC;
Object types:
- Proc — stored procedure plans
- Prepared — parameterized queries via sp_executesql
- Adhoc — literal SQL strings (may be single-use plans)
- View, Trigger, UsrTab — other object types
Plan Cache Pressure and Single-Use Plans
On systems running ad-hoc SQL with literal values (not parameterized), the plan cache fills with thousands of single-use plans — compiled once, never reused. This wastes memory and CPU on compilation.
-- Find single-use plans consuming memory
SELECT
SUM(CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) / 1048576 AS single_use_mb,
SUM(size_in_bytes) / 1048576 AS total_plan_cache_mb,
100.0 * SUM(CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END)
/ NULLIF(SUM(size_in_bytes), 0) AS single_use_pct
FROM sys.dm_exec_cached_plans
WHERE objtype = 'Adhoc';
If single-use plans are >30% of the plan cache, enable optimize for ad hoc workloads:
-- Store only a plan stub on first execution; full plan only if query runs again
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;
This reduces plan cache memory by 50–80% on ad-hoc-heavy systems with no performance downside for queries that run more than once.
ALTER DATABASE YourDB SET PARAMETERIZATION FORCED tells SQL Server to auto-parameterize literal values in ad-hoc queries, reusing plans across different literal values. Use carefully — it can cause parameter sniffing issues for queries with skewed distributions.
Workspace Memory: Grants for Sort and Hash
Queries with sort (ORDER BY, GROUP BY) or hash (hash join, hash aggregate) operations request a memory grant before executing. The grant comes from the workspace memory pool — a separate region from the buffer pool, bounded by max server memory × 75% total, with individual query grants capped.
The query optimizer estimates the required grant based on cardinality estimates. If the estimate is accurate, the query runs in-memory. If the actual data is larger than estimated, the query spills to TempDB.
-- Active memory grants
SELECT
session_id,
requested_memory_kb / 1024 AS requested_mb,
granted_memory_kb / 1024 AS granted_mb,
used_memory_kb / 1024 AS used_mb,
max_used_memory_kb / 1024 AS max_used_mb,
queue_id,
wait_order,
is_next_candidate,
grant_time
FROM sys.dm_exec_query_memory_grants
ORDER BY granted_memory_kb DESC;
If max_used_memory_kb is much less than granted_memory_kb, the query received an over-estimated grant — memory is being wasted while other queries queue for grants.
RESOURCE_SEMAPHORE: Memory Grant Waits
When the workspace memory pool is exhausted, new queries requiring memory grants queue behind a resource semaphore. The wait type RESOURCE_SEMAPHORE accumulates on these queued sessions.
-- Check resource semaphore state
SELECT
resource_semaphore_id,
target_memory_kb / 1024 AS target_mb,
max_target_memory_kb / 1024 AS max_target_mb,
total_memory_kb / 1024 AS total_mb,
available_memory_kb / 1024 AS available_mb,
granted_memory_kb / 1024 AS granted_mb,
used_memory_kb / 1024 AS used_mb,
grantee_count,
waiter_count,
timeout_error_count
FROM sys.dm_exec_query_resource_semaphores;
If waiter_count > 0 persistently, you have memory grant contention. Root causes:
- Too many concurrent queries with large grants (reduce MAXDOP to reduce per-query parallelism)
- Bad cardinality estimates causing over-sized grants (update statistics)
- Insufficient
max server memoryleaving too little for workspace - In-memory OLTP or other large non-buffer-pool allocations squeezing workspace
max server memory: Sizing and Configuration
The default max server memory is 2,147,483,647 MB (unlimited) — SQL Server will consume all available RAM, leaving nothing for the OS. Always set this explicitly.
Recommended sizing formula:
max server memory = Total RAM
- OS reservation (4–8 GB for Windows)
- 1 GB per 4 GB RAM above 16 GB (OS paging overhead)
- Other SQL Server instances (if multiple)
- SSIS, SSRS, SSAS if co-located
Practical examples:
| Total RAM | Recommended max server memory |
|---|---|
| 16 GB | 12 GB |
| 32 GB | 26 GB |
| 64 GB | 54 GB |
| 128 GB | 110 GB |
| 256 GB | 220 GB |
-- Set max server memory
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 54000; -- 54 GB for a 64 GB server
RECONFIGURE;
-- Verify
SELECT name, value_in_use FROM sys.configurations WHERE name = 'max server memory (MB)';
min server memory prevents buffer pool shrinkage during brief OS pressure (e.g., a backup agent spike). Set it to 25–50% of max server memory on dedicated database servers.
Diagnosing Memory Pressure
-- Overall memory state
SELECT
physical_memory_in_use_kb / 1024 AS sql_using_mb,
locked_page_allocations_kb / 1024 AS locked_pages_mb,
page_fault_count,
memory_utilization_percentage
FROM sys.dm_os_process_memory;
-- Memory grants and pending grants (workspace pressure)
SELECT waiter_count, grantee_count, available_memory_kb / 1024 AS avail_mb
FROM sys.dm_exec_query_resource_semaphores;
-- Page life expectancy trend
SELECT
cntr_value AS ple_seconds,
GETDATE() AS captured_at
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
AND object_name LIKE '%Buffer Manager%';
-- Memory clerk breakdown
SELECT TOP 15
type,
SUM(pages_kb) / 1024 AS pages_mb,
SUM(virtual_memory_committed_kb) / 1024 AS virtual_mb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY pages_mb DESC;
MEMORYCLERK_SQLBUFFERPOOL (buffer pool — should be largest), CACHESTORE_SQLCP (SQL plans), CACHESTORE_OBJCP (object plans), MEMORYCLERK_SQLQUERYPLAN (workspace grants).
Key Memory Clerks and What They Mean
| Clerk Type | What It Holds | Concern Threshold |
|---|---|---|
| MEMORYCLERK_SQLBUFFERPOOL | Data and index pages | Should be 70–85% of total; lower = not enough RAM |
| CACHESTORE_SQLCP | Ad-hoc SQL plans | >500 MB suggests single-use plan bloat |
| CACHESTORE_OBJCP | Stored procedure / trigger plans | >1 GB may indicate large plan counts |
| MEMORYCLERK_SQLQUERYPLAN | Workspace memory grants | Spikes during large sort/hash operations |
| MEMORYCLERK_SQLSTORENG | Storage engine per-thread structures | High with many concurrent connections |
| MEMORYCLERK_XTP | In-memory OLTP (Hekaton) tables | Grows with in-memory table data; monitor growth |
| MEMORYCLERK_SOSNODE | SQLOS node-level structures | Fixed overhead per NUMA node |