SQL Server Join Operators: Nested Loops, Hash Match and Merge Join (2026)

1. Physical vs Logical Operators: How the Optimizer Translates JOINs

When you write a JOIN in T-SQL, you are expressing a logical operation — you declare what data you want, not how to retrieve it. SQL Server's cost-based query optimizer then translates that logical intent into one or more physical operators — the actual algorithms executed at runtime. For every logical JOIN, the optimizer has exactly three physical algorithms to choose from: Nested Loops, Hash Match, and Merge Join.

The distinction matters enormously for performance. The same logical query can run in milliseconds or minutes depending on which physical operator the optimizer selects and whether it selected correctly given the actual data distribution. Understanding the algorithms lets you read execution plans meaningfully, recognize bad choices, and intervene when necessary.

The Optimizer's Selection Process

The optimizer estimates the cost of each physical join algorithm using table statistics — row count estimates, column selectivity, and the presence or absence of supporting indexes. It assigns an estimated cost (in abstract "units") to each alternative and picks the cheapest. The three main inputs to this decision are:

Estimated row counts on both inputs (outer and inner)
Whether an index exists on the join column of the inner input
Whether inputs are already sorted on the join key

Logical vs physical: A logical LEFT OUTER JOIN can be physically implemented as a Nested Loops Left Outer Join, a Hash Match Left Outer Join, or a Merge Join Left Outer Join. The keyword OUTER is a logical modifier; the algorithm (Loops / Hash / Merge) is the physical choice.

Reading the Execution Plan Icon

In SQL Server Management Studio (SSMS), each join operator appears as a distinct icon in the graphical execution plan. You can hover to see the estimated vs actual row counts, the join type, and the join predicate. All three operators show an arrow indicating the outer (driving) input on top and the inner (build or probe) input below.

2. Nested Loops Join: How It Works

Nested Loops is conceptually the simplest join algorithm. SQL Server iterates through every row of the outer input (the driving side, typically the smaller table), and for each outer row it seeks into the inner input (typically the larger indexed table) to find matching rows.

-- Conceptual pseudocode for Nested Loops
FOR EACH outer_row IN outer_input:
    SEEK inner_input WHERE join_key = outer_row.join_key
    EMIT (outer_row + matching_inner_rows)

The key requirement for efficient Nested Loops is a seekable index on the inner input's join column. Without it, SQL Server must scan the entire inner table for every outer row — which becomes catastrophically expensive at scale.

When the Optimizer Chooses Nested Loops

The optimizer favors Nested Loops when:

The outer input is small (typically a few hundred to a few thousand rows)
The inner input has a supporting index on the join column (enabling a seek)
The query has a TOP, FAST n hint, or applies an outer filter that limits the outer rows early
The query runs on a single CPU thread (serial plan) — Nested Loops is natural for serial execution

-- This query is an ideal Nested Loops candidate:
-- Outer: filtered to 1 customer (1 row)
-- Inner: OrderID has a clustered index seek
SELECT c.CustomerName, o.OrderDate, o.TotalAmount
FROM   Sales.Customers c
JOIN   Sales.Orders    o ON o.CustomerID = c.CustomerID
WHERE  c.CustomerID = 42;

-- Execution plan will show:
--   Clustered Index Seek on Customers (1 row)
--   Nested Loops
--   Index Seek on Orders.IX_Orders_CustomerID (n rows)

Nested Loops and Key Lookups: When the inner index is non-clustered and does not cover all the SELECT columns, SQL Server adds a Key Lookup after the index seek to fetch the remaining columns from the clustered index. One Key Lookup per outer row — fine for 10 outer rows, devastating for 100,000.

3. Nested Loops Cost Formula: When It Becomes a Disaster

The computational complexity of Nested Loops is O(n × m) in the worst case — where n is the outer row count and m is the cost of probing the inner input per outer row. With an index seek, m is essentially O(log m_total), making the total O(n × log m). Without an index on the inner side, m becomes O(m_total) and the algorithm degrades to a full cross-product scan.

-- Disaster scenario: 1,000 outer rows × 1,000,000 inner rows, no inner index
-- The optimizer SHOULD switch to Hash Match here, but if stats are stale
-- or row estimates are wildly off, it may not.

-- Check for this pattern: look for a Nested Loops with a Table Scan
-- (not an Index Seek) on the inner side.

SELECT o.OrderID, od.ProductID, od.Quantity
FROM   Sales.Orders      o
JOIN   Sales.OrderDetail od ON od.OrderID = o.OrderID
WHERE  o.Status = 'Pending';

-- If the optimizer estimates 10 rows for Orders but actually gets 50,000:
--   50,000 outer rows × full scan of OrderDetail = catastrophic

The classic sign of a bad Nested Loops plan is a fat arrow feeding a small arrow in the graphical plan — meaning a large number of rows is driving seeks into an input that was estimated to be tiny. Check "Actual Number of Rows" vs "Estimated Number of Rows" on the outer input. A 100× mismatch indicates stale statistics.

-- Update statistics to fix bad cardinality estimates
UPDATE STATISTICS Sales.Orders WITH FULLSCAN;
UPDATE STATISTICS Sales.OrderDetail WITH FULLSCAN;

-- Or use AUTO_UPDATE_STATISTICS (enabled by default):
SELECT name, is_auto_update_stats_on
FROM   sys.databases
WHERE  name = DB_NAME();

The 1,000 outer rows rule of thumb: Nested Loops with an index seek is generally efficient up to a few thousand outer rows. Beyond ~10,000 outer rows driving seeks into a large table, Hash Match or Merge Join almost always wins. The exact threshold depends on index depth and row width.

4. Hash Match Join: Build Phase and Probe Phase

Hash Match is the workhorse algorithm for joining large, unsorted data sets. It operates in two distinct phases:

Build phase: SQL Server reads the build input (the smaller of the two inputs) and constructs an in-memory hash table. Each row's join key is hashed into a bucket. The build input is typically placed on top in the execution plan icon.
Probe phase: SQL Server streams every row from the probe input (the larger input), hashes each row's join key using the same hash function, and looks up matching rows in the hash table. Matches are emitted immediately as output rows.

-- Hash Match is ideal here: two large tables, no useful index on join key
-- Build input: Sales.Products (smaller)
-- Probe input: Sales.OrderDetail (larger)

SELECT p.ProductName, SUM(od.Quantity) AS TotalSold
FROM   Sales.Products    p
JOIN   Sales.OrderDetail od ON od.ProductID = p.ProductID
GROUP  BY p.ProductName
ORDER  BY TotalSold DESC;

-- Hash Match Join followed by a Hash Match Aggregate
-- Both inputs scanned in full once each — O(n + m) total I/O

Hash Match has O(n + m) complexity — it touches each input exactly once. This makes it dramatically better than Nested Loops when both inputs are large and unsorted. The optimizer selects Hash Match when neither input is small enough to drive efficient seeks and neither input is pre-sorted on the join key.

Why the Optimizer Places the Smaller Table on the Build Side

The hash table must fit in memory granted to the query. SQL Server uses statistics to estimate the smaller input and assigns it as the build side. If the estimate is wrong and the build input turns out to be larger than the memory grant, a hash spill to tempdb occurs — one of the most common performance killers in production workloads.

5. Hash Match Memory Grants and Spills to TempDB

Every Hash Match operator requires a memory grant — a reservation of SQL Server buffer pool memory allocated before the query begins executing. The size of the grant is calculated from the estimated row count and row width of the build input. If the actual data is larger than the estimate, the hash table overflows to tempdb in a process called a hash spill (or hash recursion at extreme scales).

-- Monitor active query memory grants
SELECT
    r.session_id,
    r.status,
    mg.requested_memory_kb,
    mg.granted_memory_kb,
    mg.used_memory_kb,
    mg.ideal_memory_kb,
    r.wait_type,
    SUBSTRING(t.text, (r.statement_start_offset/2)+1,
        ((CASE r.statement_end_offset
            WHEN -1 THEN DATALENGTH(t.text)
            ELSE r.statement_end_offset END
          - r.statement_start_offset)/2)+1) AS query_text
FROM   sys.dm_exec_requests r
JOIN   sys.dm_exec_query_memory_grants mg
         ON mg.session_id = r.session_id
CROSS  APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE  mg.granted_memory_kb > 0
ORDER  BY mg.granted_memory_kb DESC;

The key columns to monitor are granted_memory_kb vs used_memory_kb. When used_memory_kb approaches or exceeds granted_memory_kb, a spill is imminent or occurring. The ideal_memory_kb column shows what the query would need to run entirely in memory.

-- Identify queries that have spilled to tempdb in the plan cache
SELECT
    qp.query_plan,
    qs.total_spills,
    qs.total_logical_reads,
    qs.execution_count,
    qs.total_spills / qs.execution_count AS avg_spills_per_exec,
    SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.text)
            ELSE qs.statement_end_offset END
          - qs.statement_start_offset)/2)+1) AS query_text
FROM   sys.dm_exec_query_stats qs
CROSS  APPLY sys.dm_exec_sql_text(qs.sql_handle)  qt
CROSS  APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE  qs.total_spills > 0
ORDER  BY qs.total_spills DESC;

Hash spills to tempdb: A spill causes SQL Server to write the overflowing hash buckets to tempdb and re-read them during the probe phase. A single-level spill can increase query duration 2–5×. A recursive spill (level 2+) can increase it 10–100×. Always investigate total_spills > 0 in the plan cache.

Fixing Hash Spills

The root cause is almost always a cardinality underestimate. Fix strategies in priority order:

Update statistics with FULLSCAN on both join tables
Check for parameter sniffing — the cached plan may have been compiled for a small parameter value; see reading execution plans for diagnosis
Use OPTION (MIN_GRANT_PERCENT = n) or OPTION (MAX_GRANT_PERCENT = n) hints to adjust the memory grant
Add a filtered index or covering index to reduce the build input size

6. Merge Join: Sorted Inputs and O(n+m) Elegance

Merge Join is the most efficient algorithm when both inputs are already sorted on the join key in the same order. It works exactly like merging two sorted lists: SQL Server reads both inputs simultaneously, advancing whichever pointer has the smaller current value, and emits matches when both pointers point to equal keys.

-- Merge Join is natural here: both inputs come from clustered index scans
-- on their join keys, so they arrive pre-sorted.

SELECT c.CustomerName, o.OrderDate, o.TotalAmount
FROM   Sales.Customers c      -- Clustered index on CustomerID
JOIN   Sales.Orders    o
         ON o.CustomerID = c.CustomerID  -- Clustered index on CustomerID
ORDER  BY c.CustomerID;

-- Plan: Clustered Index Scan (Customers) →
--       Clustered Index Scan (Orders) →
--       Merge Join (Many-to-Many or One-to-Many)
-- No sort operators needed — data arrives pre-sorted from both scans

Merge Join's complexity is O(n + m) — like Hash Match, it touches each row exactly once. Its advantage over Hash Match is that it requires zero memory grant for the join itself (no hash table to build), making it ideal for memory-constrained environments or very wide rows. Its disadvantage is the mandatory pre-sort requirement.

When Merge Join Is Not Free

If the inputs are not already sorted, the optimizer must add explicit Sort operators before the Merge Join. A Sort operator has O(n log n) complexity and requires a memory grant of its own. When the optimizer adds two Sort operators to enable a Merge Join, it often would have been cheaper to use Hash Match instead.

-- This query forces the optimizer to sort before merge-joining
-- if no suitable sorted index exists on StatusDate:
SELECT c.CustomerName, o.OrderDate
FROM   Sales.Customers c
JOIN   Sales.Orders    o ON o.CustomerID  = c.CustomerID
                        AND o.StatusDate >= '2026-01-01'
ORDER  BY o.StatusDate;

-- If no index on StatusDate: optimizer adds Sort → Merge Join
-- Watch for Sort operator with estimated subtree cost > Hash Match alternative

Merge Join and many-to-many: A standard Merge Join handles one-to-many joins (one unique key on one side) efficiently. For many-to-many joins (duplicate keys on both sides), SQL Server must buffer one side's duplicates in a worktable, which adds overhead. Execution plans show this as "Merge Join (Many-to-Many)" with a worktable reference in I/O statistics.

7. Decision Matrix: Which Operator SQL Server Picks

The optimizer's choice can be summarized in a decision matrix based on three factors: row count estimates, index availability, and sort order of inputs.

Outer rows	Inner index?	Inputs sorted?	Typical choice	Why
Small (< 1,000)	Yes (seek)	N/A	Nested Loops	Few seeks, low total I/O
Small (< 1,000)	No	N/A	Hash Match or NL scan	NL scan still OK for tiny outer
Large (100k+)	No	No	Hash Match	Single pass each, no seek overhead
Large (100k+)	No	Both sorted on key	Merge Join	Zero memory grant, pure streaming
Large (100k+)	Yes (covering)	Index delivers sorted	Merge Join	Index scans arrive sorted, no sort cost
Medium (1k–100k)	Yes (seek)	N/A	Nested Loops or Hash	Optimizer evaluates both; statistics critical

Real-world plans often chain multiple join operators. In a five-table query, the optimizer may pick Nested Loops for the first two small tables, Hash Match for the large intermediate join, and then Merge Join for the final aggregation-friendly join. The art of reading execution plans (covered in detail in our execution plans guide) is understanding these chains and identifying the bottleneck operator.

8. Reading Join Operators in Execution Plans

In the graphical execution plan, join operators appear as icons with two arrows feeding into them — the top arrow is the outer (or build) input, the bottom arrow is the inner (or probe) input. The arrow thickness is proportional to the estimated row count; a mismatch between estimated and actual width is the first thing to investigate.

Key Properties to Check on Every Join Operator

Estimated Rows vs Actual Rows — a large ratio (e.g., 100 estimated, 500,000 actual) means stale statistics and a bad plan choice
Output List — which columns the operator passes upward; wide output lists increase memory pressure
Join Type — Inner, Left Outer, Right Outer, etc.
Predicate — the join condition evaluated; a residual predicate (a second filter applied after the join) can indicate a missing index
Warnings — yellow exclamation marks on the operator indicate spills, missing statistics, or implicit conversions on the join key

-- Use SET STATISTICS PROFILE ON to see actual row counts in text plans
SET STATISTICS PROFILE ON;

SELECT o.OrderID, c.CustomerName, SUM(od.Quantity) AS TotalQty
FROM   Sales.Orders      o
JOIN   Sales.Customers   c  ON c.CustomerID = o.CustomerID
JOIN   Sales.OrderDetail od ON od.OrderID   = o.OrderID
GROUP  BY o.OrderID, c.CustomerName;

SET STATISTICS PROFILE OFF;
-- Look at the "Rows" column vs "EstimateRows" column in output
-- Large gaps between them → stale statistics → wrong join algorithm chosen

Identifying Build vs Probe Side in Hash Match

In SSMS graphical plans, the top input of a Hash Match is the build side (smaller table, builds the hash table). The bottom input is the probe side (larger table, streams through). If you see the build side's arrow is noticeably fatter than the probe side, the optimizer has the inputs reversed — it should be building on the smaller input. This is a common symptom of a stale statistics error where the smaller table's row count is overestimated.

9. Forcing Join Hints: LOOP, HASH, MERGE

SQL Server allows you to override the optimizer's join algorithm choice using query hints. These should be used sparingly and only after profiling — the optimizer is correct the vast majority of the time, and hints prevent it from adapting when data distributions change. That said, hints are a legitimate last resort in specific situations.

-- Force Nested Loops for the entire query
SELECT o.OrderID, c.CustomerName
FROM   Sales.Orders    o
JOIN   Sales.Customers c ON c.CustomerID = o.CustomerID
WHERE  o.Status = 'Shipped'
OPTION (LOOP JOIN);

-- Force Hash Match for the entire query
SELECT o.OrderID, c.CustomerName
FROM   Sales.Orders    o
JOIN   Sales.Customers c ON c.CustomerID = o.CustomerID
OPTION (HASH JOIN);

-- Force Merge Join for the entire query
SELECT o.OrderID, c.CustomerName
FROM   Sales.Orders    o
JOIN   Sales.Customers c ON c.CustomerID = o.CustomerID
ORDER  BY o.CustomerID
OPTION (MERGE JOIN);

You can also specify join hints at the per-join level in the FROM clause using the INNER LOOP JOIN, INNER HASH JOIN, or INNER MERGE JOIN syntax. This is more surgical and preferred over query-level hints:

-- Per-join hint: force Nested Loops only for the Orders→Customers join
-- Leave the Orders→OrderDetail join for the optimizer to decide
SELECT o.OrderID, c.CustomerName, od.ProductID
FROM   Sales.Orders    o
INNER LOOP JOIN Sales.Customers c ON c.CustomerID = o.CustomerID
JOIN   Sales.OrderDetail od ON od.OrderID = o.OrderID;

-- Per-join Hash Match hint (useful when you know Hash is correct
-- but optimizer is choosing NL due to a bad cardinality estimate):
SELECT p.ProductName, COUNT(od.OrderDetailID) AS LineCount
FROM   Sales.Products    p
INNER HASH JOIN Sales.OrderDetail od ON od.ProductID = p.ProductID
GROUP  BY p.ProductName;

When join hints are dangerous: Hints lock in a physical algorithm regardless of data growth. A LOOP JOIN hint that works today (1,000 outer rows) will cause a full-scan Nested Loops disaster next year (500,000 outer rows). Always document why a hint was added and set a calendar reminder to re-evaluate it after the next data volume milestone.

10. Adaptive Joins (SQL Server 2017+): Switching at Runtime

SQL Server 2017 introduced Adaptive Joins as part of the Intelligent Query Processing (IQP) feature family. An Adaptive Join allows the optimizer to defer the choice between Nested Loops and Hash Match until runtime, after the first input has been scanned and the actual row count is known. This directly solves the worst-case scenario of stale statistics causing the wrong algorithm to be chosen at compile time.

An Adaptive Join appears in the execution plan as a single "Adaptive Join" operator icon. It has an internal threshold row count — if the actual outer row count is below the threshold, it executes as Nested Loops; if above the threshold, it switches to Hash Match.

-- Adaptive Joins are enabled by default in SQL 2017+ with database
-- compatibility level 140 or higher.
SELECT compatibility_level, name
FROM   sys.databases
WHERE  name = DB_NAME();

-- To verify Adaptive Joins are active on a specific query,
-- look for "Adaptive Join" in the graphical plan or:
SELECT *
FROM   sys.dm_exec_cached_plans cp
CROSS  APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
WHERE  CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE '%AdaptiveJoin%';

Adaptive Join Threshold

The threshold is computed at plan compilation time based on the estimated cost of both algorithms. If the actual row count at runtime is below the threshold, Nested Loops is cheaper; above, Hash Match wins. The threshold value is visible in the plan XML as AdaptiveThresholdRows.

-- Read the Adaptive Join threshold from plan XML (SQL 2017+)
SELECT
    qp.query_plan.value(
        '(//AdaptiveJoin/@AdaptiveThresholdRows)[1]', 'FLOAT') AS threshold_rows,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count AS avg_cpu_us
FROM   sys.dm_exec_query_stats qs
CROSS  APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE  CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE '%AdaptiveJoin%'
ORDER  BY avg_cpu_us DESC;

Adaptive Joins only cover Nested Loops vs Hash Match. Merge Join is not part of the adaptive decision tree because it requires pre-sorted inputs — a property that must be known at plan compilation time. To benefit from Merge Join, the inputs must arrive sorted via an index scan or a pre-sorted CTE.

11. Index Design for Join Performance

The single most effective way to improve join performance is designing covering indexes on join columns. An index "covers" a query when it satisfies all the columns referenced — both the join key (enabling a seek) and the SELECT list columns (eliminating Key Lookups).

-- Problem: a non-covering index causes Key Lookups on every outer row
-- Table: Sales.OrderDetail (OrderID, ProductID, Quantity, UnitPrice, Discount)
-- Existing index: IX_OrderDetail_OrderID covers only (OrderID)
-- Query needs: OrderID (seek key) + ProductID, Quantity, UnitPrice (output)

-- Current plan: Index Seek → Nested Loops → Key Lookup (per row)
-- With 50,000 outer rows, that is 50,000 Key Lookups

SELECT o.OrderDate, od.ProductID, od.Quantity, od.UnitPrice
FROM   Sales.Orders      o
JOIN   Sales.OrderDetail od ON od.OrderID = o.OrderID
WHERE  o.OrderDate >= '2026-01-01';

-- Fix: create a covering index that includes all needed columns
CREATE NONCLUSTERED INDEX IX_OrderDetail_OrderID_Covering
ON Sales.OrderDetail (OrderID)
INCLUDE (ProductID, Quantity, UnitPrice);

-- New plan: Index Seek (covering) → Nested Loops — Key Lookups eliminated

Filtered Indexes to Reduce Build Input Size

For Hash Match joins on filtered data, a filtered index can dramatically reduce the build input size, shrinking the hash table and eliminating spills:

-- Instead of indexing all 10,000,000 orders, index only the 50,000 pending ones
CREATE NONCLUSTERED INDEX IX_Orders_Pending_CustomerID
ON Sales.Orders (CustomerID, OrderDate)
INCLUDE (TotalAmount)
WHERE Status = 'Pending';

-- Hash Match build phase now reads 50,000 rows instead of 10,000,000
-- Memory grant shrinks by 200×; spill probability drops to near zero

For more on index design strategy and the index selection process, see our dedicated joins guide and the set-based thinking article which covers how index design intersects with query rewrites.

12. Multi-Join Queries: Join Order in Execution Plans

When a query joins three or more tables, the optimizer must decide not only which algorithm to use for each join, but also in which order to perform the joins. The join order determines the size of intermediate result sets passed between operators, which has a massive impact on total query cost.

SQL Server evaluates join reordering through a process called join enumeration. For queries with up to 4 tables, it considers all possible join orders. For 5–15 tables, it uses a cost-guided heuristic search. For 16+ tables, it uses a greedy algorithm that may not find the true optimum.

-- Three-table join: optimizer picks join order based on selectivity estimates
-- It will try to reduce intermediate row counts as early as possible

SELECT c.CustomerName, p.ProductName, SUM(od.Quantity) AS TotalQty
FROM   Sales.Customers   c
JOIN   Sales.Orders      o  ON o.CustomerID = c.CustomerID
JOIN   Sales.OrderDetail od ON od.OrderID   = o.OrderID
JOIN   Sales.Products    p  ON p.ProductID  = od.ProductID
WHERE  c.Region = 'South'    -- Highly selective: reduces c to 500 rows
  AND  o.OrderYear = 2026;   -- Moderately selective: reduces o to 200,000 rows

-- Optimizer should apply c.Region filter first (500 rows)
-- then join to Orders (Nested Loops using CustomerID seek)
-- then join to OrderDetail (Hash Match — large)
-- then join to Products (Nested Loops using ProductID seek — small outer now)

Forcing Join Order

If you are certain the optimizer is choosing a bad join order, use OPTION (FORCE ORDER) to lock in the join order exactly as written in the FROM clause:

-- Force the optimizer to join in exactly the written order
-- Use only when you have profiled and are certain of the better order
SELECT c.CustomerName, o.OrderDate
FROM   Sales.Customers c       -- Join c first
JOIN   Sales.Orders    o       -- Then o
         ON o.CustomerID = c.CustomerID
WHERE  c.CustomerID = 42
OPTION (FORCE ORDER);

13. Troubleshooting: Spills, Unexpected Sorts, and Wrong Algorithm Choices

The three most common join operator problems in production SQL Server are: Hash Match spills to tempdb, unexpected Sort operators added for Merge Join, and Nested Loops driving too many seeks due to bad cardinality estimates.

Diagnosing Hash Match Spills

-- Live spill detection using Extended Events (SQL 2012+)
-- Look for hash_spill_details event in an active XE session, or use:

-- In the graphical plan: yellow warning icon on the Hash Match operator
-- In the plan XML: look for <Warnings><HashSpillDetails>

-- Check tempdb growth during a specific session
SELECT  session_id,
        internal_objects_alloc_page_count * 8 / 1024 AS internal_mb,
        user_objects_alloc_page_count    * 8 / 1024 AS user_mb
FROM    sys.dm_db_task_space_usage
WHERE   session_id = @@SPID;

Diagnosing Unexpected Sort Operators

-- A Sort operator before a Merge Join means inputs were not pre-sorted
-- Check estimated cost of the Sort — if it is >50% of total plan cost,
-- consider whether an index can deliver the data pre-sorted instead.

-- Create an index that delivers OrderDetail rows sorted by CustomerID
-- (via the Orders join key) to avoid the sort before Merge Join:
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
ON Sales.Orders (CustomerID, OrderDate)
INCLUDE (TotalAmount, Status);

Diagnosing Wrong Join Algorithm (Bad Estimates)

-- Step 1: Find the query in plan cache with largest estimate vs actual gap
SELECT TOP 10
    qs.execution_count,
    qs.total_logical_reads,
    CAST(qp.query_plan AS NVARCHAR(MAX)) AS plan_xml,
    SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.text)
            ELSE qs.statement_end_offset END
          - qs.statement_start_offset)/2)+1) AS query_text
FROM   sys.dm_exec_query_stats qs
CROSS  APPLY sys.dm_exec_sql_text(qs.sql_handle)  qt
CROSS  APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER  BY qs.total_logical_reads DESC;

-- Step 2: Update statistics on the tables involved
UPDATE STATISTICS Sales.Orders      WITH FULLSCAN;
UPDATE STATISTICS Sales.OrderDetail WITH FULLSCAN;

-- Step 3: Clear the cached plan and re-execute
DBCC FREEPROCCACHE;  -- Only on a dev/test instance; never on production
-- On production, use: EXEC sp_recompile 'Sales.Orders';

14. Practical: Before and After — Same JOIN With and Without Index

The following example walks through a realistic tuning exercise: a slow query with a bad Nested Loops plan, the diagnosis steps, the index fix, and the resulting plan change.

Setup: The Slow Query

-- Environment: Orders table (2,000,000 rows), OrderDetail (15,000,000 rows)
-- No index on OrderDetail.CustomerID (join via Orders)
-- Statistics: last updated 6 months ago

SELECT
    c.CustomerName,
    o.OrderDate,
    SUM(od.Quantity * od.UnitPrice) AS OrderValue
FROM   Sales.Customers   c
JOIN   Sales.Orders      o  ON o.CustomerID = c.CustomerID
JOIN   Sales.OrderDetail od ON od.OrderID   = o.OrderID
WHERE  c.Region    = 'West'
  AND  o.OrderDate >= '2026-01-01'
GROUP  BY c.CustomerName, o.OrderDate;

-- ACTUAL EXECUTION PLAN BEFORE FIX:
-- Clustered Index Scan (Customers, 500 rows for Region='West')
-- Nested Loops
--   Table Scan (Orders, 2,000,000 rows — no index on CustomerID)
--   *** PROBLEM: 500 outer rows × 2M inner scan = 1 billion row comparisons ***
-- Hash Match Aggregate
-- Duration: 47 seconds. Logical reads: 28,000,000

Diagnosis Steps

-- Step 1: Check index on Orders.CustomerID
SELECT  i.name, ic.column_id, c.name AS column_name, i.type_desc
FROM    sys.indexes i
JOIN    sys.index_columns ic ON ic.object_id = i.object_id
                             AND ic.index_id  = i.index_id
JOIN    sys.columns       c  ON c.object_id  = i.object_id
                             AND c.column_id  = ic.column_id
WHERE   i.object_id = OBJECT_ID('Sales.Orders')
ORDER   BY i.name, ic.key_ordinal;
-- Result: clustered index on OrderID only. No index on CustomerID.

-- Step 2: Confirm statistics staleness
EXEC sys.sp_updatestats;        -- Quick auto-update pass
-- Then check DBCC SHOW_STATISTICS to see rowcount vs stats rows:
DBCC SHOW_STATISTICS ('Sales.Orders', 'PK_Orders') WITH STAT_HEADER;

The Fix: Add Covering Index

-- Add index that enables a seek on CustomerID and covers OrderDate + OrderID
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_Date
ON Sales.Orders (CustomerID, OrderDate)
INCLUDE (OrderID);

-- Also update statistics
UPDATE STATISTICS Sales.Orders      WITH FULLSCAN;
UPDATE STATISTICS Sales.OrderDetail WITH FULLSCAN;
UPDATE STATISTICS Sales.Customers   WITH FULLSCAN;

After: New Execution Plan

-- ACTUAL EXECUTION PLAN AFTER FIX:
-- Clustered Index Scan (Customers, 500 rows for Region='West')
-- Nested Loops
--   Index Seek on IX_Orders_CustomerID_Date (500 seeks, each returning ~50 rows)
-- Nested Loops (inner join to OrderDetail)
--   Index Seek on IX_OrderDetail_OrderID (clustered — existing)
-- Hash Match Aggregate
-- Duration: 0.4 seconds. Logical reads: 42,000
-- Improvement: 47 seconds → 0.4 seconds (117× faster)
-- Logical reads: 28,000,000 → 42,000 (667× fewer)

The lesson: The logical JOIN was identical in both queries. The only change was adding a supporting index that allowed the Nested Loops to use seeks instead of scans. The optimizer correctly chose Nested Loops all along — it just needed the index to make that choice viable. Index design and join operator analysis are inseparable disciplines.

SQL Server Join Operators: Nested Loops, Hash Match and Merge Join (2026)

1. Physical vs Logical Operators: How the Optimizer Translates JOINs

The Optimizer's Selection Process

Reading the Execution Plan Icon

2. Nested Loops Join: How It Works

When the Optimizer Chooses Nested Loops

3. Nested Loops Cost Formula: When It Becomes a Disaster

4. Hash Match Join: Build Phase and Probe Phase

Why the Optimizer Places the Smaller Table on the Build Side

5. Hash Match Memory Grants and Spills to TempDB

Fixing Hash Spills

6. Merge Join: Sorted Inputs and O(n+m) Elegance

When Merge Join Is Not Free

7. Decision Matrix: Which Operator SQL Server Picks

8. Reading Join Operators in Execution Plans

Key Properties to Check on Every Join Operator

Identifying Build vs Probe Side in Hash Match

9. Forcing Join Hints: LOOP, HASH, MERGE

10. Adaptive Joins (SQL Server 2017+): Switching at Runtime

Adaptive Join Threshold

11. Index Design for Join Performance

Filtered Indexes to Reduce Build Input Size

12. Multi-Join Queries: Join Order in Execution Plans

Forcing Join Order

13. Troubleshooting: Spills, Unexpected Sorts, and Wrong Algorithm Choices

Diagnosing Hash Match Spills

Diagnosing Unexpected Sort Operators

Diagnosing Wrong Join Algorithm (Bad Estimates)

14. Practical: Before and After — Same JOIN With and Without Index

Setup: The Slow Query

Diagnosis Steps

The Fix: Add Covering Index

After: New Execution Plan

Read Next

Reading SQL Server Execution Plans

SQL Server Execution Plan Warnings

SQL Server Series

In This Article

1. Physical vs Logical Operators: How the Optimizer Translates JOINs

The Optimizer's Selection Process

Reading the Execution Plan Icon

2. Nested Loops Join: How It Works

When the Optimizer Chooses Nested Loops

3. Nested Loops Cost Formula: When It Becomes a Disaster

4. Hash Match Join: Build Phase and Probe Phase

Why the Optimizer Places the Smaller Table on the Build Side

5. Hash Match Memory Grants and Spills to TempDB

Fixing Hash Spills

6. Merge Join: Sorted Inputs and O(n+m) Elegance

When Merge Join Is Not Free

7. Decision Matrix: Which Operator SQL Server Picks

8. Reading Join Operators in Execution Plans

Key Properties to Check on Every Join Operator

Identifying Build vs Probe Side in Hash Match

9. Forcing Join Hints: LOOP, HASH, MERGE

10. Adaptive Joins (SQL Server 2017+): Switching at Runtime

Adaptive Join Threshold

11. Index Design for Join Performance

Filtered Indexes to Reduce Build Input Size

12. Multi-Join Queries: Join Order in Execution Plans

Forcing Join Order

13. Troubleshooting: Spills, Unexpected Sorts, and Wrong Algorithm Choices

Diagnosing Hash Match Spills

Diagnosing Unexpected Sort Operators

Diagnosing Wrong Join Algorithm (Bad Estimates)

14. Practical: Before and After — Same JOIN With and Without Index

Setup: The Slow Query

Diagnosis Steps

The Fix: Add Covering Index

After: New Execution Plan

Read Next

Reading SQL Server Execution Plans

SQL Server Execution Plan Warnings

SQL Server Series

In This Article

Stay Updated with Techoral