June 10, 2026 | 18 min read | SQL Server
SQL Server's advanced grouping extensions — ROLLUP, CUBE, GROUPING SETS, PIVOT, UNPIVOT, and MERGE — let you compute subtotals, cross-tabulations, and complex aggregations in a single query pass. This guide covers every operator with real T-SQL examples you can run today.
GROUP BY collapses multiple rows that share the same values in the listed columns into a single summary row. Every column that appears in the SELECT list outside of an aggregate function (SUM, COUNT, AVG, MIN, MAX) must be present in the GROUP BY clause — SQL Server enforces this at parse time.
Common aggregate functions recap:
COUNT(*) — number of rows in the groupCOUNT(col) — non-NULL values onlyCOUNT(DISTINCT col) — distinct non-NULL valuesSUM(col) / AVG(col) — total / mean of numeric columnMIN(col) / MAX(col) — range extremesSTRING_AGG(col, sep) (SQL Server 2017+) — concatenate strings within group-- Basic GROUP BY: total sales per region
SELECT
Region,
COUNT(OrderID) AS OrderCount,
SUM(Amount) AS TotalSales,
AVG(Amount) AS AvgOrderValue
FROM dbo.Orders
GROUP BY Region
ORDER BY TotalSales DESC;
GROUP BY. If you want to keep a column in SELECT without grouping by it, wrap it in MAX() or MIN() (useful for deterministic single-value columns like a surrogate key).
WHERE filters rows before grouping; HAVING filters groups after aggregation. Using WHERE to pre-filter rows that will never contribute to a result is always more efficient than filtering with HAVING alone.
-- Find regions with more than 500 orders AND total sales > 1,000,000
SELECT
Region,
COUNT(OrderID) AS OrderCount,
SUM(Amount) AS TotalSales
FROM dbo.Orders
WHERE OrderDate >= '2025-01-01' -- filter rows FIRST (uses index)
GROUP BY Region
HAVING COUNT(OrderID) > 500
AND SUM(Amount) > 1000000
ORDER BY TotalSales DESC;
SQL Server evaluates the logical query plan in this order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. Aggregate expressions like SUM(Amount) are not yet computed during the WHERE phase, which is why you cannot write WHERE SUM(Amount) > 1000000.
WHERE so the optimizer can use indexes to reduce the row count before the hash/sort aggregation operator runs.
GROUP BY ALL was a SQL Server extension that included groups filtered out by WHERE (returning NULLs for their aggregates). It was deprecated in SQL Server 2008 and removed in SQL Server 2022. Do not use it in new code.
A common modern pattern is using DISTINCT to de-duplicate without aggregating, or combining GROUP BY with a window function for ranked or filtered aggregation:
-- Modern replacement: use a subquery or CTE instead of GROUP BY ALL
SELECT r.Region, ISNULL(s.TotalSales, 0) AS TotalSales
FROM (SELECT DISTINCT Region FROM dbo.RegionMaster) r
LEFT JOIN (
SELECT Region, SUM(Amount) AS TotalSales
FROM dbo.Orders
WHERE OrderDate >= '2025-01-01'
GROUP BY Region
) s ON r.Region = s.Region;
GROUP BY ROLLUP(col1, col2, col3) generates subtotal rows at each level of the column hierarchy, plus a grand total row. For n columns it produces n+1 grouping sets. This is far more efficient than writing multiple GROUP BY queries and combining them with UNION ALL.
-- Sales by Year → Region → Month, with subtotals at every level
SELECT
ISNULL(CAST(YEAR(OrderDate) AS VARCHAR), 'Grand Total') AS SalesYear,
ISNULL(Region, '-- All Regions') AS Region,
ISNULL(DATENAME(MONTH, OrderDate), '-- All Months') AS SalesMonth,
SUM(Amount) AS TotalSales,
COUNT(OrderID) AS OrderCount
FROM dbo.Orders
GROUP BY ROLLUP(
YEAR(OrderDate),
Region,
DATENAME(MONTH, OrderDate)
)
ORDER BY
YEAR(OrderDate),
Region,
MONTH(MIN(OrderDate));
The query above produces:
GROUPING(col) (Section 7) to tell them apart.
GROUP BY CUBE(col1, col2, col3) generates subtotals for every possible combination of the listed columns — 2n grouping sets for n columns. It is ideal for cross-tabulation / OLAP-style reporting where you need all marginal totals simultaneously.
-- Cross-tab: sales by Region × ProductCategory × SalesRep
-- Generates all 8 combinations (2^3) including grand total
SELECT
ISNULL(Region, 'All Regions') AS Region,
ISNULL(ProductCategory, 'All Categories') AS ProductCategory,
ISNULL(SalesRep, 'All Reps') AS SalesRep,
SUM(Amount) AS TotalSales
FROM dbo.Orders
GROUP BY CUBE(Region, ProductCategory, SalesRep)
ORDER BY
GROUPING(Region),
GROUPING(ProductCategory),
GROUPING(SalesRep),
Region, ProductCategory, SalesRep;
Ordering by GROUPING() puts detail rows first and grand totals last — useful for report rendering. CUBE with many columns can produce very large result sets; prefer GROUPING SETS when you need only a subset of combinations.
GROUPING SETS lets you specify exactly which grouping combinations you want, without the combinatorial explosion of CUBE. It replaces multiple GROUP BY queries stitched together with UNION ALL — and it processes the data in a single scan, making it significantly faster.
-- Equivalent to three separate GROUP BY + UNION ALL, but far more efficient
SELECT
Region,
ProductCategory,
SalesRep,
SUM(Amount) AS TotalSales
FROM dbo.Orders
GROUP BY GROUPING SETS (
(Region, ProductCategory), -- subtotal per region+category
(Region), -- subtotal per region only
(SalesRep), -- subtotal per sales rep
() -- grand total (empty set)
)
ORDER BY
GROUPING(Region),
GROUPING(ProductCategory),
Region, ProductCategory;
() in GROUPING SETS means the grand total row — equivalent to a SELECT SUM(Amount) FROM dbo.Orders with no GROUP BY.
GROUPING SETS also composes with ROLLUP and CUBE inside a single clause:
-- Mixed: ROLLUP on (Year, Region) PLUS a separate SalesRep subtotal
GROUP BY GROUPING SETS (
ROLLUP(YEAR(OrderDate), Region),
(SalesRep)
)
When ROLLUP or CUBE introduce NULL-filled subtotal rows, you need a way to distinguish them from genuine NULL data values. The GROUPING(col) function returns 1 when the column is aggregated (i.e., it is a subtotal row for that column) and 0 when it is a real group-by value. GROUPING_ID(col1, col2, ...) packs all the GROUPING bits into a single integer bitmask.
-- Use GROUPING() to label rows correctly even when data has real NULLs
SELECT
CASE GROUPING(Region)
WHEN 1 THEN 'Grand Total'
ELSE ISNULL(Region, '(Unknown Region)')
END AS Region,
CASE GROUPING(ProductCategory)
WHEN 1 THEN '-- All Categories'
ELSE ISNULL(ProductCategory, '(Unassigned)')
END AS ProductCategory,
SUM(Amount) AS TotalSales,
GROUPING(Region) AS IsRegionSubtotal,
GROUPING(ProductCategory) AS IsCatSubtotal,
GROUPING_ID(Region, ProductCategory) AS GroupingLevel
FROM dbo.Orders
GROUP BY ROLLUP(Region, ProductCategory)
ORDER BY GroupingLevel, Region, ProductCategory;
GROUPING_ID values for a two-column ROLLUP: 0 = detail, 1 = Region subtotal (Category rolled up), 3 = grand total.
PIVOT rotates distinct values from one column into multiple output columns, applying an aggregate to each cell. When the set of pivot values is known at design time, a static PIVOT is simple and performs well.
-- Pivot quarterly sales: rows = Region, columns = Q1 Q2 Q3 Q4
SELECT
Region,
ISNULL([Q1], 0) AS Q1_Sales,
ISNULL([Q2], 0) AS Q2_Sales,
ISNULL([Q3], 0) AS Q3_Sales,
ISNULL([Q4], 0) AS Q4_Sales,
ISNULL([Q1], 0) + ISNULL([Q2], 0) +
ISNULL([Q3], 0) + ISNULL([Q4], 0) AS YearTotal
FROM (
-- Source query: produce region, quarter label, and amount
SELECT
Region,
'Q' + CAST(DATEPART(QUARTER, OrderDate) AS CHAR(1)) AS Quarter,
Amount
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2025
) AS SourceData
PIVOT (
SUM(Amount)
FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS PivotTable
ORDER BY Region;
SUM) is applied to the value column; FOR col IN (list) specifies which distinct values become column headers. Any column in the source query that is not listed in FOR and is not the value column automatically becomes a GROUP BY column.
When the pivot column values are not known at query-write time (e.g., product names change, new regions are added), you must build the column list at runtime using dynamic SQL.
-- Dynamic PIVOT stored procedure: pivot sales by any product
CREATE OR ALTER PROCEDURE dbo.usp_SalesByProductPivot
@Year INT = 2025
AS
BEGIN
SET NOCOUNT ON;
-- Step 1: collect distinct pivot column names (sanitised)
DECLARE @cols NVARCHAR(MAX) = '';
DECLARE @colsNull NVARCHAR(MAX) = '';
SELECT
@cols = @cols + ', ' + QUOTENAME(ProductName),
@colsNull = @colsNull + ', ISNULL(' + QUOTENAME(ProductName) + ', 0) AS ' + QUOTENAME(ProductName)
FROM (
SELECT DISTINCT TOP 200 ProductName
FROM dbo.Orders
WHERE YEAR(OrderDate) = @Year
ORDER BY ProductName
) t;
-- Remove leading comma
SET @cols = STUFF(@cols, 1, 2, '');
SET @colsNull = STUFF(@colsNull, 1, 2, '');
-- Step 2: build the full dynamic SQL string
DECLARE @sql NVARCHAR(MAX) = N'
SELECT Region, ' + @colsNull + N'
FROM (
SELECT Region, ProductName, Amount
FROM dbo.Orders
WHERE YEAR(OrderDate) = ' + CAST(@Year AS NVARCHAR) + N'
) AS src
PIVOT (
SUM(Amount)
FOR ProductName IN (' + @cols + N')
) AS pvt
ORDER BY Region;';
-- Step 3: execute safely
EXEC sp_executesql @sql;
END;
GO
-- Usage
EXEC dbo.usp_SalesByProductPivot @Year = 2025;
QUOTENAME() around column values before embedding them in dynamic SQL. Never concatenate raw user input into a dynamic query string.
UNPIVOT is the inverse of PIVOT — it normalises a wide (denormalised) table with many columns back into a tall (normalised) table with fewer columns. This is common when consuming reporting data that arrived pre-pivoted.
-- Source: wide table with one column per quarter
-- Goal: normalise to (Region, Quarter, Sales)
SELECT Region, Quarter, Sales
FROM (
SELECT Region, Q1_Sales, Q2_Sales, Q3_Sales, Q4_Sales
FROM dbo.RegionalQuarterlySales
WHERE SalesYear = 2025
) AS WideSrc
UNPIVOT (
Sales FOR Quarter IN (Q1_Sales, Q2_Sales, Q3_Sales, Q4_Sales)
) AS UnpivotResult
ORDER BY Region, Quarter;
UNPIVOT automatically excludes NULL cells — it only returns rows where the value column is non-NULL. If you need NULLs in the output, use a CROSS APPLY VALUES pattern instead:
-- CROSS APPLY VALUES: normalise including NULLs
SELECT r.Region, v.Quarter, v.Sales
FROM dbo.RegionalQuarterlySales r
CROSS APPLY (VALUES
('Q1', r.Q1_Sales),
('Q2', r.Q2_Sales),
('Q3', r.Q3_Sales),
('Q4', r.Q4_Sales)
) AS v(Quarter, Sales)
WHERE r.SalesYear = 2025
ORDER BY r.Region, v.Quarter;
MERGE synchronises a target table with a source dataset in a single atomic statement. It is the standard upsert pattern in SQL Server — replacing the error-prone "check then insert/update" antipattern.
-- Upsert: sync staging data into the production sales summary
MERGE dbo.SalesSummary AS target
USING (
SELECT Region, ProductCategory, SUM(Amount) AS TotalSales, COUNT(*) AS OrderCount
FROM dbo.OrdersStaging
GROUP BY Region, ProductCategory
) AS source
ON target.Region = source.Region
AND target.ProductCategory = source.ProductCategory
-- Row exists in both: update the totals
WHEN MATCHED THEN
UPDATE SET
target.TotalSales = source.TotalSales,
target.OrderCount = source.OrderCount,
target.UpdatedAt = SYSUTCDATETIME()
-- Row exists in source only: insert new summary row
WHEN NOT MATCHED BY TARGET THEN
INSERT (Region, ProductCategory, TotalSales, OrderCount, CreatedAt)
VALUES (source.Region, source.ProductCategory,
source.TotalSales, source.OrderCount, SYSUTCDATETIME())
-- Row exists in target only (stale/removed): delete it
WHEN NOT MATCHED BY SOURCE THEN
DELETE
-- Optional: capture what happened
OUTPUT
$action AS MergeAction,
inserted.Region,
inserted.ProductCategory,
inserted.TotalSales;
GROUP BY or a CTE with ROW_NUMBER().
SQL Server's query optimizer recognises ROLLUP/CUBE/GROUPING SETS and can handle them with a single table scan + sort/hash aggregation pass. Multiple GROUP BY … UNION ALL queries each require a separate scan. On large fact tables with hundreds of millions of rows, ROLLUP can be 3–10× faster.
-- Covering index for the quarterly pivot query
CREATE NONCLUSTERED INDEX IX_Orders_Region_Date_Amount
ON dbo.Orders (Region, OrderDate)
INCLUDE (Amount, OrderID, ProductName);
-- Or go columnstore for OLAP workloads
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_Orders_CS
ON dbo.Orders (Region, ProductName, OrderDate, Amount, OrderID);
SET STATISTICS IO, TIME ON and compare logical reads and CPU time between ROLLUP and equivalent UNION ALL queries on your data volume.
Combining GROUPING SETS with a subsequent PIVOT gives you a compact, single-pass sales dashboard that covers every dimension your reporting layer needs.
-- Step 1: compute all needed groupings in one pass with GROUPING SETS
WITH SalesAgg AS (
SELECT
ISNULL(Region, 'All') AS Region,
ISNULL(ProductCategory, 'All') AS ProductCategory,
ISNULL(CAST(YEAR(OrderDate) AS VARCHAR(4)), 'All') AS SalesYear,
SUM(Amount) AS TotalSales,
COUNT(OrderID) AS OrderCount,
GROUPING_ID(Region, ProductCategory,
YEAR(OrderDate)) AS GrpLevel
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01'
GROUP BY GROUPING SETS (
(Region, ProductCategory, YEAR(OrderDate)), -- full detail
(Region, ProductCategory), -- subtotal: omit year
(Region), -- subtotal: region only
() -- grand total
)
),
-- Step 2: pivot the years into columns
Pivoted AS (
SELECT *
FROM SalesAgg
PIVOT (
SUM(TotalSales)
FOR SalesYear IN ([2024], [2025], [2026], [All])
) AS pvt
)
-- Step 3: final output with year-over-year growth
SELECT
Region,
ProductCategory,
GrpLevel,
ISNULL([2024], 0) AS Sales_2024,
ISNULL([2025], 0) AS Sales_2025,
ISNULL([2026], 0) AS Sales_2026,
ISNULL([All], 0) AS GrandTotal,
CASE WHEN ISNULL([2024], 0) = 0 THEN NULL
ELSE ROUND(100.0 * (ISNULL([2025], 0) - ISNULL([2024], 0))
/ ISNULL([2024], 0), 2)
END AS YoY_Growth_Pct
FROM Pivoted
ORDER BY GrpLevel, Region, ProductCategory;
This pattern produces a ready-to-display matrix that a reporting tool (SSRS, Power BI, or a web front-end) can render directly as a cross-tab — no client-side pivot logic required.
| Feature | Purpose | Rows Generated |
|---|---|---|
GROUP BY | Basic aggregation | One per distinct key combination |
ROLLUP(a,b,c) | Hierarchical subtotals | n+1 grouping sets |
CUBE(a,b,c) | All cross-tab combinations | 2n grouping sets |
GROUPING SETS | Custom combinations (most flexible) | One per listed set |
PIVOT | Rows → columns | One per grouping column value |
UNPIVOT | Columns → rows | One per listed column |
MERGE | Upsert (insert+update+delete) | Affects target rows matched/unmatched |