SQL GROUP BY and HAVING: The Complete Guide

SQL GROUP BY and HAVING are the clauses you reach for the moment someone asks "how many tracks does each genre have?" or "which customers spent more than $40?", you need GROUP BY and HAVING. These two clauses turn raw rows into meaningful summaries — total revenue by country, average order size per customer, top-selling genres. They are the foundation of every dashboard, report, and data analysis query you will ever write.

In this guide you'll master both clauses with real SQL examples on the Chinook music-store database, learn the critical difference between WHERE and HAVING, and practice with hands-on exercises.

How GROUP BY Works
Aggregate Functions: COUNT, SUM, AVG, MIN, MAX
Grouping by Multiple Columns
HAVING — Filter After Grouping
WHERE vs HAVING: Know the Difference
Quick Comparison Cheat Sheet
Common Mistakes to Avoid
Practice Makes Perfect
Summary
Frequently Asked Questions

How GROUP BY Works

GROUP BY collapses rows that share the same value(s) into a single summary row. Every column in your SELECT must either appear in the GROUP BY clause or be wrapped in an aggregate function.

Here's the simplest example — counting how many tracks exist in each genre:

SELECT
  g.Name AS Genre,
  COUNT(*) AS TrackCount
FROM Track t
JOIN Genre g ON g.GenreId = t.GenreId
GROUP BY g.Name
ORDER BY TrackCount DESC
LIMIT 5;

Genre	TrackCount
Rock	1297
Latin	579
Metal	374
Alternative & Punk	332
Jazz	130

Without GROUP BY, this query would return a single count of all tracks. With it, you get one row per genre — each with its own count.

The SQL engine processes GROUP BY after FROM, JOIN, and WHERE. That means rows are first filtered, then grouped. This execution order is key to understanding when to use WHERE versus HAVING.

Try it yourself — Count tracks per album on the Chinook database →

Aggregate Functions: COUNT, SUM, AVG, MIN, MAX

GROUP BY is only useful when paired with aggregate functions. Here are the five you'll use constantly:

Function	Purpose	Example
`COUNT(*)`	Number of rows in each group	Total invoices per customer
`COUNT(column)`	Non-NULL values in the column	Customers with a company name
`SUM(column)`	Total of all values	Revenue per country
`AVG(column)`	Average value	Average invoice amount
`MIN(column)` / `MAX(column)`	Smallest / largest value	Shortest and longest track

Revenue by country

SELECT
  BillingCountry,
  COUNT(*) AS InvoiceCount,
  ROUND(SUM(Total), 2) AS TotalRevenue,
  ROUND(AVG(Total), 2) AS AvgInvoice
FROM Invoice
GROUP BY BillingCountry
ORDER BY TotalRevenue DESC
LIMIT 5;

BillingCountry	InvoiceCount	TotalRevenue	AvgInvoice
USA	91	523.06	5.75
Canada	56	303.96	5.43
France	35	195.10	5.57
Brazil	35	190.10	5.43
Germany	28	156.48	5.59

Notice how you can mix multiple aggregates in a single query. Each one independently summarizes the rows within each country group.

Try it yourself — Calculate revenue distribution by country →

Grouping by Multiple Columns

You aren't limited to grouping by one column. When you add more columns to GROUP BY, the database creates finer-grained groups — one for each unique combination of values.

Annual sales by country

SELECT
  BillingCountry,
  strftime('%Y', InvoiceDate) AS Year,
  ROUND(SUM(Total), 2) AS Revenue
FROM Invoice
WHERE BillingCountry IN ('USA', 'Canada')
GROUP BY BillingCountry, strftime('%Y', InvoiceDate)
ORDER BY BillingCountry, Year
LIMIT 6;

BillingCountry	Year	Revenue
Canada	2021	72.27
Canada	2022	74.25
Canada	2023	72.27
USA	2021	130.68
USA	2022	122.76
USA	2023	130.68

Each row represents one country-year combination. This is how you build trend analyses and time-series reports — group by both the category and the time period.

Try it yourself — Build an annual sales summary →

HAVING — Filter After Grouping

HAVING lets you filter the results of aggregation. While WHERE filters individual rows before grouping, HAVING filters entire groups after the aggregates are calculated.

Find genres with more than 100 tracks

SELECT
  g.Name AS Genre,
  COUNT(*) AS TrackCount
FROM Track t
JOIN Genre g ON g.GenreId = t.GenreId
GROUP BY g.Name
HAVING COUNT(*) > 100
ORDER BY TrackCount DESC;

Genre	TrackCount
Rock	1297
Latin	579
Metal	374
Alternative & Punk	332
Jazz	130

Without HAVING, you'd get all 25 genres. HAVING eliminates the groups that don't meet the threshold. This is something WHERE simply cannot do — you can't write WHERE COUNT(*) > 100 because WHERE runs before any aggregation happens.

Customers who spent more than $40 total

SELECT
  c.FirstName || ' ' || c.LastName AS Customer,
  ROUND(SUM(i.Total), 2) AS TotalSpent
FROM Customer c
JOIN Invoice i ON i.CustomerId = c.CustomerId
GROUP BY c.CustomerId, c.FirstName, c.LastName
HAVING SUM(i.Total) > 40
ORDER BY TotalSpent DESC
LIMIT 5;

Customer	TotalSpent
Helena Holý	49.62
Richard Cunningham	47.62
Luis Rojas	46.62
Ladislav Kovács	45.62
Hugh O'Brien	45.62

HAVING is essential for answering questions like "which X has more than N of Y" — the bread and butter of business reporting.

Try it yourself — Find the most prolific artist IDs →

WHERE vs HAVING: Know the Difference

This is the single most important concept in this guide. Here's the SQL execution order:

FROM / JOIN — assemble the rows
WHERE — filter individual rows
GROUP BY — collapse into groups
HAVING — filter groups
SELECT — compute output columns
ORDER BY — sort the results

Because WHERE runs before GROUP BY, it can only reference individual column values. Because HAVING runs after GROUP BY, it can reference aggregate results.

Same query, different filters

-- WHERE: filter rows BEFORE grouping
-- "Only look at invoices from 2023, then count per country"
SELECT BillingCountry, COUNT(*) AS Cnt
FROM Invoice
WHERE InvoiceDate >= '2023-01-01'
GROUP BY BillingCountry
ORDER BY Cnt DESC
LIMIT 3;

BillingCountry	Cnt
USA	26
Canada	16
Brazil	10

-- HAVING: filter groups AFTER grouping
-- "Count all invoices per country, then only show countries with 30+"
SELECT BillingCountry, COUNT(*) AS Cnt
FROM Invoice
GROUP BY BillingCountry
HAVING COUNT(*) >= 30
ORDER BY Cnt DESC;

BillingCountry	Cnt
USA	91
Canada	56
France	35
Brazil	35

You can (and often should) use both together: WHERE to pre-filter rows, GROUP BY to aggregate, and HAVING to post-filter the groups.

Try it yourself — Analyze average spending per invoice →

Quick Comparison Cheat Sheet

Feature	WHERE	HAVING
Runs	Before GROUP BY	After GROUP BY
Filters	Individual rows	Aggregated groups
Can use aggregates?	No	Yes
Can use column values?	Yes	Only grouped or aggregated
Required with GROUP BY?	No	No
Works without GROUP BY?	Yes	Technically yes, but rarely useful

Rule of thumb: If your condition involves COUNT, SUM, AVG, MIN, or MAX, put it in HAVING. Everything else goes in WHERE.

Common Mistakes to Avoid

1. Using an aggregate in WHERE

This is the most common GROUP BY error. WHERE does not know about aggregates.

-- Wrong: aggregate in WHERE
SELECT GenreId, COUNT(*) AS Cnt
FROM Track
WHERE COUNT(*) > 100
GROUP BY GenreId;

-- Right: use HAVING
SELECT GenreId, COUNT(*) AS Cnt
FROM Track
GROUP BY GenreId
HAVING COUNT(*) > 100;

2. Selecting a non-aggregated column without GROUP BY

Every column in SELECT must be in GROUP BY or inside an aggregate. Otherwise the database doesn't know which row's value to show.

-- Wrong: Name is not grouped or aggregated
SELECT GenreId, Name, COUNT(*)
FROM Track
GROUP BY GenreId;

-- Right: include Name in GROUP BY
SELECT g.GenreId, g.Name, COUNT(*)
FROM Track t
JOIN Genre g ON g.GenreId = t.GenreId
GROUP BY g.GenreId, g.Name;

3. Using a column alias in HAVING

In standard SQL, HAVING runs before SELECT assigns aliases. Use the full expression instead.

-- Wrong in many databases: alias in HAVING
SELECT GenreId, COUNT(*) AS TrackCount
FROM Track
GROUP BY GenreId
HAVING TrackCount > 100;

-- Right: repeat the aggregate expression
SELECT GenreId, COUNT(*) AS TrackCount
FROM Track
GROUP BY GenreId
HAVING COUNT(*) > 100;

Note: SQLite and MySQL allow aliases in HAVING, but PostgreSQL, SQL Server, and Oracle do not. Write portable SQL by repeating the expression.

4. Forgetting that GROUP BY changes your row count

After GROUP BY, you have one row per group — not one row per original record. If you expected 1000 invoice rows but got 59, that's not a bug — it's 59 customers. Always sanity-check your GROUP BY results by comparing to a simple SELECT COUNT(*) on the base table.

5. Using DISTINCT when you need GROUP BY

DISTINCT removes duplicate rows, but it can't aggregate. If you need counts, sums, or averages alongside unique values, you need GROUP BY.

-- Limited: only unique countries
SELECT DISTINCT BillingCountry FROM Invoice;

-- Better: countries WITH their invoice counts
SELECT BillingCountry, COUNT(*) AS InvoiceCount
FROM Invoice
GROUP BY BillingCountry;

Practice Makes Perfect

The best way to master GROUP BY and HAVING is to write real queries. Here are exercises that target exactly these concepts:

Exercise	What You'll Practice
Count of Tracks per Album →	Basic GROUP BY with COUNT
Revenue Distribution by Country →	SUM and GROUP BY on invoices
Average Song Length per Genre →	AVG with GROUP BY and JOINs
Most Prolific Artist IDs →	GROUP BY with HAVING filter
Track Count by Individual Composer →	Grouping with NULL handling
Annual Sales Summary →	Multi-column GROUP BY with dates

Join 500+ SQL learners already practicing on SQiLs — Browse all exercises →

Summary

GROUP BY collapses rows with the same values into summary rows — one per unique group.
Pair it with aggregate functions (COUNT, SUM, AVG, MIN, MAX) to calculate totals, averages, and extremes.
Group by multiple columns for finer breakdowns like revenue-per-country-per-year.
Use HAVING to filter groups after aggregation — it's the only way to filter on aggregate results.
WHERE filters rows before grouping; HAVING filters groups after. Use both together for precise control.
Once you're comfortable with GROUP BY, explore window functions to perform aggregations without collapsing rows.

Ready to put this into practice? Start solving exercises on SQiLs →

Frequently Asked Questions

Can I use GROUP BY without an aggregate function?

Technically yes — GROUP BY without aggregates behaves like DISTINCT, returning unique combinations of the grouped columns. However, this is considered poor style. Use DISTINCT for deduplication and reserve GROUP BY for when you actually need aggregation.

What is the difference between WHERE and HAVING in SQL?

WHERE filters individual rows before grouping happens, so it cannot reference aggregate functions. HAVING filters entire groups after aggregation, so it can use COUNT, SUM, AVG, and other aggregates. Use WHERE for row-level conditions (e.g., date ranges, specific IDs) and HAVING for group-level conditions (e.g., "groups with more than 10 rows").

Can I GROUP BY a column that is not in SELECT?

Yes. You can group by any column from your tables, even if it doesn't appear in the SELECT list. This is useful when you want to aggregate at a certain granularity but only display the result. However, it can make results harder to interpret, so include the grouping column when clarity matters.

How does GROUP BY work with JOINs?

GROUP BY runs after FROM and JOIN, so it operates on the already-joined result set. A common pattern is joining two tables (e.g., Track and Genre) and then grouping by a column from the lookup table (e.g., Genre.Name). Just make sure every non-aggregated column in SELECT is in your GROUP BY — the join doesn't change this rule.