SQL GROUP BY and HAVING are the clauses you reach for the moment someone asks "how many tracks does each genre have?" or "which customers spent more than $40?", you need GROUP BY and HAVING. These two clauses turn raw rows into meaningful summaries — total revenue by country, average order size per customer, top-selling genres. They are the foundation of every dashboard, report, and data analysis query you will ever write.
In this guide you'll master both clauses with real SQL examples on the Chinook music-store database, learn the critical difference between WHERE and HAVING, and practice with hands-on exercises.
Table of Contents
- How GROUP BY Works
- Aggregate Functions: COUNT, SUM, AVG, MIN, MAX
- Grouping by Multiple Columns
- HAVING — Filter After Grouping
- WHERE vs HAVING: Know the Difference
- Quick Comparison Cheat Sheet
- Common Mistakes to Avoid
- Practice Makes Perfect
- Summary
- Frequently Asked Questions
How GROUP BY Works
GROUP BY collapses rows that share the same value(s) into a single summary row. Every column in your SELECT must either appear in the GROUP BY clause or be wrapped in an aggregate function.
Here's the simplest example — counting how many tracks exist in each genre:
SELECT
g.Name AS Genre,
COUNT(*) AS TrackCount
FROM Track t
JOIN Genre g ON g.GenreId = t.GenreId
GROUP BY g.Name
ORDER BY TrackCount DESC
LIMIT 5;
| Genre | TrackCount |
|---|---|
| Rock | 1297 |
| Latin | 579 |
| Metal | 374 |
| Alternative & Punk | 332 |
| Jazz | 130 |
Without GROUP BY, this query would return a single count of all tracks. With it, you get one row per genre — each with its own count.
The SQL engine processes GROUP BY after FROM, JOIN, and WHERE. That means rows are first filtered, then grouped. This execution order is key to understanding when to use WHERE versus HAVING.
Try it yourself — Count tracks per album on the Chinook database →
Aggregate Functions: COUNT, SUM, AVG, MIN, MAX
GROUP BY is only useful when paired with aggregate functions. Here are the five you'll use constantly:
| Function | Purpose | Example |
|---|---|---|
COUNT(*) | Number of rows in each group | Total invoices per customer |
COUNT(column) | Non-NULL values in the column | Customers with a company name |
SUM(column) | Total of all values | Revenue per country |
AVG(column) | Average value | Average invoice amount |
MIN(column) / MAX(column) | Smallest / largest value | Shortest and longest track |
Revenue by country
SELECT
BillingCountry,
COUNT(*) AS InvoiceCount,
ROUND(SUM(Total), 2) AS TotalRevenue,
ROUND(AVG(Total), 2) AS AvgInvoice
FROM Invoice
GROUP BY BillingCountry
ORDER BY TotalRevenue DESC
LIMIT 5;
| BillingCountry | InvoiceCount | TotalRevenue | AvgInvoice |
|---|---|---|---|
| USA | 91 | 523.06 | 5.75 |
| Canada | 56 | 303.96 | 5.43 |
| France | 35 | 195.10 | 5.57 |
| Brazil | 35 | 190.10 | 5.43 |
| Germany | 28 | 156.48 | 5.59 |
Notice how you can mix multiple aggregates in a single query. Each one independently summarizes the rows within each country group.
Try it yourself — Calculate revenue distribution by country →
Grouping by Multiple Columns
You aren't limited to grouping by one column. When you add more columns to GROUP BY, the database creates finer-grained groups — one for each unique combination of values.
Annual sales by country
SELECT
BillingCountry,
strftime('%Y', InvoiceDate) AS Year,
ROUND(SUM(Total), 2) AS Revenue
FROM Invoice
WHERE BillingCountry IN ('USA', 'Canada')
GROUP BY BillingCountry, strftime('%Y', InvoiceDate)
ORDER BY BillingCountry, Year
LIMIT 6;
| BillingCountry | Year | Revenue |
|---|---|---|
| Canada | 2021 | 72.27 |
| Canada | 2022 | 74.25 |
| Canada | 2023 | 72.27 |
| USA | 2021 | 130.68 |
| USA | 2022 | 122.76 |
| USA | 2023 | 130.68 |
Each row represents one country-year combination. This is how you build trend analyses and time-series reports — group by both the category and the time period.
Try it yourself — Build an annual sales summary →
HAVING — Filter After Grouping
HAVING lets you filter the results of aggregation. While WHERE filters individual rows before grouping, HAVING filters entire groups after the aggregates are calculated.
Find genres with more than 100 tracks
SELECT
g.Name AS Genre,
COUNT(*) AS TrackCount
FROM Track t
JOIN Genre g ON g.GenreId = t.GenreId
GROUP BY g.Name
HAVING COUNT(*) > 100
ORDER BY TrackCount DESC;
| Genre | TrackCount |
|---|---|
| Rock | 1297 |
| Latin | 579 |
| Metal | 374 |
| Alternative & Punk | 332 |
| Jazz | 130 |
Without HAVING, you'd get all 25 genres. HAVING eliminates the groups that don't meet the threshold. This is something WHERE simply cannot do — you can't write WHERE COUNT(*) > 100 because WHERE runs before any aggregation happens.
Customers who spent more than $40 total
SELECT
c.FirstName || ' ' || c.LastName AS Customer,
ROUND(SUM(i.Total), 2) AS TotalSpent
FROM Customer c
JOIN Invoice i ON i.CustomerId = c.CustomerId
GROUP BY c.CustomerId, c.FirstName, c.LastName
HAVING SUM(i.Total) > 40
ORDER BY TotalSpent DESC
LIMIT 5;
| Customer | TotalSpent |
|---|---|
| Helena Holý | 49.62 |
| Richard Cunningham | 47.62 |
| Luis Rojas | 46.62 |
| Ladislav Kovács | 45.62 |
| Hugh O'Brien | 45.62 |
HAVING is essential for answering questions like "which X has more than N of Y" — the bread and butter of business reporting.
Try it yourself — Find the most prolific artist IDs →
WHERE vs HAVING: Know the Difference
This is the single most important concept in this guide. Here's the SQL execution order:
- FROM / JOIN — assemble the rows
- WHERE — filter individual rows
- GROUP BY — collapse into groups
- HAVING — filter groups
- SELECT — compute output columns
- ORDER BY — sort the results
Because WHERE runs before GROUP BY, it can only reference individual column values. Because HAVING runs after GROUP BY, it can reference aggregate results.
Same query, different filters
-- WHERE: filter rows BEFORE grouping
-- "Only look at invoices from 2023, then count per country"
SELECT BillingCountry, COUNT(*) AS Cnt
FROM Invoice
WHERE InvoiceDate >= '2023-01-01'
GROUP BY BillingCountry
ORDER BY Cnt DESC
LIMIT 3;
| BillingCountry | Cnt |
|---|---|
| USA | 26 |
| Canada | 16 |
| Brazil | 10 |
-- HAVING: filter groups AFTER grouping
-- "Count all invoices per country, then only show countries with 30+"
SELECT BillingCountry, COUNT(*) AS Cnt
FROM Invoice
GROUP BY BillingCountry
HAVING COUNT(*) >= 30
ORDER BY Cnt DESC;
| BillingCountry | Cnt |
|---|---|
| USA | 91 |
| Canada | 56 |
| France | 35 |
| Brazil | 35 |
You can (and often should) use both together: WHERE to pre-filter rows, GROUP BY to aggregate, and HAVING to post-filter the groups.
Try it yourself — Analyze average spending per invoice →
Quick Comparison Cheat Sheet
| Feature | WHERE | HAVING |
|---|---|---|
| Runs | Before GROUP BY | After GROUP BY |
| Filters | Individual rows | Aggregated groups |
| Can use aggregates? | No | Yes |
| Can use column values? | Yes | Only grouped or aggregated |
| Required with GROUP BY? | No | No |
| Works without GROUP BY? | Yes | Technically yes, but rarely useful |
Rule of thumb: If your condition involves COUNT, SUM, AVG, MIN, or MAX, put it in HAVING. Everything else goes in WHERE.
Common Mistakes to Avoid
1. Using an aggregate in WHERE
This is the most common GROUP BY error. WHERE does not know about aggregates.
-- Wrong: aggregate in WHERE
SELECT GenreId, COUNT(*) AS Cnt
FROM Track
WHERE COUNT(*) > 100
GROUP BY GenreId;
-- Right: use HAVING
SELECT GenreId, COUNT(*) AS Cnt
FROM Track
GROUP BY GenreId
HAVING COUNT(*) > 100;
2. Selecting a non-aggregated column without GROUP BY
Every column in SELECT must be in GROUP BY or inside an aggregate. Otherwise the database doesn't know which row's value to show.
-- Wrong: Name is not grouped or aggregated
SELECT GenreId, Name, COUNT(*)
FROM Track
GROUP BY GenreId;
-- Right: include Name in GROUP BY
SELECT g.GenreId, g.Name, COUNT(*)
FROM Track t
JOIN Genre g ON g.GenreId = t.GenreId
GROUP BY g.GenreId, g.Name;
3. Using a column alias in HAVING
In standard SQL, HAVING runs before SELECT assigns aliases. Use the full expression instead.
-- Wrong in many databases: alias in HAVING
SELECT GenreId, COUNT(*) AS TrackCount
FROM Track
GROUP BY GenreId
HAVING TrackCount > 100;
-- Right: repeat the aggregate expression
SELECT GenreId, COUNT(*) AS TrackCount
FROM Track
GROUP BY GenreId
HAVING COUNT(*) > 100;
Note: SQLite and MySQL allow aliases in HAVING, but PostgreSQL, SQL Server, and Oracle do not. Write portable SQL by repeating the expression.
4. Forgetting that GROUP BY changes your row count
After GROUP BY, you have one row per group — not one row per original record. If you expected 1000 invoice rows but got 59, that's not a bug — it's 59 customers. Always sanity-check your GROUP BY results by comparing to a simple SELECT COUNT(*) on the base table.
5. Using DISTINCT when you need GROUP BY
DISTINCT removes duplicate rows, but it can't aggregate. If you need counts, sums, or averages alongside unique values, you need GROUP BY.
-- Limited: only unique countries
SELECT DISTINCT BillingCountry FROM Invoice;
-- Better: countries WITH their invoice counts
SELECT BillingCountry, COUNT(*) AS InvoiceCount
FROM Invoice
GROUP BY BillingCountry;
Practice Makes Perfect
The best way to master GROUP BY and HAVING is to write real queries. Here are exercises that target exactly these concepts:
| Exercise | What You'll Practice |
|---|---|
| Count of Tracks per Album → | Basic GROUP BY with COUNT |
| Revenue Distribution by Country → | SUM and GROUP BY on invoices |
| Average Song Length per Genre → | AVG with GROUP BY and JOINs |
| Most Prolific Artist IDs → | GROUP BY with HAVING filter |
| Track Count by Individual Composer → | Grouping with NULL handling |
| Annual Sales Summary → | Multi-column GROUP BY with dates |
Join 500+ SQL learners already practicing on SQiLs — Browse all exercises →
Summary
- GROUP BY collapses rows with the same values into summary rows — one per unique group.
- Pair it with aggregate functions (COUNT, SUM, AVG, MIN, MAX) to calculate totals, averages, and extremes.
- Group by multiple columns for finer breakdowns like revenue-per-country-per-year.
- Use HAVING to filter groups after aggregation — it's the only way to filter on aggregate results.
- WHERE filters rows before grouping; HAVING filters groups after. Use both together for precise control.
- Once you're comfortable with GROUP BY, explore window functions to perform aggregations without collapsing rows.
Ready to put this into practice? Start solving exercises on SQiLs →
Frequently Asked Questions
Can I use GROUP BY without an aggregate function?
Technically yes — GROUP BY without aggregates behaves like DISTINCT, returning unique combinations of the grouped columns. However, this is considered poor style. Use DISTINCT for deduplication and reserve GROUP BY for when you actually need aggregation.
What is the difference between WHERE and HAVING in SQL?
WHERE filters individual rows before grouping happens, so it cannot reference aggregate functions. HAVING filters entire groups after aggregation, so it can use COUNT, SUM, AVG, and other aggregates. Use WHERE for row-level conditions (e.g., date ranges, specific IDs) and HAVING for group-level conditions (e.g., "groups with more than 10 rows").
Can I GROUP BY a column that is not in SELECT?
Yes. You can group by any column from your tables, even if it doesn't appear in the SELECT list. This is useful when you want to aggregate at a certain granularity but only display the result. However, it can make results harder to interpret, so include the grouping column when clarity matters.
How does GROUP BY work with JOINs?
GROUP BY runs after FROM and JOIN, so it operates on the already-joined result set. A common pattern is joining two tables (e.g., Track and Genre) and then grouping by a column from the lookup table (e.g., Genre.Name). Just make sure every non-aggregated column in SELECT is in your GROUP BY — the join doesn't change this rule.



