Skip to main content
    Skip to main content
    SQL
    GROUP BY
    Beginner

    SQL GROUP BY and HAVING: The Complete Guide

    SQiLs TeamMarch 23, 202611 min read
    Blog post cover: SQL GROUP BY and HAVING — the complete guide with examples

    SQL GROUP BY and HAVING are the clauses you reach for the moment someone asks "how many tracks does each genre have?" or "which customers spent more than $40?", you need GROUP BY and HAVING. These two clauses turn raw rows into meaningful summaries — total revenue by country, average order size per customer, top-selling genres. They are the foundation of every dashboard, report, and data analysis query you will ever write.

    In this guide you'll master both clauses with real SQL examples on the Chinook music-store database, learn the critical difference between WHERE and HAVING, and practice with hands-on exercises.

    Table of Contents

    How GROUP BY Works

    GROUP BY collapses rows that share the same value(s) into a single summary row. Every column in your SELECT must either appear in the GROUP BY clause or be wrapped in an aggregate function.

    Here's the simplest example — counting how many tracks exist in each genre:

    SELECT
      g.Name AS Genre,
      COUNT(*) AS TrackCount
    FROM Track t
    JOIN Genre g ON g.GenreId = t.GenreId
    GROUP BY g.Name
    ORDER BY TrackCount DESC
    LIMIT 5;
    
    GenreTrackCount
    Rock1297
    Latin579
    Metal374
    Alternative & Punk332
    Jazz130

    Without GROUP BY, this query would return a single count of all tracks. With it, you get one row per genre — each with its own count.

    The SQL engine processes GROUP BY after FROM, JOIN, and WHERE. That means rows are first filtered, then grouped. This execution order is key to understanding when to use WHERE versus HAVING.

    Try it yourselfCount tracks per album on the Chinook database →

    Aggregate Functions: COUNT, SUM, AVG, MIN, MAX

    GROUP BY is only useful when paired with aggregate functions. Here are the five you'll use constantly:

    FunctionPurposeExample
    COUNT(*)Number of rows in each groupTotal invoices per customer
    COUNT(column)Non-NULL values in the columnCustomers with a company name
    SUM(column)Total of all valuesRevenue per country
    AVG(column)Average valueAverage invoice amount
    MIN(column) / MAX(column)Smallest / largest valueShortest and longest track

    Revenue by country

    SELECT
      BillingCountry,
      COUNT(*) AS InvoiceCount,
      ROUND(SUM(Total), 2) AS TotalRevenue,
      ROUND(AVG(Total), 2) AS AvgInvoice
    FROM Invoice
    GROUP BY BillingCountry
    ORDER BY TotalRevenue DESC
    LIMIT 5;
    
    BillingCountryInvoiceCountTotalRevenueAvgInvoice
    USA91523.065.75
    Canada56303.965.43
    France35195.105.57
    Brazil35190.105.43
    Germany28156.485.59

    Notice how you can mix multiple aggregates in a single query. Each one independently summarizes the rows within each country group.

    Try it yourselfCalculate revenue distribution by country →

    Grouping by Multiple Columns

    You aren't limited to grouping by one column. When you add more columns to GROUP BY, the database creates finer-grained groups — one for each unique combination of values.

    Annual sales by country

    SELECT
      BillingCountry,
      strftime('%Y', InvoiceDate) AS Year,
      ROUND(SUM(Total), 2) AS Revenue
    FROM Invoice
    WHERE BillingCountry IN ('USA', 'Canada')
    GROUP BY BillingCountry, strftime('%Y', InvoiceDate)
    ORDER BY BillingCountry, Year
    LIMIT 6;
    
    BillingCountryYearRevenue
    Canada202172.27
    Canada202274.25
    Canada202372.27
    USA2021130.68
    USA2022122.76
    USA2023130.68

    Each row represents one country-year combination. This is how you build trend analyses and time-series reports — group by both the category and the time period.

    Try it yourselfBuild an annual sales summary →

    HAVING — Filter After Grouping

    HAVING lets you filter the results of aggregation. While WHERE filters individual rows before grouping, HAVING filters entire groups after the aggregates are calculated.

    Find genres with more than 100 tracks

    SELECT
      g.Name AS Genre,
      COUNT(*) AS TrackCount
    FROM Track t
    JOIN Genre g ON g.GenreId = t.GenreId
    GROUP BY g.Name
    HAVING COUNT(*) > 100
    ORDER BY TrackCount DESC;
    
    GenreTrackCount
    Rock1297
    Latin579
    Metal374
    Alternative & Punk332
    Jazz130

    Without HAVING, you'd get all 25 genres. HAVING eliminates the groups that don't meet the threshold. This is something WHERE simply cannot do — you can't write WHERE COUNT(*) > 100 because WHERE runs before any aggregation happens.

    Customers who spent more than $40 total

    SELECT
      c.FirstName || ' ' || c.LastName AS Customer,
      ROUND(SUM(i.Total), 2) AS TotalSpent
    FROM Customer c
    JOIN Invoice i ON i.CustomerId = c.CustomerId
    GROUP BY c.CustomerId, c.FirstName, c.LastName
    HAVING SUM(i.Total) > 40
    ORDER BY TotalSpent DESC
    LIMIT 5;
    
    CustomerTotalSpent
    Helena Holý49.62
    Richard Cunningham47.62
    Luis Rojas46.62
    Ladislav Kovács45.62
    Hugh O'Brien45.62

    HAVING is essential for answering questions like "which X has more than N of Y" — the bread and butter of business reporting.

    Try it yourselfFind the most prolific artist IDs →

    WHERE vs HAVING: Know the Difference

    This is the single most important concept in this guide. Here's the SQL execution order:

    1. FROM / JOIN — assemble the rows
    2. WHERE — filter individual rows
    3. GROUP BY — collapse into groups
    4. HAVING — filter groups
    5. SELECT — compute output columns
    6. ORDER BY — sort the results

    Because WHERE runs before GROUP BY, it can only reference individual column values. Because HAVING runs after GROUP BY, it can reference aggregate results.

    Same query, different filters

    -- WHERE: filter rows BEFORE grouping
    -- "Only look at invoices from 2023, then count per country"
    SELECT BillingCountry, COUNT(*) AS Cnt
    FROM Invoice
    WHERE InvoiceDate >= '2023-01-01'
    GROUP BY BillingCountry
    ORDER BY Cnt DESC
    LIMIT 3;
    
    BillingCountryCnt
    USA26
    Canada16
    Brazil10
    -- HAVING: filter groups AFTER grouping
    -- "Count all invoices per country, then only show countries with 30+"
    SELECT BillingCountry, COUNT(*) AS Cnt
    FROM Invoice
    GROUP BY BillingCountry
    HAVING COUNT(*) >= 30
    ORDER BY Cnt DESC;
    
    BillingCountryCnt
    USA91
    Canada56
    France35
    Brazil35

    You can (and often should) use both together: WHERE to pre-filter rows, GROUP BY to aggregate, and HAVING to post-filter the groups.

    Try it yourselfAnalyze average spending per invoice →

    Quick Comparison Cheat Sheet

    FeatureWHEREHAVING
    RunsBefore GROUP BYAfter GROUP BY
    FiltersIndividual rowsAggregated groups
    Can use aggregates?NoYes
    Can use column values?YesOnly grouped or aggregated
    Required with GROUP BY?NoNo
    Works without GROUP BY?YesTechnically yes, but rarely useful

    Rule of thumb: If your condition involves COUNT, SUM, AVG, MIN, or MAX, put it in HAVING. Everything else goes in WHERE.

    Common Mistakes to Avoid

    1. Using an aggregate in WHERE

    This is the most common GROUP BY error. WHERE does not know about aggregates.

    -- Wrong: aggregate in WHERE
    SELECT GenreId, COUNT(*) AS Cnt
    FROM Track
    WHERE COUNT(*) > 100
    GROUP BY GenreId;
    
    -- Right: use HAVING
    SELECT GenreId, COUNT(*) AS Cnt
    FROM Track
    GROUP BY GenreId
    HAVING COUNT(*) > 100;
    

    2. Selecting a non-aggregated column without GROUP BY

    Every column in SELECT must be in GROUP BY or inside an aggregate. Otherwise the database doesn't know which row's value to show.

    -- Wrong: Name is not grouped or aggregated
    SELECT GenreId, Name, COUNT(*)
    FROM Track
    GROUP BY GenreId;
    
    -- Right: include Name in GROUP BY
    SELECT g.GenreId, g.Name, COUNT(*)
    FROM Track t
    JOIN Genre g ON g.GenreId = t.GenreId
    GROUP BY g.GenreId, g.Name;
    

    3. Using a column alias in HAVING

    In standard SQL, HAVING runs before SELECT assigns aliases. Use the full expression instead.

    -- Wrong in many databases: alias in HAVING
    SELECT GenreId, COUNT(*) AS TrackCount
    FROM Track
    GROUP BY GenreId
    HAVING TrackCount > 100;
    
    -- Right: repeat the aggregate expression
    SELECT GenreId, COUNT(*) AS TrackCount
    FROM Track
    GROUP BY GenreId
    HAVING COUNT(*) > 100;
    

    Note: SQLite and MySQL allow aliases in HAVING, but PostgreSQL, SQL Server, and Oracle do not. Write portable SQL by repeating the expression.

    4. Forgetting that GROUP BY changes your row count

    After GROUP BY, you have one row per group — not one row per original record. If you expected 1000 invoice rows but got 59, that's not a bug — it's 59 customers. Always sanity-check your GROUP BY results by comparing to a simple SELECT COUNT(*) on the base table.

    5. Using DISTINCT when you need GROUP BY

    DISTINCT removes duplicate rows, but it can't aggregate. If you need counts, sums, or averages alongside unique values, you need GROUP BY.

    -- Limited: only unique countries
    SELECT DISTINCT BillingCountry FROM Invoice;
    
    -- Better: countries WITH their invoice counts
    SELECT BillingCountry, COUNT(*) AS InvoiceCount
    FROM Invoice
    GROUP BY BillingCountry;
    

    Practice Makes Perfect

    The best way to master GROUP BY and HAVING is to write real queries. Here are exercises that target exactly these concepts:

    ExerciseWhat You'll Practice
    Count of Tracks per Album →Basic GROUP BY with COUNT
    Revenue Distribution by Country →SUM and GROUP BY on invoices
    Average Song Length per Genre →AVG with GROUP BY and JOINs
    Most Prolific Artist IDs →GROUP BY with HAVING filter
    Track Count by Individual Composer →Grouping with NULL handling
    Annual Sales Summary →Multi-column GROUP BY with dates

    Join 500+ SQL learners already practicing on SQiLs — Browse all exercises →

    Summary

    • GROUP BY collapses rows with the same values into summary rows — one per unique group.
    • Pair it with aggregate functions (COUNT, SUM, AVG, MIN, MAX) to calculate totals, averages, and extremes.
    • Group by multiple columns for finer breakdowns like revenue-per-country-per-year.
    • Use HAVING to filter groups after aggregation — it's the only way to filter on aggregate results.
    • WHERE filters rows before grouping; HAVING filters groups after. Use both together for precise control.
    • Once you're comfortable with GROUP BY, explore window functions to perform aggregations without collapsing rows.

    Ready to put this into practice? Start solving exercises on SQiLs →

    Frequently Asked Questions

    Can I use GROUP BY without an aggregate function?

    Technically yes — GROUP BY without aggregates behaves like DISTINCT, returning unique combinations of the grouped columns. However, this is considered poor style. Use DISTINCT for deduplication and reserve GROUP BY for when you actually need aggregation.

    What is the difference between WHERE and HAVING in SQL?

    WHERE filters individual rows before grouping happens, so it cannot reference aggregate functions. HAVING filters entire groups after aggregation, so it can use COUNT, SUM, AVG, and other aggregates. Use WHERE for row-level conditions (e.g., date ranges, specific IDs) and HAVING for group-level conditions (e.g., "groups with more than 10 rows").

    Can I GROUP BY a column that is not in SELECT?

    Yes. You can group by any column from your tables, even if it doesn't appear in the SELECT list. This is useful when you want to aggregate at a certain granularity but only display the result. However, it can make results harder to interpret, so include the grouping column when clarity matters.

    How does GROUP BY work with JOINs?

    GROUP BY runs after FROM and JOIN, so it operates on the already-joined result set. A common pattern is joining two tables (e.g., Track and Genre) and then grouping by a column from the lookup table (e.g., Genre.Name). Just make sure every non-aggregated column in SELECT is in your GROUP BY — the join doesn't change this rule.

    Related Posts