Skip to main content
    Skip to main content
    SQL
    Subqueries
    JOINs
    Intermediate

    SQL Subqueries vs JOINs: When to Use Which

    SQiLs TeamMarch 23, 202613 min read
    Blog post cover: SQL Subqueries vs JOINs — when to use which with performance tips

    SQL subqueries vs JOINs — it's one of the most common questions once you move past the basics. You're staring at a query and wondering: should I use a subquery or a JOIN here? The answer isn't always obvious, and choosing wrong can mean a query that's harder to read, harder to maintain, or — in production — painfully slow.

    This guide breaks down exactly when to reach for a subquery versus a JOIN, with side-by-side examples on the Chinook database so you can see both approaches and decide for yourself.

    Table of Contents

    Subqueries Refresher

    A subquery is a SELECT statement nested inside another query. It runs first, produces a result, and the outer query uses that result. Subqueries can appear in WHERE, FROM, or SELECT clauses.

    Find tracks priced above the average

    SELECT Name, UnitPrice
    FROM Track
    WHERE UnitPrice > (
      SELECT AVG(UnitPrice)
      FROM Track
    )
    ORDER BY UnitPrice DESC
    LIMIT 5;
    
    NameUnitPrice
    War Pigs0.99

    Here the inner query calculates the average price, and the outer query filters tracks above it. The subquery returns a single value — this is called a scalar subquery.

    Subqueries can also return lists (for use with IN) or full result sets (as a derived table in FROM).

    Try it yourselfFind tracks priced above average →

    JOINs Refresher

    A JOIN combines columns from two or more tables based on a related column. Unlike subqueries, JOINs produce a single flat result set where you can access columns from all joined tables simultaneously.

    Get album titles with artist names

    SELECT
      ar.Name AS Artist,
      al.Title AS Album
    FROM Album al
    JOIN Artist ar ON ar.ArtistId = al.ArtistId
    WHERE ar.Name = 'Iron Maiden'
    ORDER BY al.Title;
    
    ArtistAlbum
    Iron MaidenA Matter of Life and Death
    Iron MaidenBrave New World
    Iron MaidenFear of the Dark
    Iron MaidenKillers
    Iron MaidenLive After Death

    The JOIN links albums to their artists through the shared ArtistId column. You need data from both tables in the output — this is a textbook JOIN use case. For a deeper dive on JOIN types, see our SQL JOINs guide.

    Try it yourselfGet album titles and artist names for Iron Maiden →

    Side-by-Side: Same Problem, Two Approaches

    Many queries can be written with either a subquery or a JOIN. Let's solve the same problem both ways.

    Question: Find customers who live in the same country as an employee.

    Subquery approach

    SELECT FirstName, LastName, Country
    FROM Customer
    WHERE Country IN (
      SELECT DISTINCT Country
      FROM Employee
    )
    ORDER BY Country, LastName
    LIMIT 5;
    
    FirstNameLastNameCountry
    MarkPhilipsCanada
    JenniferPetersonCanada
    EdwardFrancisCanada
    EllieSullivanCanada
    RobertoAlmeidaCanada

    JOIN approach

    SELECT DISTINCT
      c.FirstName, c.LastName, c.Country
    FROM Customer c
    JOIN Employee e ON e.Country = c.Country
    ORDER BY c.Country, c.LastName
    LIMIT 5;
    

    Same result. The subquery reads more naturally — "find customers whose country is in the list of employee countries." The JOIN requires a DISTINCT because each customer might match multiple employees. In this case, the subquery is cleaner.

    Try it yourselfFind customers in employee countries →

    When Subqueries Win

    Subqueries are the better choice in these scenarios:

    1. Comparing against an aggregate

    When you need to compare each row against a calculated value like an average, max, or count, a scalar subquery in WHERE is the cleanest approach.

    SELECT Name, Milliseconds / 1000 AS Seconds
    FROM Track
    WHERE Milliseconds > (
      SELECT AVG(Milliseconds)
      FROM Track t2
      JOIN Genre g ON g.GenreId = t2.GenreId
      WHERE g.Name = 'Rock'
    )
    ORDER BY Milliseconds DESC
    LIMIT 5;
    
    NameSeconds
    Occupation / Precipice5286
    Through a Looking Glass5088
    Greetings from Earth, Pt. 12960
    The Woman King2626
    A Day in the Life2611

    A JOIN can't elegantly express "rows where this value exceeds the average of a different group." You'd need a cross join with a derived table — much noisier.

    Try it yourselfFind tracks longer than the Rock average →

    2. Existence checks with NOT EXISTS

    NOT EXISTS is the most reliable way to find rows in one table that have no matching rows in another.

    SELECT t.Name AS Track
    FROM Track t
    WHERE NOT EXISTS (
      SELECT 1
      FROM InvoiceLine il
      WHERE il.TrackId = t.TrackId
    )
    ORDER BY t.Name
    LIMIT 5;
    
    Track
    20 Flight Rock
    A Cor Do Sangue
    A Dança da Solidão
    A Estrada
    A Hora e a Vez do Cabelo Nascer

    These are tracks that have never been sold. You could use LEFT JOIN ... WHERE il.TrackId IS NULL, but NOT EXISTS makes the intent crystal clear and handles NULLs correctly in all cases.

    Try it yourselfIdentify tracks never sold →

    3. Subqueries in SELECT for row-level calculations

    When you need to add a computed column from another table without changing the row count, a correlated subquery in SELECT works well.

    SELECT
      c.FirstName || ' ' || c.LastName AS Customer,
      (
        SELECT COUNT(*)
        FROM Invoice i
        WHERE i.CustomerId = c.CustomerId
      ) AS InvoiceCount
    FROM Customer c
    ORDER BY InvoiceCount DESC
    LIMIT 5;
    
    CustomerInvoiceCount
    Luis Rojas7
    Helena Holý7
    Richard Cunningham7
    Hugh O'Brien7
    Frank Ralston7

    Note: for large datasets, this pattern can be slow because the subquery runs once per row. Consider a JOIN with GROUP BY for better performance at scale.

    When JOINs Win

    1. You need columns from multiple tables

    This is the most common scenario. When the output requires data from two or more tables, a JOIN is the only practical choice.

    SELECT
      c.FirstName || ' ' || c.LastName AS Customer,
      e.FirstName || ' ' || e.LastName AS SupportRep,
      e.Title
    FROM Customer c
    JOIN Employee e ON e.EmployeeId = c.SupportRepId
    ORDER BY Customer
    LIMIT 5;
    
    CustomerSupportRepTitle
    Aaron MitchellMargaret ParkSales Support Agent
    Alexandre RochaSteve JohnsonSales Support Agent
    Astrid GruberSteve JohnsonSales Support Agent
    Bjørn HansenJane PeacockSales Support Agent
    Camille BernardMargaret ParkSales Support Agent

    A subquery can't put columns from two tables side by side in a single result row (without correlated subqueries for each column, which is ugly and slow).

    Try it yourselfList customers and their support reps →

    2. Joining multiple tables in a chain

    Real queries often chain three, four, or more tables. JOINs handle this naturally.

    SELECT
      ar.Name AS Artist,
      al.Title AS Album,
      t.Name AS Track,
      g.Name AS Genre
    FROM Track t
    JOIN Album al ON al.AlbumId = t.AlbumId
    JOIN Artist ar ON ar.ArtistId = al.ArtistId
    JOIN Genre g ON g.GenreId = t.GenreId
    WHERE ar.Name = 'Led Zeppelin'
    ORDER BY al.Title, t.TrackId
    LIMIT 5;
    
    ArtistAlbumTrackGenre
    Led ZeppelinBBC Sessions [Disc 1] [Live]Communication BreakdownRock
    Led ZeppelinBBC Sessions [Disc 1] [Live]Dazed And ConfusedRock
    Led ZeppelinBBC Sessions [Disc 1] [Live]The Girl I Love She Got Long Black Wavy HairRock
    Led ZeppelinBBC Sessions [Disc 1] [Live]What is and What Should Never BeRock
    Led ZeppelinBBC Sessions [Disc 1] [Live]Communication Breakdown(2)Rock

    Four tables, one clean query. Achieving this with nested subqueries would be a nightmare.

    Try it yourselfGet detailed sales and track info →

    When you need to GROUP BY and aggregate data from joined tables, JOINs combined with GROUP BY are the standard approach.

    SELECT
      g.Name AS Genre,
      COUNT(*) AS TrackCount,
      ROUND(AVG(t.Milliseconds) / 1000, 0) AS AvgSeconds
    FROM Track t
    JOIN Genre g ON g.GenreId = t.GenreId
    GROUP BY g.Name
    ORDER BY TrackCount DESC
    LIMIT 5;
    
    GenreTrackCountAvgSeconds
    Rock1297284
    Latin579232
    Metal374310
    Alternative & Punk332234
    Jazz130291

    Performance: What Actually Matters

    The "subqueries are slow" advice is often repeated but oversimplified. Here's what actually matters:

    Modern query optimizers rewrite subqueries. In PostgreSQL, MySQL 8+, and SQL Server, the optimizer often transforms a subquery into a JOIN internally. The execution plan may be identical regardless of how you wrote it.

    Correlated subqueries are the real bottleneck. A correlated subquery (one that references the outer query) runs once per row. On a 1-million-row table, that's 1 million sub-executions. Rewriting it as a JOIN + GROUP BY runs it once.

    -- Slow: correlated subquery (runs per row)
    SELECT Name,
      (SELECT COUNT(*) FROM Track WHERE Track.AlbumId = Album.AlbumId) AS Cnt
    FROM Album;
    
    -- Faster: JOIN + GROUP BY (single pass)
    SELECT al.Title, COUNT(t.TrackId) AS Cnt
    FROM Album al
    LEFT JOIN Track t ON t.AlbumId = al.AlbumId
    GROUP BY al.AlbumId, al.Title;
    

    Use EXPLAIN to decide. Don't guess — check the query plan. If both approaches produce the same plan, choose the one that reads better.

    Practical rules:

    • Scalar subqueries (returning one value) in WHERE are almost always fine
    • IN subqueries on small result sets are fine; for large sets, consider EXISTS or a JOIN
    • Correlated subqueries in SELECT are the main performance risk — rewrite as a JOIN when the table is large
    • When in doubt, write it the way that's clearest, then optimize if you measure a problem

    Quick Comparison Cheat Sheet

    ScenarioBest ChoiceWhy
    Need columns from multiple tablesJOINOnly practical way
    Filter against an aggregateSubqueryCleaner WHERE clause
    Check for existence / non-existenceSubquery (EXISTS)Clearest intent, NULL-safe
    Chain 3+ tablesJOINScales naturally
    Derived table for intermediate resultSubquery in FROMEncapsulates complex logic
    Row-level computed columnSubquery in SELECTWorks but watch performance
    Aggregate across joined tablesJOIN + GROUP BYStandard, efficient pattern
    Filter with IN (small list)SubqueryReadable
    Filter with IN (large list)JOIN or EXISTSBetter performance

    Common Mistakes to Avoid

    1. Using a correlated subquery when a JOIN works

    Correlated subqueries re-execute for every row. If you're pulling data from another table for every row, a JOIN is almost always better.

    -- Slow: correlated subquery
    SELECT t.Name,
      (SELECT g.Name FROM Genre g WHERE g.GenreId = t.GenreId) AS Genre
    FROM Track t;
    
    -- Better: simple JOIN
    SELECT t.Name, g.Name AS Genre
    FROM Track t
    JOIN Genre g ON g.GenreId = t.GenreId;
    

    2. Using IN with a subquery that can return NULL

    NOT IN with a subquery that returns NULL gives an empty result — a classic trap.

    -- Dangerous: if any CustomerId is NULL, returns no rows
    SELECT Name FROM Track
    WHERE TrackId NOT IN (
      SELECT TrackId FROM InvoiceLine
    );
    
    -- Safe: NOT EXISTS handles NULLs correctly
    SELECT t.Name FROM Track t
    WHERE NOT EXISTS (
      SELECT 1 FROM InvoiceLine il
      WHERE il.TrackId = t.TrackId
    );
    

    3. Forgetting DISTINCT with JOINs that multiply rows

    When you JOIN to a table with a one-to-many relationship and only need the "one" side, you'll get duplicates without DISTINCT or aggregation.

    -- Wrong: each customer appears once per invoice
    SELECT c.FirstName, c.Country
    FROM Customer c
    JOIN Invoice i ON i.CustomerId = c.CustomerId;
    
    -- Right: use DISTINCT or GROUP BY
    SELECT DISTINCT c.FirstName, c.Country
    FROM Customer c
    JOIN Invoice i ON i.CustomerId = c.CustomerId;
    
    -- Or: use EXISTS (no duplicates by design)
    SELECT c.FirstName, c.Country
    FROM Customer c
    WHERE EXISTS (
      SELECT 1 FROM Invoice i WHERE i.CustomerId = c.CustomerId
    );
    

    4. Nesting subqueries too deeply

    Three or more levels of nesting become unreadable. Use CTEs (WITH clauses) to flatten complex logic.

    -- Hard to read: nested subqueries
    SELECT * FROM Track WHERE GenreId IN (
      SELECT GenreId FROM Genre WHERE Name IN (
        SELECT DISTINCT g.Name FROM Genre g
        JOIN Track t ON t.GenreId = g.GenreId
        GROUP BY g.Name HAVING COUNT(*) > 100
      )
    );
    
    -- Better: CTE
    WITH BigGenres AS (
      SELECT g.GenreId, g.Name
      FROM Genre g
      JOIN Track t ON t.GenreId = g.GenreId
      GROUP BY g.GenreId, g.Name
      HAVING COUNT(*) > 100
    )
    SELECT t.*
    FROM Track t
    JOIN BigGenres bg ON bg.GenreId = t.GenreId;
    

    Practice Makes Perfect

    The best way to develop intuition for subqueries vs JOINs is to solve problems both ways. These exercises give you hands-on practice:

    ExerciseWhat You'll Practice
    Tracks Priced Above Average →Scalar subquery in WHERE
    Customers in Employee Countries →Subquery with IN vs JOIN
    Identify Tracks Never Sold →NOT EXISTS subquery
    Customer Names and Their Invoices →Basic JOIN pattern
    Detailed Sales and Track Info →Multi-table JOIN chain
    Tracks from the Largest Genre →Subquery + GROUP BY combo

    Join 500+ SQL learners already practicing on SQiLs — Browse all exercises →

    Summary

    • JOINs are best when you need columns from multiple tables, chain several tables, or aggregate across relationships.
    • Subqueries are best for filtering against aggregates, existence checks, and encapsulating intermediate logic.
    • Many problems can be solved either way — modern optimizers often produce the same execution plan.
    • Correlated subqueries are the real performance concern — rewrite them as JOINs when operating on large tables.
    • NOT EXISTS beats NOT IN for null-safety. Use CTEs instead of deeply nested subqueries.
    • When in doubt, write for clarity first, then optimize with EXPLAIN if performance matters.

    Ready to put this into practice? Start solving exercises on SQiLs →

    Frequently Asked Questions

    Are subqueries slower than JOINs?

    Not necessarily. Modern database optimizers (PostgreSQL, MySQL 8+, SQL Server) often rewrite subqueries into JOINs internally, producing identical execution plans. The main performance risk is correlated subqueries that run once per row. Scalar subqueries and IN/EXISTS subqueries on small sets are typically fine.

    Can I always replace a subquery with a JOIN?

    No. Some patterns are unique to subqueries — like comparing against an aggregate value (WHERE price > (SELECT AVG(price) ...)), or using NOT EXISTS for anti-joins. While workarounds exist, the subquery version is often clearer and just as fast.

    When should I use EXISTS vs IN?

    EXISTS is generally preferred when the subquery could return NULLs, especially with NOT EXISTS vs NOT IN. Performance-wise, EXISTS short-circuits (stops as soon as it finds one match), while IN evaluates the full list. For small lists, the difference is negligible; for large datasets, EXISTS usually wins.

    What about CTEs — are they subqueries or JOINs?

    CTEs (Common Table Expressions, using WITH) are syntactic sugar for subqueries in the FROM clause. They don't change performance — the optimizer treats them the same as inline subqueries. Their value is readability: they give names to intermediate steps and flatten deeply nested queries. For advanced CTE patterns like recursion, check out the window functions guide.

    Related Posts