SQL DISTINCT: Remove Duplicates the Right Way

SQL DISTINCT is the keyword you reach for when a database table contains repeated values — the same country appearing on hundreds of invoices, the same composer credited on dozens of tracks. When you need a clean list of unique values, DISTINCT is the keyword that strips away the duplicates. It looks simple, but knowing when to use it (and when not to) separates beginners from confident SQL writers.

In this guide you'll learn exactly how DISTINCT works, see it in action on the Chinook music-store database, understand how it compares to GROUP BY, and avoid the mistakes that trip up most learners.

How DISTINCT Works
DISTINCT on Multiple Columns
COUNT(DISTINCT column) — Counting Unique Values
DISTINCT vs GROUP BY
Cheat Sheet
Common Mistakes to Avoid
Practice Makes Perfect
Summary
Frequently Asked Questions

How DISTINCT Works

Place DISTINCT right after SELECT and the database eliminates duplicate rows from the result set. If two or more rows are identical across every selected column, only one copy survives.

Here's the classic example — listing every unique billing country from the Invoice table:

SELECT DISTINCT BillingCountry
FROM Invoice
ORDER BY BillingCountry
LIMIT 10;

BillingCountry
Argentina
Australia
Austria
Belgium
Brazil
Canada
Chile
Czech Republic
Denmark
Finland

Without DISTINCT, you would get 412 rows (one per invoice). With it, you get 24 — one per unique country.

How the database processes DISTINCT

FROM — reads all rows from the table
WHERE — filters rows (if present)
SELECT — picks the requested columns
DISTINCT — removes duplicate rows from the result
ORDER BY — sorts the final output

Because DISTINCT runs after WHERE, you can combine them freely:

SELECT DISTINCT BillingCity
FROM Invoice
WHERE BillingCountry = 'USA'
ORDER BY BillingCity;

BillingCity
Boston
Chicago
Cupertino
Fort Worth
Madison
Mountain View
New York
Orlando
Redmond
Reno

Try it yourself — List all unique billing countries →

DISTINCT on Multiple Columns

When you select more than one column, DISTINCT removes rows where the entire combination of values is duplicated. Each unique pair (or triple, or more) gets one row.

Unique city-country pairs

SELECT DISTINCT
  BillingCity,
  BillingCountry
FROM Invoice
ORDER BY BillingCountry, BillingCity
LIMIT 10;

BillingCity	BillingCountry
Buenos Aires	Argentina
Sidney	Australia
Vienne	Austria
Brussels	Belgium
Brasília	Brazil
Rio de Janeiro	Brazil
São José dos Campos	Brazil
São Paulo	Brazil
Edmonton	Canada
Halifax	Canada

Notice that Brazil appears multiple times — each row has a different city. DISTINCT looks at the full combination, not each column independently.

Finding unique customer locations

SELECT DISTINCT
  City,
  State,
  Country
FROM Customer
ORDER BY Country, State, City
LIMIT 10;

City	State	Country
Buenos Aires	NULL	Argentina
Sidney	NSW	Australia
Vienne	NULL	Austria
Brussels	NULL	Belgium
Brasília	DF	Brazil
Rio de Janeiro	RJ	Brazil
São José dos Campos	SP	Brazil
São Paulo	SP	Brazil
Edmonton	AB	Canada
Halifax	NS	Canada

Try it yourself — Find unique customer locations →

COUNT(DISTINCT column) — Counting Unique Values

Often you don't need the list of unique values — you just need to know how many there are. Wrap the column in COUNT(DISTINCT ...):

SELECT
  COUNT(*) AS TotalInvoices,
  COUNT(DISTINCT CustomerId) AS UniqueCustomers,
  COUNT(DISTINCT BillingCountry) AS UniqueCountries
FROM Invoice;

TotalInvoices	UniqueCustomers	UniqueCountries
412	59	24

This tells you: 412 invoices were placed by 59 different customers across 24 countries — all in a single query.

COUNT(DISTINCT) inside GROUP BY

You can pair COUNT(DISTINCT) with GROUP BY for powerful breakdowns:

SELECT
  BillingCountry,
  COUNT(*) AS TotalInvoices,
  COUNT(DISTINCT CustomerId) AS UniqueCustomers
FROM Invoice
GROUP BY BillingCountry
ORDER BY UniqueCustomers DESC
LIMIT 5;

BillingCountry	TotalInvoices	UniqueCustomers
USA	91	13
Canada	56	8
France	35	5
Brazil	35	5
Germany	28	4

The USA has 91 invoices but only 13 unique customers. COUNT(DISTINCT) reveals patterns that a plain COUNT(*) would hide.

Try it yourself — Count tracks per album →

DISTINCT vs GROUP BY

This is the question every SQL learner asks: when should I use DISTINCT, and when should I use GROUP BY? The short answer: DISTINCT is for deduplication, GROUP BY is for aggregation.

Side-by-side comparison

Both of these queries return the same 24 countries:

-- Using DISTINCT
SELECT DISTINCT BillingCountry
FROM Invoice
ORDER BY BillingCountry;

-- Using GROUP BY
SELECT BillingCountry
FROM Invoice
GROUP BY BillingCountry
ORDER BY BillingCountry;

The results are identical, but the intent is different:

DISTINCT says: "Give me the unique values."
GROUP BY says: "Organize rows into groups so I can aggregate them."

When to pick which

Scenario	Use
You need a clean list of unique values	DISTINCT
You need counts, sums, averages per group	GROUP BY
You need unique values and aggregates	GROUP BY with aggregates
You want to count unique values within groups	COUNT(DISTINCT) inside GROUP BY

The key rule

If your query has no aggregate functions (COUNT, SUM, AVG, MIN, MAX), prefer DISTINCT — it signals your intent clearly. The moment you need an aggregate, switch to GROUP BY. For a deep dive into GROUP BY, see our complete GROUP BY and HAVING guide.

Cheat Sheet

Syntax	What It Does	Example
`SELECT DISTINCT col`	Unique values in one column	Unique countries
`SELECT DISTINCT col1, col2`	Unique combinations of columns	Unique city-country pairs
`COUNT(DISTINCT col)`	Number of unique values	How many distinct customers
`SELECT DISTINCT *`	Remove fully duplicate rows	Deduplicate an entire result set
`GROUP BY col`	Group rows for aggregation	Revenue per country

Quick decision rule:

No aggregates needed → DISTINCT
Aggregates needed → GROUP BY
Count of unique values → COUNT(DISTINCT col) inside either

Common Mistakes to Avoid

1. Using DISTINCT when GROUP BY is needed

DISTINCT cannot aggregate. If you need counts or totals alongside your unique values, you need GROUP BY.

-- Wrong thinking: "I want distinct countries with their invoice counts"
SELECT DISTINCT BillingCountry, COUNT(*)
FROM Invoice;
-- This errors or gives wrong results

-- Right: use GROUP BY
SELECT BillingCountry, COUNT(*) AS InvoiceCount
FROM Invoice
GROUP BY BillingCountry
ORDER BY InvoiceCount DESC;

2. Treating DISTINCT as a function

DISTINCT is a keyword that applies to the entire row, not a single column. Parentheses after DISTINCT don't change this.

-- These are identical — the parentheses do nothing
SELECT DISTINCT(BillingCountry) FROM Invoice;
SELECT DISTINCT BillingCountry FROM Invoice;

-- This returns distinct PAIRS, not distinct countries
SELECT DISTINCT BillingCountry, BillingCity FROM Invoice;

The parentheses in DISTINCT(col) are just grouping parentheses, not a function call. This catches many beginners off guard.

3. Using DISTINCT to hide a JOIN problem

If a JOIN produces more rows than expected and you slap DISTINCT on it to "fix" the count, you're masking a real bug. The correct fix is to check your JOIN conditions.

-- Suspicious: DISTINCT used to hide duplicate rows from a bad join
SELECT DISTINCT c.FirstName, c.LastName
FROM Customer c
JOIN Invoice i ON i.CustomerId = c.CustomerId;

-- Better: ask whether you actually need the JOIN
SELECT FirstName, LastName
FROM Customer;

If you find yourself adding DISTINCT to "fix" unexpected duplicates, stop and examine your JOINs first.

4. Forgetting that DISTINCT considers NULL as a single value

In SQL, two NULLs are considered equal for DISTINCT purposes. So if a column has multiple NULL entries, DISTINCT will keep just one.

SELECT DISTINCT Composer
FROM Track
ORDER BY Composer
LIMIT 5;

Composer
NULL
A. Jamal
AC/DC
Academy of St. Martin in the Fields & Sir Neville Marriner
Academy of St. Martin in the Fields, John Birch, Sir Neville Marriner & Sylvia McNair

The single NULL row represents all 978 tracks with no composer. Keep this in mind when counting — COUNT(DISTINCT Composer) will not count the NULLs.

5. Using SELECT DISTINCT * without understanding the cost

SELECT DISTINCT * compares every column in every row. On wide tables with many columns, this is computationally expensive and rarely what you actually want. Be specific about which columns need deduplication.

-- Expensive and usually unnecessary
SELECT DISTINCT * FROM Track;

-- Better: specify the columns you care about
SELECT DISTINCT Composer FROM Track;

Practice Makes Perfect

The best way to internalize DISTINCT is to write queries yourself. Here are exercises on the Chinook database that target these exact skills:

Exercise	What You'll Practice
Find Unique Track Composers →	Basic DISTINCT on a single column
Identify Unique Employee Cities →	DISTINCT with a small result set
List All Unique Billing Countries →	DISTINCT on invoice data
Discover Unique Customer States →	DISTINCT with NULL values
Unique Customer Locations →	Multi-column DISTINCT
Product Count by Category →	COUNT with GROUP BY (compare to DISTINCT)

Join 500+ SQL learners already practicing on SQiLs — Browse all exercises →

Summary

DISTINCT removes duplicate rows from your query results — place it right after SELECT.
With multiple columns, DISTINCT eliminates rows where the entire combination is duplicated, not each column independently.
Use COUNT(DISTINCT col) to count unique values without listing them all.
DISTINCT is for deduplication; GROUP BY is for aggregation. If you need counts, sums, or averages, reach for GROUP BY.
Watch out for DISTINCT masking JOIN problems — if you need DISTINCT to "fix" unexpected duplicates, check your join conditions first.
NULLs count as one value for DISTINCT. COUNT(DISTINCT col) ignores NULLs entirely.

Ready to practice? Start solving DISTINCT exercises on SQiLs →

Frequently Asked Questions

What is the difference between DISTINCT and GROUP BY in SQL?

DISTINCT removes duplicate rows from the result set — it's purely for deduplication. GROUP BY organizes rows into groups so you can apply aggregate functions like COUNT, SUM, and AVG. If you only need unique values, use DISTINCT. If you need calculations per group, use GROUP BY. While SELECT DISTINCT col and SELECT col GROUP BY col return the same result, GROUP BY signals that aggregation is the goal.

Does DISTINCT affect performance?

Yes. DISTINCT forces the database to sort or hash all result rows to find duplicates, which adds processing time. On small result sets the cost is negligible. On large tables with many columns, it can be significant. The best practice is to be specific about which columns need deduplication and to ensure you actually need DISTINCT rather than using it as a crutch for a poorly written JOIN.

Can I use DISTINCT with ORDER BY?

Yes, but there's one rule: every column in ORDER BY must appear in the SELECT list (or be derived from it) when DISTINCT is used. This is because DISTINCT produces a set of unique rows, and the database can only sort by columns that exist in that set. Most databases enforce this strictly.

How does DISTINCT handle NULL values?

DISTINCT treats all NULL values as equal to each other. If a column contains multiple NULLs, DISTINCT will return only one NULL row. However, COUNT(DISTINCT col) does not count NULLs — it only counts non-NULL unique values. This means COUNT(DISTINCT col) can return a lower number than you expect if your data has NULLs.