Using DISTINCT along with GROUP BY in SQL Server

Use DISTINCT to remove duplicate GROUPING SETS from the GROUP BY clause

In a completely silly example using GROUPING SETS() in general (or the special grouping sets ROLLUP() or CUBE() in particular), you could use DISTINCT in order to remove the duplicate values produced by the grouping sets again:

SELECT DISTINCT actors
FROM (VALUES('a'), ('a'), ('b'), ('b')) t(actors)
GROUP BY CUBE(actors, actors)

With DISTINCT:

actors
------
NULL
a
b

Without DISTINCT:

actors
------
a
b
NULL
a
b
a
b

But why, apart from making an academic point, would you do that?

Use DISTINCT to find unique aggregate function values

In a less far-fetched example, you might be interested in the DISTINCT aggregated values, such as, how many different duplicate numbers of actors are there?

SELECT DISTINCT COUNT(*)
FROM (VALUES('a'), ('a'), ('b'), ('b')) t(actors)
GROUP BY actors

Answer:

count
-----
2

Use DISTINCT to remove duplicates with more than one GROUP BY column

Another case, of course, is this one:

SELECT DISTINCT actors, COUNT(*)
FROM (VALUES('a', 1), ('a', 1), ('b', 1), ('b', 2)) t(actors, id)
GROUP BY actors, id

With DISTINCT:

actors  count
-------------
a       2
b       1

Without DISTINCT:

actors  count
-------------
a       2
b       1
b       1

For more details, I’ve written some blog posts, e.g. about GROUPING SETS and how they influence the GROUP BY operation, or about the logical order of SQL operations (as opposed to the lexical order of operations).

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)