Can I optimize a SELECT DISTINCT x FROM hugeTable query by creating an index on column x?

Answer

This is likely not a problem of indexing, but one of data design: normalization, to be precise. The fact that you need to query the distinct values of a field, and are even willing to add an index to do so, is a strong indicator that the field should be normalized into a separate table with a (small) join key. The distinct values are then available immediately by scanning the much smaller lookup table.
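A minimal sketch of what that normalization could look like, in T-SQL. The names dbo.xLookup, dbo.hugeTableNormalized, and xId are hypothetical, and the varchar(100) type is only an assumption about x; adjust both to the real schema.

```sql
-- Hypothetical names and types throughout; adapt to the real schema.
-- Lookup table: one row per distinct value of x.
create table dbo.xLookup (
    xId int identity(1,1) not null primary key,
    x   varchar(100) not null unique   -- assumed type; use the real type of x
);

-- The large table stores only the small join key instead of x itself.
create table dbo.hugeTableNormalized (
    id  bigint identity(1,1) not null primary key,
    xId int not null references dbo.xLookup(xId)
    -- ... remaining columns of the original table
);

-- The distinct values now come from scanning the small lookup table.
select x from dbo.xLookup;
```

Note that this query returns every value in the lookup table, including any that are no longer referenced by a row in the large table.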

Update
As a workaround, you can create an indexed view on an aggregate by the ‘distinct’ field. COUNT_BIG is an aggregate that is allowed in indexed views:

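-- The aggregate column needs a name, or CREATE VIEW fails.
create view vwDistinct
with schemabinding
as select x, count_big(*) as cnt
from schema.hugetable
group by x;
go

-- The first index on a view must be unique and clustered.
create unique clustered index cdxDistinct on vwDistinct(x);
go

-- NOEXPAND makes the query read the view's index directly.
select x from vwDistinct with (noexpand);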
