Restrict results to top N rows per group

You want to find top n rows per group. This answer provides a generic solution using example data that is different from OP.

In MySQL 8 or later you can use the ROW_NUMBER, RANK or DENSE_RANK function depending on the exact definition of top 5. Below are the numbers generated by these functions based on value sorted descending. Notice how ties are handled:

pkid catid value row_number rank dense_rank
1 p01 100 *1 *1 *1
2 p01 90 *2 *2 *2
3 p01 90 *3 *2 *2
4 p01 80 *4 *4 *3
5 p01 80 *5 *4 *3
6 p01 80 6 *4 *3
7 p01 70 7 7 *4
8 p01 60 8 8 *5
9 p01 50 9 9 6
10 p01 40 10 10 7

Once you have chosen the function, use it like so:

SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY catid ORDER BY value DESC) AS n
    FROM t
) AS x
WHERE n <= 5

DB<>Fiddle


In MySQL 5.x you can use poor man’s rank over partition to achieve desired result: outer join the table with itself and for each row, count the number of rows before it (e.g. the before row could be the one with higher value).

The following will produce results similar to RANK function:

SELECT t.pkid, t.catid, t.value, COUNT(b.value) + 1 AS rank
FROM t
LEFT JOIN t AS b ON b.catid = t.catid AND b.value > t.value
GROUP BY t.pkid, t.catid, t.value
HAVING COUNT(b.value) + 1 <= 5
ORDER BY t.catid, t.value DESC, t.pkid

Make the following change to produce results similar to DENSE_RANK function:

COUNT(DISTINCT b.value)

Or make the following change to produce results similar to ROW_NUMBER function:

ON b.catid = t.catid AND (b.value > t.value OR b.value = t.value AND b.pkid < t.pkid)

DB<>Fiddle

Leave a Comment

tech