greatest-n-per-group – Tarik Billa

SQL: how to limit a join on the first found row?

April 10, 2024 by Tarik

The key word here is FIRST. You can use the analytic function FIRST_VALUE or the aggregate construct FIRST. For FIRST or LAST, performance is never worse, and frequently better, than with the equivalent FIRST_VALUE or LAST_VALUE construct, because there is no superfluous window sort and consequently a lower execution cost: select table_A.id, table_A.name, firstFromB.city … Read more

How to get the last record per group in SQL

April 5, 2024 by Tarik

You can use a ranking function and a common table expression. WITH e AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY ApplicationId ORDER BY CONVERT(datetime, [Date], 101) DESC, [Time] DESC ) AS Recency FROM [Event] ) SELECT * FROM e WHERE Recency = 1
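As a minimal sketch of the ROW_NUMBER approach above, the same pattern can be run against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) stands in for the original dialect here, and the table, column names, and sample rows are hypothetical; dates are stored as ISO strings so that plain text ordering matches chronological order.

```python
import sqlite3

# Hypothetical schema and data, not taken from the original question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (
        application_id INTEGER,
        event_date     TEXT,   -- ISO 8601, sorts chronologically as text
        event_time     TEXT,
        note           TEXT
    );
    INSERT INTO event VALUES
        (1, '2024-01-01', '09:00', 'a-old'),
        (1, '2024-01-02', '08:00', 'a-new'),
        (2, '2024-01-01', '10:00', 'b-early'),
        (2, '2024-01-01', '11:30', 'b-late');
""")

# Number the rows of each application from newest to oldest, keep number 1.
latest = conn.execute("""
    WITH e AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY application_id
                   ORDER BY event_date DESC, event_time DESC
               ) AS recency
        FROM event
    )
    SELECT application_id, note
    FROM e
    WHERE recency = 1
    ORDER BY application_id
""").fetchall()

print(latest)  # [(1, 'a-new'), (2, 'b-late')]
```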

Restrict results to top N rows per group

January 10, 2024 by Tarik

You want to find the top n rows per group. This answer provides a generic solution using example data that differs from the OP's. In MySQL 8 or later you can use the ROW_NUMBER, RANK or DENSE_RANK function depending on the exact definition of top 5. Below are the numbers generated by these functions based on … Read more
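To make the difference between the three ranking functions concrete, here is a small sketch. It uses SQLite (3.25 or later) through Python rather than MySQL 8, on the assumption that the three functions behave the same way for this purpose; the table and values are hypothetical.

```python
import sqlite3

# Hypothetical data: group 'a' has a tie at the top, group 'b' has one row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (grp TEXT, val INTEGER);
    INSERT INTO scores VALUES
        ('a', 10), ('a', 10), ('a', 8), ('a', 7),
        ('b', 5);
""")

rows = conn.execute("""
    SELECT grp, val,
           ROW_NUMBER() OVER w AS rn,   -- always 1, 2, 3, ... within a group
           RANK()       OVER w AS rk,   -- ties share a rank, then a gap follows
           DENSE_RANK() OVER w AS drk   -- ties share a rank, no gap
    FROM scores
    WINDOW w AS (PARTITION BY grp ORDER BY val DESC)
    ORDER BY grp, rn
""").fetchall()

for row in rows:
    print(row)
# ('a', 10, 1, 1, 1)
# ('a', 10, 2, 1, 1)
# ('a', 8, 3, 3, 2)
# ('a', 7, 4, 4, 3)
# ('b', 5, 1, 1, 1)
```

Filtering on `rn <= n` returns exactly n rows per group, while filtering on `rk` or `drk` can return more than n rows when there are ties.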

Get most common value for each value of another column in SQL

December 1, 2023 by Tarik

It is now even simpler: PostgreSQL 9.4 introduced the mode() function:

select country, mode() within group (order by food_id) from munch group by country

returns (like user2247323’s example):

country | mode
--------+------
GB      | 3
US      | 1

See documentation here: https://wiki.postgresql.org/wiki/Aggregate_Mode https://www.postgresql.org/docs/current/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE

MySQL select distinct

September 24, 2023 by Tarik

DISTINCT is not a function that applies only to some columns. It’s a query modifier that applies to all columns in the select-list. That is, DISTINCT reduces rows only if all columns are identical to the columns of another row. DISTINCT must follow immediately after SELECT (along with other query modifiers, like SQL_CALC_FOUND_ROWS). Then following … Read more
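A quick way to see that DISTINCT works on whole rows of the select-list is the sketch below. It runs on SQLite through Python rather than MySQL, assuming the row-level semantics of DISTINCT are the same; the table and data are hypothetical.

```python
import sqlite3

# Hypothetical table with one fully duplicated row and one partial duplicate.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b TEXT);
    INSERT INTO t VALUES (1, 'x'), (1, 'x'), (1, 'y');
""")

# Both columns are compared: (1, 'x') collapses, (1, 'y') survives.
both = conn.execute("SELECT DISTINCT a, b FROM t ORDER BY a, b").fetchall()

# Only when column a is the whole select-list does a single row remain.
only_a = conn.execute("SELECT DISTINCT a FROM t").fetchall()

print(both)    # [(1, 'x'), (1, 'y')]
print(only_a)  # [(1,)]
```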

Select rows based on last date

August 22, 2023 by Tarik

In PostgreSQL, to get unique rows for a defined set of columns, the preferable technique is generally DISTINCT ON: SELECT DISTINCT ON ("ID") * FROM "Course" ORDER BY "ID", "Course Date" DESC NULLS LAST, "Course Name"; Assuming you actually use those unfortunate upper case identifiers with spaces. You get exactly one row per ID this … Read more

Selecting most recent and specific version in each group of records, for multiple groups

August 12, 2023 by Tarik

To get only latest revisions: SELECT * from t t1 WHERE t1.rev = (SELECT max(rev) FROM t t2 WHERE t2.id = t1.id) To get a specific revision, in this case 1 (and if an item doesn’t have the revision yet the next smallest revision): SELECT * from foo t1 WHERE t1.rev = (SELECT max(rev) FROM … Read more
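The first query above (latest revision per id) can be tried out with the following sketch, which uses SQLite through Python. The column `content` and the sample rows are hypothetical, and only the "latest revisions" query is illustrated.

```python
import sqlite3

# Hypothetical revision table: id 1 has two revisions, id 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, rev INTEGER, content TEXT);
    INSERT INTO t VALUES
        (1, 1, 'a1'),
        (1, 2, 'a2'),
        (2, 1, 'b1');
""")

# The correlated subquery is evaluated per outer row and yields the
# highest rev for that row's id; only rows matching it are kept.
latest_revs = conn.execute("""
    SELECT *
    FROM t t1
    WHERE t1.rev = (SELECT max(rev) FROM t t2 WHERE t2.id = t1.id)
    ORDER BY t1.id
""").fetchall()

print(latest_revs)  # [(1, 2, 'a2'), (2, 1, 'b1')]
```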

How to SELECT the newest four items per category?

August 12, 2023 by Tarik

This is the greatest-n-per-group problem, and it’s a very common SQL question. Here’s how I solve it with outer joins: SELECT i1.* FROM item i1 LEFT OUTER JOIN item i2 ON (i1.category_id = i2.category_id AND i1.item_id < i2.item_id) GROUP BY i1.item_id HAVING COUNT(*) < 4 ORDER BY category_id, date_listed; I’m assuming the primary key of … Read more
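The outer-join query can be checked with a small sketch like the one below, run on SQLite through Python. The sample rows are hypothetical, and the sketch assumes that a higher `item_id` means a newer item. The ORDER BY columns are qualified with `i1` here to avoid any ambiguity between the two aliases of the same table.

```python
import sqlite3

# Hypothetical data: category 1 has six items, category 2 has two.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        item_id     INTEGER PRIMARY KEY,
        category_id INTEGER,
        date_listed TEXT
    );
    INSERT INTO item VALUES
        (1, 1, '2023-01-01'), (2, 1, '2023-01-02'), (3, 1, '2023-01-03'),
        (4, 1, '2023-01-04'), (5, 1, '2023-01-05'), (6, 1, '2023-01-06'),
        (7, 2, '2023-02-01'), (8, 2, '2023-02-02');
""")

# Each i1 row is joined to every newer row (i2) in the same category.
# An item with fewer than four join rows has at most three newer items,
# so it is among the newest four of its category.
newest_four = conn.execute("""
    SELECT i1.item_id, i1.category_id
    FROM item i1
    LEFT OUTER JOIN item i2
        ON (i1.category_id = i2.category_id AND i1.item_id < i2.item_id)
    GROUP BY i1.item_id
    HAVING COUNT(*) < 4
    ORDER BY i1.category_id, i1.date_listed
""").fetchall()

print(newest_four)
# [(3, 1), (4, 1), (5, 1), (6, 1), (7, 2), (8, 2)]
```

Note that the newest item in a category still produces one joined row (with NULLs for `i2`), so `COUNT(*)` is 1 for it rather than 0; the `< 4` threshold still selects exactly the newest four.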

Selecting rows ordered by some column and distinct on another

June 15, 2023 by Tarik

Quite a clear question 🙂 SELECT t1.* FROM purchases t1 LEFT JOIN purchases t2 ON t1.address_id = t2.address_id AND t1.purchased_at < t2.purchased_at WHERE t2.purchased_at IS NULL ORDER BY t1.purchased_at DESC And most likely a faster approach: SELECT t1.* FROM purchases t1 JOIN ( SELECT address_id, max(purchased_at) max_purchased_at FROM purchases GROUP BY address_id ) t2 ON … Read more
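The first query above (the self-join that keeps rows with no later purchase for the same address) can be sketched as follows, using SQLite through Python. The `id` column and the sample rows are hypothetical, and only the LEFT JOIN variant is illustrated.

```python
import sqlite3

# Hypothetical data: address 10 has two purchases, address 20 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (id INTEGER, address_id INTEGER, purchased_at TEXT);
    INSERT INTO purchases VALUES
        (1, 10, '2023-01-01'),
        (2, 10, '2023-03-01'),
        (3, 20, '2023-02-01');
""")

# t2 matches any later purchase for the same address. When no such row
# exists, the LEFT JOIN fills t2 with NULLs, which marks t1 as the latest.
latest_purchases = conn.execute("""
    SELECT t1.id, t1.address_id
    FROM purchases t1
    LEFT JOIN purchases t2
        ON t1.address_id = t2.address_id
       AND t1.purchased_at < t2.purchased_at
    WHERE t2.purchased_at IS NULL
    ORDER BY t1.purchased_at DESC
""").fetchall()

print(latest_purchases)  # [(2, 10), (3, 20)]
```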