Spark SQL replacement for MySQL’s GROUP_CONCAT aggregate function

Before you proceed: this operation is yet another groupByKey. While it has multiple legitimate applications, it is relatively expensive, so be sure to use it only when required. It is not exactly a concise or efficient solution, but you can use UserDefinedAggregateFunction, introduced in Spark 1.5.0: object GroupConcat extends UserDefinedAggregateFunction { def inputSchema = new StructType().add("x", StringType) …
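To make the target behaviour concrete, here is a minimal plain-Python sketch of what a GROUP_CONCAT-style aggregate computes. It is not the Spark implementation from the answer; the function name `group_concat` and the sample rows are hypothetical and used only for illustration.

```python
from collections import defaultdict


def group_concat(rows, sep=","):
    """Collect values per key and join them, similar to MySQL's GROUP_CONCAT.

    `rows` is an iterable of (key, value) pairs; values are joined in the
    order in which they are encountered.
    """
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(str(value))
    return {key: sep.join(values) for key, values in groups.items()}


# Hypothetical sample data, for illustration only.
rows = [("a", "x"), ("b", "y"), ("a", "z")]
result = group_concat(rows)
print(result)
```

Every value of a group has to be gathered in one place before it can be joined, which is why the answer compares the operation to groupByKey and warns about its cost.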

Return multiple columns of the same row as JSON array of objects

In Postgres 9.4 or newer, use json_build_object(), or jsonb_build_object() to return jsonb: SELECT value_two, json_agg(json_build_object('value_three', value_three, 'value_four', value_four)) AS value_four FROM mytable GROUP BY value_two; The manual: "Builds a JSON object out of a variadic argument list. By convention, the argument list consists of alternating keys and values." For any version (incl. Postgres 9.3) …

group by first character

Your query is wrong: you need to apply an aggregate function to EMPLOYEE_ID if you want that to work. For example: select substr(first_name,1,1) as alpha, count(employee_id) from employees group by substr(first_name,1,1) What exactly are you trying to accomplish?
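The corrected query can be tried out with Python's built-in sqlite3 module as a stand-in database. This is only a sketch: the question does not say which database is in use, the table contents below are invented, and an ORDER BY is added to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, first_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "Alice"), (2, "Adam"), (3, "Bob")],
)

# Group by the first character and count the employees in each group.
counts = conn.execute(
    "SELECT substr(first_name, 1, 1) AS alpha, count(employee_id) "
    "FROM employees "
    "GROUP BY substr(first_name, 1, 1) "
    "ORDER BY alpha"
).fetchall()
conn.close()

print(counts)
```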

PostgreSQL: running count of rows for a query ‘by minute’

Return only minutes with activity. Shortest: SELECT DISTINCT date_trunc('minute', "when") AS minute, count(*) OVER (ORDER BY date_trunc('minute', "when")) AS running_ct FROM mytable ORDER BY 1; Use date_trunc(); it returns exactly what you need. Don't include id in the query, since you want to GROUP BY minute slices. count() is typically used as a plain aggregate …
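The logic of that query (truncate each timestamp to the minute, then keep a cumulative count over the sorted minutes) can be sketched in plain Python. This is an illustration of the idea rather than a replacement for the SQL; the function name `running_count_by_minute` and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime
from itertools import accumulate


def running_count_by_minute(timestamps):
    """Return (minute, running_count) pairs for minutes that have activity."""
    # Truncate to the minute, as date_trunc('minute', ...) does.
    per_minute = Counter(
        ts.replace(second=0, microsecond=0) for ts in timestamps
    )
    minutes = sorted(per_minute)
    # Cumulative sum of the per-minute counts, in chronological order.
    totals = accumulate(per_minute[m] for m in minutes)
    return list(zip(minutes, totals))


# Hypothetical sample events, for illustration only.
events = [
    datetime(2024, 1, 1, 12, 0, 5),
    datetime(2024, 1, 1, 12, 0, 40),
    datetime(2024, 1, 1, 12, 2, 10),
]
running = running_count_by_minute(events)
for minute, total in running:
    print(minute, total)
```

Minutes without any rows (12:01 in the sample) do not appear in the result, matching the "only minutes with activity" variant.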

LINQ aggregate and group by periods of time

You could round the time stamp down to the closest 5-minute boundary in the past and use that as your grouping key: var groups = series.GroupBy(x => { var stamp = x.timestamp; stamp = stamp.AddMinutes(-(stamp.Minute % 5)); stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second); return stamp; }) .Select(g => new …
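The same bucketing idea can be sketched in Python for readers who want to experiment outside LINQ. This is a hedged illustration, not a translation of the full answer: the helper names `floor_to_5_minutes` and `group_by_5_minutes` and the (timestamp, value) input shape are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def floor_to_5_minutes(stamp):
    """Round a datetime down to the closest 5-minute boundary in the past."""
    return stamp.replace(
        minute=stamp.minute - stamp.minute % 5, second=0, microsecond=0
    )


def group_by_5_minutes(series):
    """Group (timestamp, value) pairs by their 5-minute bucket."""
    groups = defaultdict(list)
    for stamp, value in series:
        groups[floor_to_5_minutes(stamp)].append(value)
    return dict(groups)


# Hypothetical sample series, for illustration only.
series = [
    (datetime(2024, 1, 1, 12, 1, 30), 10),
    (datetime(2024, 1, 1, 12, 4, 59), 20),
    (datetime(2024, 1, 1, 12, 7, 0), 30),
]
buckets = group_by_5_minutes(series)
print(buckets)
```

Each bucket can then be aggregated (sum, average, and so on) in the same way the Select step does in the LINQ version.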

How to SUM and SUBTRACT using SQL?

I think this is what you're looking for. NEW_BAL is the balance minus the sum of the QTYs: SELECT master_table.ORDERNO, master_table.ITEM, SUM(master_table.QTY), stock_bal.BAL_QTY, (stock_bal.BAL_QTY - SUM(master_table.QTY)) AS NEW_BAL FROM master_table INNER JOIN stock_bal ON master_table.ITEM = stock_bal.ITEM GROUP BY master_table.ORDERNO, master_table.ITEM If you want to update the item balance with the new balance, use the …
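A quick way to check the query is to run it against an in-memory SQLite database from Python. This is a sketch under assumptions: the column types and sample rows are invented, the join condition uses master_table.ITEM, and SQLite is only a stand-in for whatever database the question targets. Stricter databases may also require stock_bal.BAL_QTY in the GROUP BY clause.

```python
import sqlite3

# In-memory SQLite stand-in; schema details and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master_table (ORDERNO INTEGER, ITEM TEXT, QTY INTEGER)")
conn.execute("CREATE TABLE stock_bal (ITEM TEXT, BAL_QTY INTEGER)")
conn.executemany(
    "INSERT INTO master_table VALUES (?, ?, ?)",
    [(1, "widget", 3), (1, "widget", 2), (2, "gadget", 4)],
)
conn.executemany(
    "INSERT INTO stock_bal VALUES (?, ?)",
    [("widget", 10), ("gadget", 6)],
)

# Sum the ordered quantities per order and item, then subtract from the balance.
balances = conn.execute(
    "SELECT master_table.ORDERNO, master_table.ITEM, SUM(master_table.QTY), "
    "stock_bal.BAL_QTY, "
    "(stock_bal.BAL_QTY - SUM(master_table.QTY)) AS NEW_BAL "
    "FROM master_table "
    "INNER JOIN stock_bal ON master_table.ITEM = stock_bal.ITEM "
    "GROUP BY master_table.ORDERNO, master_table.ITEM "
    "ORDER BY master_table.ORDERNO"
).fetchall()
conn.close()

print(balances)
```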

DISTINCT ON in an aggregate function in postgres

The simplest thing I discovered is to use DISTINCT over jsonb (not json!); jsonb_build_object creates jsonb objects: SELECT JSON_AGG(DISTINCT jsonb_build_object('tag_id', photo_tag.tag_id, 'name', tag.name)) AS tags FROM photo LEFT OUTER JOIN comment ON comment.photo_id = photo.photo_id LEFT OUTER JOIN photo_tag ON photo_tag.photo_id = photo.photo_id LEFT OUTER JOIN tag ON photo_tag.tag_id = tag.tag_id GROUP BY …

in postgres select, return a column subquery as an array?

Use the aggregate function: select usr_id, name, array_agg(tag_id) as tag_arr from users join tags using(usr_id) group by usr_id, name Or use an array constructor on the results of a subquery: select u.usr_id, name, array(select tag_id from tags t where t.usr_id = u.usr_id) as tag_arr from users u The second option is a simple one-source …
