Spark SQL replacement for MySQL’s GROUP_CONCAT aggregate function
Before you proceed: this operation is yet another groupByKey. While it has multiple legitimate applications, it is relatively expensive, so be sure to use it only when required.

It is not exactly a concise or efficient solution, but you can use UserDefinedAggregateFunction, introduced in Spark 1.5.0:

object GroupConcat extends UserDefinedAggregateFunction { def inputSchema = new StructType().add("x", StringType) …
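Since the snippet above is cut off, here is a separate minimal sketch of one way to get GROUP_CONCAT-like output. It is not the UserDefinedAggregateFunction from the excerpt. It swaps in the built-in collect_list and concat_ws functions from org.apache.spark.sql.functions, and it assumes Spark 2.0 or later for SparkSession. The names GroupConcatSketch and groupConcat and the sample columns k and v are hypothetical, chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, concat_ws}

object GroupConcatSketch {

  // Hypothetical helper: for each distinct value of `key`, join the string
  // values of `value` with `sep`. The result has two columns: `key` and `value`.
  def groupConcat(df: DataFrame, key: String, value: String, sep: String): DataFrame =
    df.groupBy(col(key))
      .agg(concat_ws(sep, collect_list(col(value))).as(value))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for a demonstration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("group-concat-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data, not taken from the original post.
    val df = Seq(("a", "x"), ("a", "y"), ("b", "z")).toDF("k", "v")

    // Roughly: SELECT k, GROUP_CONCAT(v) AS v FROM df GROUP BY k
    groupConcat(df, "k", "v", ",").show()

    spark.stop()
  }
}
```

This still gathers every value of a group in one place, so the cost warning above applies to it as well. Spark does not guarantee the order of elements returned by collect_list after a shuffle, so the concatenated string may come out in a different order between runs.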