Although I still prefer dplyr syntax, this code snippet will do:
import pyspark.sql.functions as sf

# Sum the money column within each group; show() prints up to 100 rows.
(df.groupBy("group")
   .agg(sf.sum("money").alias("money"))
   .show(100))
It gets verbose.
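For reference, here is a minimal self-contained sketch. The local SparkSession and the toy data are assumptions, included only so the snippet above runs as-is:

from pyspark.sql import SparkSession
import pyspark.sql.functions as sf

# Assumption: a local session and a toy DataFrame standing in for df.
spark = SparkSession.builder.master("local[*]").getOrCreate()
df = spark.createDataFrame(
    [("a", 10), ("a", 5), ("b", 7)],
    ["group", "money"],
)

# Same aggregation as above: total money per group.
(df.groupBy("group")
   .agg(sf.sum("money").alias("money"))
   .show(100))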