Use select and show:
df.select("col").show()
or select, flatMap, and collect:
df.select("col").rdd.flatMap(list).collect()
Bracket notation (df[df.col]) is used only for logical slicing, and a column by itself (df.col) is not a distributed data structure but a SQL expression, so it cannot be collected.