Add an empty column to Spark DataFrame
All you need here is to import StringType and use lit and cast:

    from pyspark.sql.types import StringType
    from pyspark.sql.functions import lit

    new_df = old_df.withColumn('new_column', lit(None).cast(StringType()))

A full example:

    from pyspark.sql import Row

    # Assumption: the excerpt does not define row; the field names are taken
    # from the schema output below. sc is an existing SparkContext.
    row = Row("foo", "bar")

    df = sc.parallelize([row(1, "2"), row(2, "3")]).toDF()
    df.printSchema()
    # root
    #  |-- foo: long (nullable = true)
    #  |-- bar: string (nullable = true)

    new_df = df.withColumn('new_column', lit(None).cast(StringType()))
    new_df.printSchema()
    …