Spark load data and add filename as dataframe column

You can use input_file_name, which the Spark documentation describes as follows:

Creates a string column for the file name of the current Spark task.

from pyspark.sql.functions import input_file_name

df.withColumn("filename", input_file_name())

Same thing in Scala:

import org.apache.spark.sql.functions.input_file_name

df.withColumn("filename", input_file_name())
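
input_file_name typically returns the full path or URI of the source file (for example file:///tmp/data/part-0001.csv), so a common follow-up is to keep only the last path segment. Below is a minimal PySpark sketch of that idea, not a definitive implementation. The helper name with_source_file, the column names "filename" and "basename", and the constant BASENAME_PATTERN are hypothetical and chosen for illustration only; input_file_name and regexp_extract are the real functions from pyspark.sql.functions.

```python
import re

# Hypothetical constant: matches the last path segment (everything after the final "/").
BASENAME_PATTERN = r"[^/]+$"


def with_source_file(df, full_col="filename", base_col="basename"):
    """Return a new DataFrame with the source file path and its base name.

    Assumes `df` was read from files (e.g. spark.read.csv on a directory),
    otherwise input_file_name() yields an empty string.
    """
    # Imported lazily so the module can be loaded without a Spark installation.
    from pyspark.sql.functions import input_file_name, regexp_extract

    return (
        df.withColumn(full_col, input_file_name())
          # Group index 0 means "the whole match", i.e. the last path segment.
          .withColumn(base_col, regexp_extract(full_col, BASENAME_PATTERN, 0))
    )
```

Assuming an existing SparkSession named spark and a hypothetical input directory, with_source_file(spark.read.csv("data/*.csv", header=True)) would return the original columns plus the two new ones. Because DataFrames are immutable, the result has to be assigned to a variable or used directly.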
