pyspark: NameError: name 'spark' is not defined
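The spark variable is only predefined in the interactive pyspark shell; in a standalone script you have to create it yourself. You can add the following to the beginning of your code to define a SparkSession:

    from pyspark.context import SparkContext
    from pyspark.sql.session import SparkSession
    sc = SparkContext('local')
    spark = SparkSession(sc)

After that, spark.createDataFrame() should work.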