How do I install pyspark for use in standalone scripts?

From Spark 2.2.0 onwards, use pip install pyspark to install PySpark on your machine.
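
To confirm from a standalone script that the pip installation worked, a small check like the one below may help. It is only a sketch: the helper name pyspark_available is made up for illustration, and it merely tests whether the package can be found on the import path, not whether Spark itself starts.

```python
import importlib.util


def pyspark_available() -> bool:
    """Return True if the pyspark package can be found on the import path."""
    # find_spec returns None when no top-level module with that name exists.
    return importlib.util.find_spec("pyspark") is not None


if __name__ == "__main__":
    if pyspark_available():
        print("pyspark is importable")
    else:
        print("pyspark not found; try: pip install pyspark")
```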

For older versions, follow these steps. Add the PySpark library to the Python path in your .bashrc:

export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH

Also, don’t forget to set SPARK_HOME.
PySpark depends on the py4j Python package, so install it as follows:

pip install py4j
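
If editing .bashrc is not convenient, the same path setup can be approximated from inside the script itself. The following is a minimal sketch, not an official API: add_pyspark_to_path is a hypothetical helper, and it assumes that SPARK_HOME points at an unpacked Spark distribution whose Python sources live in its python/ directory, possibly with a bundled py4j archive under python/lib/.

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_home: str) -> list:
    """Prepend Spark's Python sources (and bundled py4j, if any) to sys.path."""
    python_dir = os.path.join(spark_home, "python")
    # Assumption: the distribution bundles py4j as python/lib/py4j-*-src.zip.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    added = [python_dir] + py4j_zips
    # Insert in reverse so python_dir ends up first on sys.path.
    for entry in reversed(added):
        if entry not in sys.path:
            sys.path.insert(0, entry)
    return added


if __name__ == "__main__":
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home:
        print(add_pyspark_to_path(spark_home))
    else:
        print("SPARK_HOME is not set")
```

Call the helper before the first import of pyspark in the script. If no bundled py4j archive is found, the separately installed py4j package is used instead.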

For more details about standalone PySpark applications, refer to this post.
