Even though sys.argv is a good solution, I still prefer this more proper way of handling command-line arguments in my PySpark jobs:
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--ngrams", help="some useful description.")
args = parser.parse_args()

# args.ngrams is None when --ngrams is not passed
if args.ngrams:
    ngrams = args.ngrams
This way, you can launch your job as follows:
spark-submit job.py --ngrams 3
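One thing to keep in mind with the snippet above: argparse hands values over as strings unless told otherwise, so --ngrams 3 arrives as "3". Below is a minimal sketch of a variant that converts the value to an integer and falls back to a default. The helper name parse_job_args and the default of 2 are my own assumptions for illustration, not part of the original job.

```python
import argparse


def parse_job_args(argv=None):
    """Parse the job's arguments (hypothetical helper, name is illustrative).

    argv=None makes argparse read sys.argv[1:], which is what you want
    when the script is launched through spark-submit.
    """
    parser = argparse.ArgumentParser(description="Example PySpark job")
    parser.add_argument(
        "--ngrams",
        type=int,    # convert "3" to 3 instead of keeping a string
        default=2,   # assumed default, chosen only for this example
        help="size of the n-grams to generate",
    )
    return parser.parse_args(argv)


# Passing an explicit list mimics: spark-submit job.py --ngrams 3
args = parse_job_args(["--ngrams", "3"])
print(args.ngrams)
```

In the real job you would call parse_job_args() with no argument so that the values come from the spark-submit command line.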
More information about the argparse module can be found in the Argparse Tutorial.