How to print the contents of an RDD?
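To view the contents of an RDD, one option is collect(), which returns every element to the driver as a local array:

myRDD.collect().foreach(println)

That is not a good idea when the RDD has billions of elements, because the entire dataset has to fit in the driver's memory. Use take(n) instead to fetch only the first n elements and print those:

myRDD.take(n).foreach(println)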
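As a minimal sketch of the take() approach, the program below builds a small RDD in a local Spark session and prints only its first few elements. The object name PrintRddSample, the helper preview, the local[*] master, and the sample data are all assumptions made for illustration; in a real job myRDD would come from your own pipeline.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object PrintRddSample {
  // Hypothetical helper: returns at most n elements to the driver.
  // take(n) stops reading partitions once it has collected n elements.
  def preview[T](rdd: RDD[T], n: Int): Array[T] = rdd.take(n)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("PrintRddSample")
      .master("local[*]")
      .getOrCreate()

    // Stand-in for myRDD; replace with the RDD you actually want to inspect.
    val myRDD: RDD[Int] = spark.sparkContext.parallelize(1 to 1000)

    // Only the sampled elements are shipped to the driver, then printed there.
    preview(myRDD, 5).foreach(println)

    spark.stop()
  }
}
```

Because both collect() and take(n) return a local array, the println calls run on the driver and the output appears in the driver's console. Calling myRDD.foreach(println) directly is different: on a cluster it runs on the executors, so the output typically ends up in the executors' stdout rather than on the driver.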