What’s the difference between => , ()=>, and Unit=>

Call-by-Name: => Type
The => Type notation stands for call-by-name, which is one of the many ways parameters can be passed. If you aren’t familiar with them, I recommend taking some time to read that Wikipedia article, even though nowadays it is mostly call-by-value and call-by-reference. What it means is that what is passed is … Read more
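As a minimal sketch of the three parameter styles named in the question title, the following compares a by-value parameter, a by-name parameter (=> Int), and an explicit zero-argument function (() => Int). The object name, the counter, and the helper names are all hypothetical and exist only for illustration.

```scala
object CallByNameDemo {
  // Counts how many times the argument expression has been evaluated.
  var evaluations = 0

  def next(): Int = { evaluations += 1; evaluations }

  // By-value: the argument is evaluated once, before the method body runs.
  def twiceByValue(x: Int): Int = x + x

  // By-name: the argument expression is re-evaluated at every use of x.
  def twiceByName(x: => Int): Int = x + x

  // Explicit function: the caller passes () => ..., the callee applies x().
  def twiceThunk(x: () => Int): Int = x() + x()

  def main(args: Array[String]): Unit = {
    println(twiceByValue(next()))      // next() runs once: 1 + 1 = 2
    println(twiceByName(next()))       // next() runs twice: 2 + 3 = 5
    println(twiceThunk(() => next()))  // next() runs twice: 4 + 5 = 9
  }
}
```

The by-name and () => Int versions behave the same here; the difference is at the call site, where the by-name version accepts a plain expression and the function version needs an explicit () => wrapper.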

How to write to a file in Scala?

This is one of the features missing from standard Scala that I have found so useful that I add it to my personal library. (You probably should have a personal library, too.) The code goes like so:
  def printToFile(f: java.io.File)(op: java.io.PrintWriter => Unit) {
    val p = new java.io.PrintWriter(f)
    try { op(p) } finally { … Read more
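The excerpt above is cut off, so the following is not the author's remaining code. It is a separate, minimal sketch of the same general idea (lend a PrintWriter to a block and always close it afterwards), under the assumption that closing in a finally clause is the desired cleanup. The names FileWriteDemo and withPrintWriter are hypothetical.

```scala
import java.io.{File, PrintWriter}

object FileWriteDemo {
  // Hypothetical helper: opens a PrintWriter on f, lends it to op,
  // and closes it even if op throws.
  def withPrintWriter(f: File)(op: PrintWriter => Unit): Unit = {
    val p = new PrintWriter(f)
    try op(p) finally p.close()
  }

  def main(args: Array[String]): Unit = {
    // A temporary file keeps the example free of fixed paths.
    val tmp = File.createTempFile("demo", ".txt")
    tmp.deleteOnExit()

    withPrintWriter(tmp) { p =>
      Seq("first line", "second line").foreach(line => p.println(line))
    }
    println(s"Wrote ${tmp.length} bytes to ${tmp.getPath}")
  }
}
```

The second parameter list lets the call site read like a built-in block, which is why the helper is curried.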

How to store custom objects in Dataset?

Update This answer is still valid and informative, although things are now better since 2.2/2.3, which adds built-in encoder support for Set, Seq, Map, Date, Timestamp, and BigDecimal. If you stick to making types with only case classes and the usual Scala types, you should be fine with just the implicit in SQLImplicits. Unfortunately, virtually … Read more

Write single CSV file using spark-csv

It is creating a folder with multiple files, because each partition is saved individually. If you need a single output file (still in a folder) you can repartition (preferred if upstream data is large, but requires a shuffle):
  df
    .repartition(1)
    .write.format("com.databricks.spark.csv")
    .option("header", "true")
    .save("mydata.csv")
or coalesce:
  df
    .coalesce(1)
    .write.format("com.databricks.spark.csv")
    .option("header", "true")
    .save("mydata.csv")
data frame before … Read more

Use of def, val, and var in scala

There are three ways of defining things in Scala:
  def defines a method
  val defines a fixed value (which cannot be modified)
  var defines a variable (which can be modified)
Looking at your code:
  def person = new Person("Kumar",12)
This defines a new method called person. You can call this method only without () because … Read more
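To make the three kinds of definition concrete, here is a small sketch. The Person class is hypothetical (its field names are assumptions, chosen to match the constructor call in the excerpt), as are the names viaDef, viaVal, and viaVar.

```scala
object DefValVarDemo {
  // Hypothetical Person class; the field names are assumptions.
  class Person(val name: String, var age: Int)

  def viaDef: Person = new Person("Kumar", 12) // body runs on every access
  val viaVal: Person = new Person("Kumar", 12) // evaluated once, reference is fixed
  var viaVar: Person = new Person("Kumar", 12) // evaluated once, reference can change

  def main(args: Array[String]): Unit = {
    viaDef.age = 20       // mutates a fresh instance that is discarded right away
    println(viaDef.age)   // another fresh instance, so this prints 12

    viaVal.age = 20       // mutates the single stored instance
    println(viaVal.age)   // prints 20
    // viaVal = new Person("Other", 1)  // would not compile: reassignment to val

    viaVar = new Person("Other", 30)    // allowed: a var can be reassigned
    println(viaVar.name)  // prints Other
  }
}
```

Note that val fixes the reference, not the object: the age field can still change because it is itself declared as a var inside Person.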

Spark – load CSV file as DataFrame?

spark-csv is part of core Spark functionality and doesn’t require a separate library. So you could just do, for example:
  df = spark.read.format("csv").option("header", "true").load("csvfile.csv")
In Scala (this works for any delimited format: set the delimiter to "," for CSV, "\t" for TSV, etc.):
  val df = sqlContext.read.format("com.databricks.spark.csv")
    .option("delimiter", ",")
    .load("csvfile.csv")

Why does the Scala compiler disallow overloaded methods with default arguments?

I’d like to cite Lukas Rytz (from here): The reason is that we wanted a deterministic naming-scheme for the generated methods which return default arguments. If you write def f(a: Int = 1) the compiler generates def f$default$1 = 1 If you have two overloads with defaults on the same parameter position, we would need … Read more
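The naming scheme described in the quote can be observed with plain Java reflection. This is only a sketch: the object and helper names are hypothetical, and it assumes the compiler emits the default getter as a public method on the enclosing object's class, which is how current Scala compilers are understood to behave.

```scala
object DefaultArgDemo {
  // A method with a default argument in parameter position 1.
  def f(a: Int = 1): Int = a

  // Lists compiler-generated methods whose names contain "$default$".
  def defaultGetterNames: List[String] =
    getClass.getMethods.toList
      .map(_.getName)
      .filter(_.contains("$default$"))
      .distinct
      .sorted

  def main(args: Array[String]): Unit = {
    // Expected to include a getter named after f and the parameter position.
    defaultGetterNames.foreach(println)
    println(f())  // uses the generated default, so the result is 1
  }
}
```

Because the generated name depends only on the method name and the parameter position, two overloads of f that both give a default at position 1 would need the same getter name, which is the conflict the quote refers to.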
