Explode (transpose?) multiple columns in Spark SQL table

Spark >= 2.4: You can skip the zip UDF and use the arrays_zip function:

df.withColumn("vars", explode(arrays_zip($"varA", $"varB"))).select($"userId", $"someString", $"vars.varA", $"vars.varB").show

Spark < 2.4: What you want is not possible without a custom UDF. In Scala you could do something like this:

val data = sc.parallelize(Seq("""{"userId": 1, "someString": "example1", "varA": [0, 2, 5], "varB": [1, 2, …
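For readers who want to try the Spark >= 2.4 approach end to end, here is a minimal sketch. It assumes Spark 2.4 or later on the classpath and a local master. The object name ExplodeZipped, the helper explodeZipped, and the sample row are hypothetical and chosen only for illustration; they are not the data from the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{arrays_zip, col, explode}

object ExplodeZipped {
  // Zips varA and varB element by element into an array of structs,
  // then explodes that array so each index becomes its own row.
  def explodeZipped(df: DataFrame): DataFrame =
    df.withColumn("vars", explode(arrays_zip(col("varA"), col("varB"))))
      .select(col("userId"), col("someString"), col("vars.varA"), col("vars.varB"))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-zipped")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample row: two arrays of equal length.
    val df = Seq((7, "demo", Seq(10, 20), Seq(30, 40)))
      .toDF("userId", "someString", "varA", "varB")

    explodeZipped(df).show()
    spark.stop()
  }
}
```

With two arrays of length two, the result has two rows, one per array index, each carrying the userId and someString of the source row.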

Add a column in a table in HIVE QL

You cannot add a column with a default value in Hive. Your syntax for adding the column, ALTER TABLE test1 ADD COLUMNS (access_count1 int);, is right; you just need to get rid of default sum(max_count). Adding the column makes no changes to the files backing your table. Hive handles …