Explode the Array of Struct in Hive

You need to explode only once (in conjunction with LATERAL VIEW). After exploding you can use a new column (called prod_and_ts in my example) which will be of struct type. Then, you can resolve the product_id and timestamps members of this new struct column to retrieve the desired result. SELECT user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps … Read more
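To make the row-level effect of a single explode concrete, here is a minimal plain-Python sketch of the semantics, not Hive itself. The input rows and the array column name `products` are hypothetical; only `user_id`, `product_id`, `timestamps` and the alias `prod_and_ts` come from the answer above.

```python
# Hypothetical input: one row per user, each holding an array of structs.
rows = [
    {"user_id": 1, "products": [
        {"product_id": 101, "timestamps": "2024-01-01"},
        {"product_id": 102, "timestamps": "2024-01-02"},
    ]},
    {"user_id": 2, "products": [
        {"product_id": 201, "timestamps": "2024-01-03"},
    ]},
    {"user_id": 3, "products": []},  # empty array: yields no output rows
]


def lateral_view_explode(rows, array_col, alias):
    """Emit one output row per array element, keeping the other columns.

    Mimics a plain (non-OUTER) lateral view: rows whose array is empty
    produce nothing.
    """
    for row in rows:
        for element in row[array_col]:
            out = dict(row)        # carry the original columns along
            out[alias] = element   # new struct-typed column
            yield out


# Explode once, then resolve the struct members of the new column.
result = [
    {
        "user_id": r["user_id"],
        "product_id": r["prod_and_ts"]["product_id"],
        "timestamps": r["prod_and_ts"]["timestamps"],
    }
    for r in lateral_view_explode(rows, "products", "prod_and_ts")
]

for r in result:
    print(r)
```

Because each exploded element is a whole struct, both members stay paired on the same output row, which is why a second explode is not needed.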

hadoop.mapred vs hadoop.mapreduce?

They are separated because the two packages represent two different APIs. org.apache.hadoop.mapred is the older API and org.apache.hadoop.mapreduce is the new one. The split was made to let programmers write MapReduce jobs in a more convenient, easier and more sophisticated fashion. You might find this presentation useful, which talks about the differences in detail. … Read more

Is it better to use the mapred or the mapreduce package to create a Hadoop Job?

Functionality-wise, there is not much difference between the old (o.a.h.mapred) and the new (o.a.h.mapreduce) API. The only significant difference is that records are pushed to the mapper/reducer in the old API, while the new API supports both pull and push mechanisms. You can get more information about the pull mechanism here. Also, the old API has … Read more
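The push/pull distinction can be illustrated with a small plain-Python sketch. This is not the Hadoop API: every class and method name below is hypothetical, and the pull variant only loosely mirrors the idea that a new-API mapper has a run loop that iterates over a context and can be overridden.

```python
records = [(0, "a"), (1, "b"), (2, "c")]


class PushStyleMapper:
    """Push style: the framework calls map() once per record."""

    def map(self, key, value, collect):
        collect(value.upper(), 1)


def run_push(mapper, records):
    """The framework owns the loop; the mapper cannot skip or batch records."""
    output = []
    for key, value in records:
        mapper.map(key, value, lambda k, v: output.append((k, v)))
    return output


class RecordContext:
    """Hypothetical stand-in for a task context that hands out records."""

    def __init__(self, records):
        self._it = iter(records)
        self._current = None
        self.output = []

    def next_key_value(self):
        try:
            self._current = next(self._it)
            return True
        except StopIteration:
            self._current = None
            return False

    def current_key(self):
        return self._current[0]

    def current_value(self):
        return self._current[1]

    def write(self, key, value):
        self.output.append((key, value))


class PullStyleMapper:
    """Pull style: the mapper owns the loop and asks for the next record."""

    def map(self, key, value, context):
        context.write(value.upper(), 1)

    def run(self, context):
        # Default loop: behaves like the push style.
        while context.next_key_value():
            self.map(context.current_key(), context.current_value(), context)


class FirstTwoMapper(PullStyleMapper):
    """Overrides run() to stop pulling after two records."""

    def run(self, context):
        seen = 0
        while seen < 2 and context.next_key_value():
            self.map(context.current_key(), context.current_value(), context)
            seen += 1


push_output = run_push(PushStyleMapper(), records)

default_ctx = RecordContext(records)
PullStyleMapper().run(default_ctx)

limited_ctx = RecordContext(records)
FirstTwoMapper().run(limited_ctx)

print(push_output, default_ctx.output, limited_ctx.output)
```

With the default run loop the two styles produce the same output; the difference only shows once the mapper takes control of iteration, as the hypothetical FirstTwoMapper does.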

Simple Java Map/Reduce framework [closed]

Have you checked out Akka? While Akka is really a distributed Actor model based concurrency framework, you can implement a lot of things simply with little code. It’s just so easy to divide work into pieces with it, and it automatically takes full advantage of a multi-core machine, as well as being able to use … Read more

Is gzip format supported in Spark?

From the Spark Scala Programming guide’s section on “Hadoop Datasets”: Spark can create distributed datasets from any file stored in the Hadoop distributed file system (HDFS) or other storage systems supported by Hadoop (including your local file system, Amazon S3, Hypertable, HBase, etc). Spark supports text files, SequenceFiles, and any other Hadoop InputFormat. Support for … Read more
