Querying on multiple Hive stores using Apache Spark

I think this is possible using Spark SQL's ability to connect to remote databases and read data from them over JDBC. After extensive R&D, I was able to connect to two different Hive environments using JDBC and load the Hive tables as DataFrames into Spark for further processing. Environment details … Read more
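As a rough illustration of the JDBC approach described above, the sketch below builds reader options for two HiveServer2 endpoints and passes them to Spark's JDBC data source. The host names, table names, and both helper functions are hypothetical; the sketch also assumes the Hive JDBC driver jar is on Spark's classpath and that HiveServer2 listens on its default port 10000.

```python
def hive_jdbc_options(host, database, table, port=10000, user="", password=""):
    """Build the option dict for Spark's JDBC reader against one HiveServer2."""
    return {
        "url": f"jdbc:hive2://{host}:{port}/{database}",
        "driver": "org.apache.hive.jdbc.HiveDriver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def load_hive_table(spark, options):
    """Load one remote Hive table as a DataFrame through the JDBC data source."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical hosts and tables, one per Hive environment.
options_a = hive_jdbc_options("hive-a.example.com", "default", "bank")
options_b = hive_jdbc_options("hive-b.example.com", "default", "customers")
```

Given an existing SparkSession named spark, calling load_hive_table(spark, options_a) and load_hive_table(spark, options_b) would give two DataFrames, one per environment, that can then be joined or processed together.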

Query HIVE table in pyspark

We cannot pass the Hive table name directly to the Hive context sql method, since it doesn't understand the Hive table name. One way to read a Hive table in the pyspark shell is:

    from pyspark.sql import HiveContext
    hive_context = HiveContext(sc)
    bank = hive_context.table("default.bank")
    bank.show()

To run SQL on the Hive table: First, we need to register … Read more
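The excerpt is cut off at the registration step, so the following is only a guess at the general pattern, not the author's exact code. It assumes the table is registered as a temporary view with createOrReplaceTempView (available on DataFrames since Spark 2.0) and then queried by that view name. The function name query_hive_table and the view name are hypothetical.

```python
def query_hive_table(ctx, table_name, view_name, query):
    """Read a Hive table, expose it under a temporary view name, and run SQL on it.

    ctx is expected to be a HiveContext (or a SparkSession with Hive support);
    both provide table() and sql().
    """
    df = ctx.table(table_name)             # e.g. "default.bank"
    df.createOrReplaceTempView(view_name)  # assumption: temp-view registration
    return ctx.sql(query)                  # the query refers to view_name
```

In the pyspark shell this could be called as query_hive_table(hive_context, "default.bank", "bank", "SELECT * FROM bank LIMIT 10").show().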

Hive Alter table change Column Name

Change Column Name/Type/Position/Comment:

    ALTER TABLE table_name CHANGE [COLUMN] col_old_name col_new_name column_type [COMMENT col_comment] [FIRST|AFTER column_name]

Example:

    CREATE TABLE test_change (a int, b int, c int);
    -- will change column a's name to a1
    ALTER TABLE test_change CHANGE a a1 INT;
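To make the optional clauses in the syntax above concrete, here is a small sketch that assembles the statement as a string, for example before handing it to a Hive client or to a Spark sql call. The helper alter_change_column is hypothetical and does no validation or escaping of identifiers or comment text.

```python
def alter_change_column(table, old_name, new_name, column_type,
                        comment=None, first=False, after=None):
    """Build an ALTER TABLE ... CHANGE COLUMN statement from its parts."""
    if first and after:
        # The grammar allows FIRST or AFTER column_name, not both.
        raise ValueError("use either first=True or after=<column>, not both")
    parts = [f"ALTER TABLE {table} CHANGE COLUMN {old_name} {new_name} {column_type}"]
    if comment is not None:
        parts.append(f"COMMENT '{comment}'")  # inserted as-is, no escaping
    if first:
        parts.append("FIRST")
    elif after:
        parts.append(f"AFTER {after}")
    return " ".join(parts)


# Rename column a to a1, as in the example above.
rename_stmt = alter_change_column("test_change", "a", "a1", "INT")
# Rename a1 to a2, change its type, and move it after column b.
move_stmt = alter_change_column("test_change", "a1", "a2", "STRING", after="b")
```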

What are HiveServer and the Thrift server? [closed]

HiveServer2 (HS2) is a service that enables clients to execute queries against Hive. HiveServer2 is the successor to HiveServer1, which has been deprecated. HS2 supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC. You can find more details about HiveServer2 at https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Overview Hive Service … Read more
