hive – Tarik Billa

How to update partition metadata in Hive when partition data is manually deleted from HDFS

April 9, 2024 by Tarik

EDIT: Starting with Hive 3.0.0, MSCK can discover new partitions or remove missing partitions (or both) using the following syntax: MSCK [REPAIR] TABLE table_name [ADD/DROP/SYNC PARTITIONS] This was implemented in HIVE-17824. As correctly stated by HakkiBuyukcengiz, MSCK REPAIR doesn’t remove partitions if the corresponding folder on HDFS was manually deleted; it only …
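As a minimal sketch of the syntax quoted above, the following Python helper composes the MSCK statement for each of the three partition modes. The function name build_msck_statement and the identifier check are hypothetical, introduced only for illustration; the sketch builds the statement text and does not talk to Hive.

```python
import re

# The three modes named in the MSCK syntax quoted in the post.
VALID_MODES = ("ADD", "DROP", "SYNC")

# Hypothetical safety check: a plain identifier, optionally qualified as db.table.
TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_msck_statement(table_name, mode):
    """Return 'MSCK REPAIR TABLE <table> <mode> PARTITIONS' as a string."""
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not TABLE_NAME.match(table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"MSCK REPAIR TABLE {table_name} {mode} PARTITIONS"
```

For example, build_msck_statement("sales", "DROP") yields the statement that removes metastore entries for partitions whose folders are gone, and "SYNC" covers both adding and dropping. The table name sales is a placeholder.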

Transferring hive table from one database to another

December 19, 2023 by Tarik

Since Hive 0.14, you can use the following statements to move a table from one database to another in the same metastore: use old_database; alter table table_a rename to new_database.table_a; The above statements will also move the table data on HDFS if table_a is a managed table.

How to calculate median in Hive

December 9, 2023 by Tarik

You can use the percentile function to compute the median. Try this: select percentile(cast(age as BIGINT), 0.5) from table_name
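To see why the 0.5 percentile gives the median, here is a small local sketch in Python. It assumes a percentile defined by linear interpolation between the two nearest ranks; the helper name percentile and the sample ages are made up for illustration, and this is not Hive's implementation.

```python
import math
import statistics


def percentile(values, p):
    """Exact percentile of values for 0 <= p <= 1, interpolating linearly."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    ordered = sorted(values)
    # Fractional rank of the requested percentile in the sorted list.
    position = p * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


ages = [23, 35, 41, 52]  # hypothetical sample data
median_age = percentile(ages, 0.5)
# Under this definition the result matches the conventional median.
assert median_age == statistics.median(ages)
```

With an even number of rows the result falls halfway between the two middle values, and with an odd number it is the middle value itself.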

Is there a way to alter column type in hive table?

December 7, 2023 by Tarik

Found the solution: ALTER TABLE tableA CHANGE ts ts BIGINT AFTER id; See this for complete details: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterColumn

Query HIVE table in pyspark

September 19, 2023 by Tarik

We cannot pass the Hive table name directly to the HiveContext sql method, since it doesn’t understand the Hive table name. One way to read a Hive table in the pyspark shell is: from pyspark.sql import HiveContext hive_context = HiveContext(sc) bank = hive_context.table("default.bank") bank.show() To run SQL on the Hive table: First, we need to register …

java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

September 18, 2023 by Tarik

This looks like a problem with your metastore. If you are using the default embedded Derby metastore, a lock file is left behind after an abnormal exit. Removing that lock file should resolve the issue: rm metastore_db/*.lck
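The same cleanup can be sketched in Python for cases where it needs to run from a script. The function name remove_derby_lock_files is hypothetical, and the sketch assumes that no Hive process is currently using the metastore directory.

```python
from pathlib import Path


def remove_derby_lock_files(metastore_dir="metastore_db"):
    """Delete *.lck files directly inside metastore_dir and return their paths.

    Mirrors `rm metastore_db/*.lck`. Only run this when no Hive session
    is using the embedded metastore (an assumption of this sketch).
    """
    directory = Path(metastore_dir)
    if not directory.is_dir():
        return []
    removed = []
    for lock_file in sorted(directory.glob("*.lck")):
        lock_file.unlink()
        removed.append(lock_file)
    return removed
```

Calling remove_derby_lock_files() from the directory where Hive was started returns the list of deleted lock files, or an empty list if the directory does not exist.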

Is there a way to make a multi line comment in hive scripts

September 1, 2023 by Tarik

As far as I know, multi-line comments are not supported in Hive scripts as of now. It seems this JIRA introduced only single-line comments, starting with --, in Hive 0.8.

Hive Alter table change Column Name

August 26, 2023 by Tarik

Change Column Name/Type/Position/Comment: ALTER TABLE table_name CHANGE [COLUMN] col_old_name col_new_name column_type [COMMENT col_comment] [FIRST|AFTER column_name] Example: CREATE TABLE test_change (a int, b int, c int); -- will change column a's name to a1 ALTER TABLE test_change CHANGE a a1 INT;

Create hive table using “as select” or “like” and also specify delimiter

May 6, 2023 by Tarik

Create Table As Select (CTAS) is possible in Hive. You can try the command below: CREATE TABLE new_test row format delimited fields terminated by '|' STORED AS RCFile AS select * from source where col=1; The target cannot be a partitioned table. The target cannot be an external table. It copies the structure as well as the data. Create …

How to skip CSV header in Hive External Table?

March 22, 2023 by Tarik

As of Hive v0.13.0, you can use the skip.header.line.count table property: create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' TBLPROPERTIES ("skip.header.line.count"="1"); Use ALTER TABLE for an existing table: ALTER TABLE tablename SET TBLPROPERTIES ("skip.header.line.count"="1"); Please note that while it works, it comes with …
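The effect of the property on a single file can be pictured with a small local sketch in Python. This is only an illustration under the assumption that the first N lines of the file are dropped before rows are parsed; the function name read_rows and the sample text are hypothetical, and nothing here calls Hive.

```python
import csv


def read_rows(file_text, skip_header_line_count=0, delimiter="\t"):
    """Parse delimited text, ignoring the first skip_header_line_count lines."""
    lines = file_text.splitlines()
    # Drop the header lines, then split the remaining lines into fields.
    data_lines = lines[skip_header_line_count:]
    return list(csv.reader(data_lines, delimiter=delimiter))


sample = "name\tmessage\nalice\thello\nbob\thi\n"  # made-up file content
rows = read_rows(sample, skip_header_line_count=1)
```

With a count of 1 the header line name/message is not returned as a data row; with the default of 0 it would appear as the first row.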