What is Hive? Is it a database? [closed]

Hive is a data warehousing package/infrastructure built on top of Hadoop. It provides an SQL dialect called Hive Query Language (HQL) for querying data stored in a Hadoop cluster. Like all SQL dialects in widespread use, HQL doesn’t fully conform to any particular revision of the ANSI SQL standard. It is perhaps closest to MySQL’s …

Get output from scans in hbase shell

I know that this post is quite old, but I was searching for something about HBase myself and came across it. I don’t know if this is the best way to do it, but you can definitely use the scripting option HBase gives you. Just open a shell (preferably go to the directory bin …

How Do You Rename a Table in HBase?

To rename a table in HBase, apparently you have to use snapshots. So, you take a snapshot of the table and then clone it under a different name. In the HBase shell:

    disable 'tableName'
    snapshot 'tableName', 'tableSnapshot'
    clone_snapshot 'tableSnapshot', 'newTableName'
    delete_snapshot 'tableSnapshot'
    drop 'tableName'

Source: http://hbase.apache.org/book.html#table.rename

HBase REST Filter ( SingleColumnValueFilter )

Filter fields in the Scanner XML are strings formatted as JSON. Since the JSON for the filter has many quotes in it, I recommend putting the request body in a separate file and passing it to curl’s -d parameter, to avoid having to quote it on the command line:

    curl -v -H "Content-Type:text/xml" -d @args.txt http://hbasegw:8080/table/scanner

where the file args.txt is:

    <Scanner startRow="cm93MDE=" endRow="cm93MDg=" batch="1024">
    <filter>
    { …
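The startRow and endRow values in the Scanner XML are Base64-encoded row keys: cm93MDE= decodes to row01 and cm93MDg= to row08. A minimal sketch of producing such values with the Java standard library follows. The class name RowKeyEncoder and its methods are hypothetical helpers written for illustration, not part of HBase, and the sketch assumes the row keys are plain UTF-8 strings.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not an HBase class): converts row keys to and from
// the Base64 form used in the Scanner XML attributes.
final class RowKeyEncoder {

    // Base64-encode the UTF-8 bytes of a row key.
    static String encode(String rowKey) {
        return Base64.getEncoder()
                .encodeToString(rowKey.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a Base64 attribute value back into a readable row key.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("row01"));    // prints cm93MDE=
        System.out.println(encode("row08"));    // prints cm93MDg=
        System.out.println(decode("cm93MDE=")); // prints row01
    }
}
```

If your row keys are binary rather than text, you would encode the raw bytes directly instead of going through a String.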

Scan HTable rows for specific column value using HBase shell

It is possible without Hive:

    scan 'filemetadata', { COLUMNS => 'colFam:colQualifier', LIMIT => 10, FILTER => "ValueFilter( =, 'binaryprefix:<someValue.e.g. test1 AsDefinedInQuestion>' )" }

Note: to find all rows that have test1 as the value, as specified in the question, use binaryprefix:test1 in the filter. Keep in mind that binaryprefix matches any value that starts with test1 (see this answer for more examples).

How to connect to remote HBase in Java?

Here’s a snippet from a system we use, showing how we create the HTable we use to connect to HBase:

    Configuration hConf = HBaseConfiguration.create(conf);
    hConf.set(Constants.HBASE_CONFIGURATION_ZOOKEEPER_QUORUM, hbaseZookeeperQuorum);
    hConf.setInt(Constants.HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT, hbaseZookeeperClientPort);
    HTable hTable = new HTable(hConf, tableName);

HTH

EDIT: Example values:

    public static final String HBASE_CONFIGURATION_ZOOKEEPER_QUORUM = "hbase.zookeeper.quorum";
    public static final String HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT = "hbase.zookeeper.property.clientPort";
    …
    hbaseZookeeperQuorum="PDHadoop1.corp.CompanyName.com,PDHadoop2.corp.CompanyName.com";
    hbaseZookeeperClientPort=10000;
    tableName="HBaseTableName";

Why HBase is a better choice than Cassandra with Hadoop?

I don’t think either is better than the other; it’s not a matter of one or the other. These are very different systems, each with its own strengths and weaknesses, so it really depends on your use cases. They can definitely be used to complement one another in the same infrastructure. To explain the difference better I’d …

Scan with filter using HBase shell

Try this. It’s kind of ugly, but it works for me.

    import org.apache.hadoop.hbase.filter.CompareFilter
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
    import org.apache.hadoop.hbase.filter.SubstringComparator
    import org.apache.hadoop.hbase.util.Bytes

    scan 't1', { COLUMNS => 'family:qualifier', FILTER =>
        SingleColumnValueFilter.new(Bytes.toBytes('family'),
                                    Bytes.toBytes('qualifier'),
                                    CompareFilter::CompareOp.valueOf('EQUAL'),
                                    SubstringComparator.new('somevalue')) }

The HBase shell will include whatever you have in ~/.irbrc, so you can put something like this in there (I’m no Ruby …