How do I access an HBase table in Hive, and vice versa?

Time: 2023-02-03 23:10:00

As a developer, I created an HBase table for our project by importing data from an existing MySQL table with a Sqoop job. Our data analysts are familiar with MySQL syntax, which means they can query Hive tables easily, so I need to expose the HBase table in Hive for them. I don't want to duplicate the data by loading it into Hive again, since duplicated data could lead to consistency issues later.

Can I expose an HBase table in Hive without duplicating the data? If so, how? Also, if I insert, update, or delete data in the HBase table, will the changes appear in Hive without any issues?

Sometimes our data analytics team creates tables and populates them in Hive. Can I expose those tables to HBase? If so, how?

1 Solution

#1


HBase-Hive Integration:

Creating an external table in Hive over an HBase table lets you query the HBase data from Hive without duplicating it. When you insert, update, or delete data in the HBase table, the changes are visible in Hive as well, because Hive reads from HBase at query time.

Example:

Suppose you have an HBase table with the column families id, name, and email.

Sample external table command for Hive:

CREATE EXTERNAL TABLE hivehbasetable (
  key INT,
  id INT,
  username STRING,
  password STRING,
  email STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,id:id,name:username,name:password,email:email"
)
TBLPROPERTIES ("hbase.table.name" = "hbasetable");
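Once the external table exists, analysts can query it like any other Hive table. Here is a minimal sketch, assuming the hivehbasetable definition above; the key value 42 is only a placeholder for illustration. Each query reads from HBase when it runs, so a row changed in HBase should show up the next time the query is executed.

```sql
-- Look up a single row by its HBase row key (42 is a hypothetical key).
SELECT key, username, email
FROM hivehbasetable
WHERE key = 42;

-- Aggregate over the HBase-backed table with ordinary HiveQL.
SELECT COUNT(*) AS total_rows
FROM hivehbasetable;
```

Note that each entry in "hbase.columns.mapping" corresponds, in order, to one column in the Hive column list, and ":key" maps the HBase row key.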

For more information on Hive-HBase integration, see the Apache Hive documentation on HBase integration.
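For the reverse direction (data that already lives in a Hive table), one possible approach is to define an HBase-backed table in Hive with the same storage handler and copy the rows into it with an INSERT ... SELECT. This is only a sketch: analyst_table, hive_to_hbase, hbase_export, and the column family names are hypothetical. Also note that, unlike the external table approach, this does write a copy of the data into HBase. Without the EXTERNAL keyword, Hive creates the HBase table itself; some newer Hive versions may require EXTERNAL for HBase-backed tables, so check the behavior of your version.

```sql
-- Hypothetical names throughout: analyst_table is an existing native Hive table,
-- hbase_export is the HBase table that Hive will create and fill.
CREATE TABLE hive_to_hbase (
  key INT,
  username STRING,
  email STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,name:username,email:email"
)
TBLPROPERTIES ("hbase.table.name" = "hbase_export");

-- Copy rows from the Hive table into HBase; the first selected column becomes the row key.
INSERT OVERWRITE TABLE hive_to_hbase
SELECT id, username, email
FROM analyst_table;
```

After the insert, the rows can be read from HBase directly (for example with a scan in the HBase shell) as well as from Hive through hive_to_hbase.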
