As a developer, I've created an HBase table for our project by importing data from an existing MySQL table using a Sqoop job. The problem is that our data analyst team is familiar with MySQL syntax, which means they can query Hive tables easily. For them, I need to expose the HBase table in Hive. I don't want to duplicate the data by populating it again in Hive, since duplicated data could also lead to consistency issues in the future.
Can I expose an HBase table in Hive without duplicating data? If yes, how do I do it? Also, if I insert, update, or delete data in my HBase table, will the updated data appear in Hive without any issues?
Sometimes our data analytics team creates tables and populates data in Hive. Can I expose those tables to HBase? If yes, how?
1 solution
#1
HBase-Hive Integration:
Creating an external table in Hive for an HBase table lets you query the HBase data from Hive without duplicating it. Because Hive reads from HBase at query time, you can update or delete data in the HBase table and see the modified data in Hive too.
Example:
Consider an HBase table with the column families id, name and email.
Sample external table command for Hive:
CREATE EXTERNAL TABLE hivehbasetable(key INT, id INT, username STRING, password STRING, email STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:username,name:password,email:email")
TBLPROPERTIES("hbase.table.name" = "hbasetable");
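Once the external table exists, your analysts should be able to query it with ordinary HiveQL. A minimal sketch, assuming the hivehbasetable definition above. The row key value 42 is made up for illustration:

```sql
-- Look up a single row by its HBase row key (42 is a hypothetical key value).
SELECT key, username, email
FROM hivehbasetable
WHERE key = 42;

-- Aggregate over the whole table; Hive reads the rows from HBase at query time.
SELECT COUNT(*) FROM hivehbasetable;
```

Since nothing is copied into Hive, these queries reflect whatever is in the HBase table at the moment they run.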
For more information on Hive-HBase integration, look here.
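For the other direction (tables your analysts created in Hive), one possible approach is to map an HBase table into Hive the same way and then insert into it from the Hive table. This is only a sketch under assumptions: the names analyst_users, analyst_users_hbase, analyst_users_hbase_view and the column family info are hypothetical, the HBase table is assumed to exist already, and the exact behaviour of inserts depends on your Hive version. Note that, unlike the read-only mapping above, this does copy the data into HBase.

```sql
-- Assumption: an HBase table "analyst_users_hbase" with a column family
-- "info" already exists. Both names are hypothetical.
CREATE EXTERNAL TABLE analyst_users_hbase_view(key INT, username STRING, email STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:username,info:email")
TBLPROPERTIES("hbase.table.name" = "analyst_users_hbase");

-- Write the rows of an ordinary Hive table (hypothetical name: analyst_users)
-- into HBase through the mapped table; the id column becomes the row key.
INSERT INTO TABLE analyst_users_hbase_view
SELECT id, username, email
FROM analyst_users;
```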