I have two clusters in Azure, one for Spark 2.0 and another for HBase. I can populate and use HBase without any problems; now I want to connect Spark to HBase and run some calculations. I'd like to use Python, since I don't know Scala.
My first question is: what's the best way to do it? Spark -> Hive -> HBase, or Spark directly to HBase?
Could I have some good examples for inserting and querying?
Thanks!
1 Answer
#1
0
If you're OK with going from Spark to HBase directly via the Spark-to-HBase connector, Zhan Zhang has a pretty good blog post at http://hortonworks.com/blog/spark-hbase-dataframe-based-hbase-connector/ that provides examples.
HTH!
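Since the question asks for Python examples, here is a minimal PySpark sketch in the spirit of that connector's catalog-based examples. The table name `Contacts`, the `Personal` column family, the column names, and the `shc-core` package version are all illustrative assumptions; adjust them for your own cluster.

```python
import json

# Catalog describing how DataFrame columns map onto an HBase table:
# a row key plus two columns in the "Personal" column family
# (all names here are illustrative).
catalog = json.dumps({
    "table": {"namespace": "default", "name": "Contacts"},
    "rowkey": "key",
    "columns": {
        "key":   {"cf": "rowkey",   "col": "key",   "type": "string"},
        "name":  {"cf": "Personal", "col": "name",  "type": "string"},
        "phone": {"cf": "Personal", "col": "phone", "type": "string"},
    },
})


def run(spark):
    """Write two rows to HBase, then read them back and filter.

    Requires the shc connector jar on the Spark classpath, e.g.
      spark-submit --packages com.hortonworks:shc-core:1.1.1-2.1-s_2.11 ...
    (the version is an example; pick one matching your Spark build).
    """
    # Insert: write a DataFrame into the HBase table described by the catalog.
    df = spark.createDataFrame(
        [("row1", "Alice", "555-0100"), ("row2", "Bob", "555-0101")],
        ["key", "name", "phone"],
    )
    (df.write
       .options(catalog=catalog, newtable="5")
       .format("org.apache.spark.sql.execution.datasources.hbase")
       .save())

    # Query: load the table back as a DataFrame and use ordinary Spark SQL.
    result = (spark.read
              .options(catalog=catalog)
              .format("org.apache.spark.sql.execution.datasources.hbase")
              .load())
    result.filter(result.name == "Alice").show()
```

Pass in a `SparkSession` created on a cluster that can reach HBase; the filtering at the end runs as a normal DataFrame query, so once the table is loaded you can do any Spark calculation on it.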