I have two clusters in Azure, one for Spark 2.0 and another for HBase. I can populate and use HBase without any problems; now I want to connect Spark to HBase and run some calculations. I'd like to use Python, since I don't know Scala.
My first question is: what's the best way to do it? Spark -> Hive -> HBase, or Spark directly to HBase?
Could I have some good examples for inserting and querying?
Thanks!
1 Answer
#1
0
If you're OK with going from Spark to HBase directly via the Spark-to-HBase connector, Zhan Zhang has a pretty good blog post at http://hortonworks.com/blog/spark-hbase-dataframe-based-hbase-connector/ that provides examples.
HTH!
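Since the question asks for Python examples, here is a minimal PySpark sketch in the spirit of that connector's catalog-based examples. The table name `Contacts`, the `Personal` column family, the column names, and the `shc-core` package version are all illustrative assumptions; adjust them for your own cluster.

```python
import json

# Catalog describing how DataFrame columns map onto an HBase table:
# a row key plus two columns in the "Personal" column family
# (all names here are illustrative).
catalog = json.dumps({
    "table": {"namespace": "default", "name": "Contacts"},
    "rowkey": "key",
    "columns": {
        "key":   {"cf": "rowkey",   "col": "key",   "type": "string"},
        "name":  {"cf": "Personal", "col": "name",  "type": "string"},
        "phone": {"cf": "Personal", "col": "phone", "type": "string"},
    },
})


def run(spark):
    """Write two rows to HBase, then read them back and filter.

    Requires the shc connector jar on the Spark classpath, e.g.
      spark-submit --packages com.hortonworks:shc-core:1.1.1-2.1-s_2.11 ...
    (the version is an example; pick one matching your Spark build).
    """
    # Insert: write a DataFrame into the HBase table described by the catalog.
    df = spark.createDataFrame(
        [("row1", "Alice", "555-0100"), ("row2", "Bob", "555-0101")],
        ["key", "name", "phone"],
    )
    (df.write
       .options(catalog=catalog, newtable="5")
       .format("org.apache.spark.sql.execution.datasources.hbase")
       .save())

    # Query: load the table back as a DataFrame and use ordinary Spark SQL.
    result = (spark.read
              .options(catalog=catalog)
              .format("org.apache.spark.sql.execution.datasources.hbase")
              .load())
    result.filter(result.name == "Alice").show()
```

Pass in a `SparkSession` created on a cluster that can reach HBase; the filtering at the end runs as a normal DataFrame query, so once the table is loaded you can do any Spark calculation on it.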