Is it possible to run Hive on Spark in the Eclipse IDE?

Time: 2022-05-07 23:12:50

I've been trying to develop a Spark program using the Apache Spark Framework.
I want to instantiate HiveContext without any clusters.
Is it possible to use HiveContext and run it locally via the Eclipse Scala IDE without using any cluster?


1 solution

#1



Is it simply possible? Sure... (emphasis added)


To use a HiveContext, you do not need to have an existing Hive setup, and all of the data sources available to a SQLContext are still available.

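As a rough sketch of what that looks like in practice (assuming a Spark 1.x build with the spark-hive module on the classpath; the object and table names below are purely illustrative), a HiveContext can be created on top of a purely local SparkContext:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object LocalHiveExample {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark in-process using all cores of this machine; no cluster involved
    val conf = new SparkConf().setAppName("LocalHiveExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)

    // On first use this creates an embedded (Derby-backed) metastore and a local
    // warehouse directory under the working directory, so no Hive installation is needed
    hiveContext.sql("CREATE TABLE IF NOT EXISTS demo (key INT, value STRING)")
    hiveContext.sql("SHOW TABLES").show()

    sc.stop()
  }
}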

However, you need to compile some additional code.


HiveContext is only packaged separately to avoid including all of Hive’s dependencies in the default Spark build. If these dependencies are not a problem for your application then using HiveContext is recommended.

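In practice that just means adding Spark's Hive module to your build. A minimal build.sbt sketch (spark-hive is the real module name; the version numbers are placeholders to align with your own Spark and Scala versions):

// build.sbt -- versions are illustrative; match them to your installed Spark and Scala
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.3",
  "org.apache.spark" %% "spark-sql"  % "1.6.3",
  "org.apache.spark" %% "spark-hive" % "1.6.3"  // brings in HiveContext and Hive's dependencies
)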

But if you are just writing Spark without any cluster, there is nothing holding you to Spark 1.x; you should instead be using Spark 2.x, which has SparkSession as the entry point for SQL-related work.

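For reference, a minimal Spark 2.x sketch (again purely local; enableHiveSupport still assumes the spark-hive module is on the classpath):

import org.apache.spark.sql.SparkSession

object LocalSparkSessionExample {
  def main(args: Array[String]): Unit = {
    // In Spark 2.x, SparkSession replaces SQLContext/HiveContext as the single entry point
    val spark = SparkSession.builder()
      .appName("LocalSparkSessionExample")
      .master("local[*]")      // run in-process, no cluster
      .enableHiveSupport()     // optional; only needed for Hive features
      .getOrCreate()

    spark.sql("SHOW DATABASES").show()
    spark.stop()
  }
}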


The Eclipse IDE shouldn't matter. You could also use IntelliJ... or no IDE at all, and spark-submit any JAR file containing some Spark code...
