How to read only a few columns of Elasticsearch via Spark?

Time: 2022-09-14 15:20:38

Our Elasticsearch cluster holds large-scale data, and we use Spark to compute over it via elasticsearch-hadoop, following https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html

Right now we have to read every column of an index. Is there anything that can help us read only the columns we need? For context, a minimal sketch of the kind of full read we do today is shown below (assuming the elasticsearch-spark SQL connector is on the classpath; the host and index name are hypothetical placeholders).
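import org.apache.spark.sql.SparkSession

// A hypothetical full read: every field of the index is fetched
val spark = SparkSession.builder()
  .appName("fullRead")                               // hypothetical app name
  .config("es.nodes", "elastichostc1n1.example.com") // hypothetical ES host
  .getOrCreate()

// "es" is the Spark SQL data source registered by elasticsearch-hadoop
val fullDf = spark.read.format("es").load("my-index") // hypothetical index name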

1 Solution

#1



Yes, you can set the configuration parameter "es.read.field.include" or "es.read.field.exclude", respectively. Full details are in the elasticsearch-hadoop configuration documentation. Example, assuming Spark 2 or higher:

import org.apache.spark.sql.SparkSession

// Only the "foo" and "bar" fields will be fetched from Elasticsearch
val sparkSession: SparkSession = SparkSession
  .builder()
  .appName("jobName")
  .config("es.nodes", "elastichostc1n1.example.com") // Elasticsearch node
  .config("es.read.field.include", "foo,bar")        // fields to read; use es.read.field.exclude to drop fields instead
  .getOrCreate()
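
A usage sketch follows, assuming the elasticsearch-spark SQL connector is on the classpath and a hypothetical index name my-index; given the include setting above, only foo and bar should come back:

// Read through the "es" data source registered by elasticsearch-hadoop;
// the include filter configured on the session applies to the result
val df = sparkSession.read.format("es").load("my-index") // hypothetical index name
df.printSchema() // expect only foo and bar in the schema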
