How to read a textFile keyed on the first column of another textFile [duplicate]

Asked: 2021-07-18 20:52:22

This question is an exact duplicate of:

I have two text files, A and B.

I want to read file A into Spark, and then read file B as well, but keyed on file A's first column.

  1. Table A

     id  column
     1   anu
     2   usha

  2. Table B

     id  age
     1   23
     2   23

I want to load the data using the first column of table A as the key, like a foreign key referencing a primary key. Can anyone please guide me?

1 solution

#1



From the gist of your question, it seems you are trying to join two text files. You can use DataFrames to join these two files.

// Read each line as a single string column, then split on the comma
// into a (key, value) pair.
val table1 = sqlContext.read.format("text").load("/user/chlr/spark_test/table1.csv")
  .map(x => x.getString(0).split(',').head -> x.getString(0).split(',').tail.head)
  .toDF("id", "column")

val table2 = sqlContext.read.format("text").load("/user/chlr/spark_test/table2.csv")
  .map(x => x.getString(0).split(',').head -> x.getString(0).split(',').tail.head)
  .toDF("id", "age")

// Join on the shared "id" key.
table1.join(table2, "id").show()
+---+------+---+
| id|column|age|
+---+------+---+
|  1|   anu| 23|
|  2|  usha| 23|
+---+------+---+
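As a side note, since the files are comma-separated, the same join can be sketched more directly with Spark's built-in CSV reader instead of splitting lines by hand. This is a sketch, assuming Spark 2.x where a `SparkSession` named `spark` is available; the paths and column names are taken from the question.

```scala
// Sketch, assuming Spark 2.x: read the comma-separated files as CSV
// and rename the columns to match the question's tables.
val tableA = spark.read
  .option("inferSchema", "true")
  .csv("/user/chlr/spark_test/table1.csv")
  .toDF("id", "column")

val tableB = spark.read
  .option("inferSchema", "true")
  .csv("/user/chlr/spark_test/table2.csv")
  .toDF("id", "age")

// An inner join keeps only the ids present in both files,
// mirroring a primary-key / foreign-key lookup.
tableA.join(tableB, "id").show()
```

The CSV reader handles the splitting and type inference for you, so each line is parsed only once.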
