This question is an exact duplicate of:
- Read a Text File Based on key as a Column from another textFile (1 answer)
I have two text files, A and B. I want to read file A into Spark, and then read file B as well, but the read depends on a key column.
Table A:

id  column
1   anu
2   usha

Table B:

id  age
1   23
2   23
I want to load the data using the first column of table A as the key, like a foreign key / primary key relationship. Can anyone please guide me?
1 Answer
From the gist of your question, it seems you are trying to join two text files. You can use DataFrames to join them:
val table1 = sqlContext.read.format("text")
  .load("/user/chlr/spark_test/table1.csv")
  .map { x =>
    val fields = x.getString(0).split(',') // split the raw line once
    fields(0) -> fields(1)
  }
  .toDF("id", "column")

val table2 = sqlContext.read.format("text")
  .load("/user/chlr/spark_test/table2.csv")
  .map { x =>
    val fields = x.getString(0).split(',')
    fields(0) -> fields(1)
  }
  .toDF("id", "column")
table1.join(table2, "id").show()
+---+------+------+
| id|column|column|
+---+------+------+
| 1| anu| 23|
| 2| usha| 23|
+---+------+------+
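The key-based join above can be illustrated with plain Scala collections, no Spark cluster needed. This is a minimal sketch using the same data and the same split-by-comma parsing as the answer; the sequences stand in for the two text files:

```scala
// Table A lines: "id,name" — parse each line by splitting on the comma
val tableA = Seq("1,anu", "2,usha")
  .map(_.split(','))
  .map(a => a(0) -> a(1))               // (id, name) pairs

// Table B lines: "id,age" — build a lookup map keyed by id
val tableB = Seq("1,23", "2,23")
  .map(_.split(','))
  .map(a => a(0) -> a(1))
  .toMap                                 // id -> age

// Inner join on id, analogous to table1.join(table2, "id"):
// keep only table A rows whose id also appears in table B
val joined = tableA.collect {
  case (id, name) if tableB.contains(id) => (id, name, tableB(id))
}

joined.foreach(println)
```

This mirrors what the DataFrame join does conceptually: rows from the first table are matched to rows from the second by the shared id key, and unmatched ids are dropped.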