Currently I have a structure like this: Array[(Int, Array[(String, Int)])], and I want to use reduceByKey on the Array[(String, Int)] that sits inside each tuple of the outer array. I tried code like this:
//data is in Array[(Int, Array[(String, Int)])] structure
val result = data.map(l => (l._1, l._2.reduceByKey(_ + _)))
The error says that Array[(String, Int)] has no method called reduceByKey, and I understand that this method can only be used on an RDD. So my question is: is there any way to get reduceByKey-like behavior in the nested structure? It doesn't need to be exactly this method.
Thanks guys.
1 solution
#1
Simply use the ordinary collection methods here, such as Array's reduce or fold family, since at that point you are working with an Array and not an RDD (assuming you really meant the outer wrapper to be an RDD):
// Assumes a SparkContext named sc is in scope (e.g. in spark-shell).
val data = sc.parallelize(List((1, List(("foo", 1), ("foo", 1)))))
data.map { case (id, pairs) =>
  // Fold the inner list locally, summing the values that share a key.
  (id, pairs.foldLeft(List[(String, Int)]()) { (accum, curr) =>
    val accumAsMap = accum.toMap
    accumAsMap.get(curr._1) match {
      case Some(value) => (accumAsMap + (curr._1 -> (value + curr._2))).toList
      case None        => curr :: accum
    }
  })
}.collect
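If the outer structure is a plain Array, as the question literally states, the same per-key sum can probably be written more compactly with groupBy. The following is only a sketch under that assumption: the names NestedReduce and reduceByKeyLocal are made up for illustration, and the order of keys in each inner result is not guaranteed.

```scala
object NestedReduce {
  // Hypothetical helper: sums the Int values per String key inside one
  // inner array, similar in effect to reduceByKey(_ + _) on an RDD.
  def reduceByKeyLocal(pairs: Array[(String, Int)]): Array[(String, Int)] =
    pairs
      .groupBy(_._1)                                          // group the pairs by their key
      .map { case (key, group) => (key, group.map(_._2).sum) } // sum the values of each group
      .toArray

  def main(args: Array[String]): Unit = {
    // Sample data in the shape described in the question.
    val data: Array[(Int, Array[(String, Int)])] = Array(
      (1, Array(("foo", 1), ("foo", 1), ("bar", 3))),
      (2, Array(("baz", 5)))
    )
    // Apply the local reduction to the inner array of every outer tuple.
    val result = data.map { case (id, pairs) => (id, reduceByKeyLocal(pairs)) }
    result.foreach { case (id, pairs) =>
      println(s"$id -> ${pairs.sorted.mkString(", ")}")
    }
  }
}
```

The same function body should also work inside an RDD's map if the outer wrapper is an RDD, because the inner arrays are ordinary local collections at that point.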
Ultimately, it seems that you do not understand what an RDD is, so you might want to read some of the docs on them.