When writing Scala spark code, if I want to add together two collections, I can simply write
myRdd.reduceByKey(_ ++ _)
If I want to do the same in Java, however, I have to do
myPairRdd.reduceByKey((s1, s2) -> {
    s1.addAll(s2);
    return s1;
});
I was wondering if there was a more concise way of writing the Java code.
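The closest I have come is collapsing the merge into a method reference, though it needs an extra dependency. A sketch, assuming the values are java.util.Lists and Apache Commons Collections 4 is on the classpath:

import java.util.List;
import org.apache.commons.collections4.ListUtils;
import org.apache.spark.api.java.JavaPairRDD;

// ListUtils.union(list1, list2) returns a new list containing list1's
// elements followed by list2's, so the merge becomes a one-liner and
// the inputs are left unmutated.
JavaPairRDD<String, List<Integer>> merged =
        myPairRdd.reduceByKey(ListUtils::union);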
1 Answer
#1
If you are trying to get a list for each key, consider the following:
pairRDD.groupByKey().mapValues(_.toList)
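In case you need this from the Java API, here is a rough equivalent (a sketch, assuming a hypothetical JavaPairRDD<String, Integer> named pairRDD; Java has no _.toList, so mapValues copies each Iterable into a List by hand):

import java.util.ArrayList;
import java.util.List;
import org.apache.spark.api.java.JavaPairRDD;

// groupByKey() yields one Iterable<Integer> per key;
// mapValues then materializes each Iterable into a concrete List.
JavaPairRDD<String, List<Integer>> listsPerKey =
        pairRDD.groupByKey()
               .mapValues(values -> {
                   List<Integer> out = new ArrayList<>();
                   values.forEach(out::add);
                   return out;
               });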