Most concise way to add two collections together in Java Spark

Time: 2021-12-01 23:09:00

When writing Scala Spark code, if I want to add two collections together, I can simply write

myRdd.reduceByKey(_ ++ _)

If I want to do the same in Java, however, I have to write

myPairRdd.reduceByKey((s1, s2) -> {
    s1.addAll(s2);  // mutate the left collection in place and return it
    return s1;
});

I was wondering if there was a more concise way of writing the Java code.

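For illustration only (not part of the original post), one way to tighten the Java call site is to move the merge into a small static helper and pass it as a method reference; the ListOps class and concat method below are hypothetical names:

import java.util.List;

// Hypothetical helper class; the name is illustrative.
class ListOps {
    // Merge in place: append everything in b to a, then return a.
    static <T> List<T> concat(List<T> a, List<T> b) {
        a.addAll(b);
        return a;
    }
}

With that helper in scope, the call shrinks to a one-liner much like the Scala version: myPairRdd.reduceByKey(ListOps::concat). This works because Spark's Function2 is a functional interface, so any method reference with a compatible signature is accepted.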

1 solution

#1


If you are trying to get a list for each key, consider the following:

pairRDD.groupByKey().mapValues(_.toList)
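
Note that this snippet is Scala, while the question asks about Java. A rough Java equivalent, sketched here under assumptions not in the original answer (the pairRDD name, String keys, and Integer values are illustrative), has to copy the Iterable that groupByKey produces into a List by hand:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class GroupToList {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local[*]", "group-to-list");

        // Illustrative sample data: (key, value) pairs.
        JavaPairRDD<String, Integer> pairRDD = sc.parallelizePairs(Arrays.asList(
                new Tuple2<>("a", 1), new Tuple2<>("a", 2), new Tuple2<>("b", 3)));

        // groupByKey yields Iterable values; copy each one into a List,
        // mirroring the Scala mapValues(_.toList) above.
        JavaPairRDD<String, List<Integer>> lists =
                pairRDD.groupByKey().mapValues(values -> {
                    List<Integer> out = new ArrayList<>();
                    values.forEach(out::add);
                    return out;
                });

        System.out.println(lists.collectAsMap()); // e.g. {a=[1, 2], b=[3]}
        sc.stop();
    }
}

One caveat on the design: groupByKey shuffles every value across the network, so for large datasets the reduceByKey approach from the question may be preferable despite its verbosity.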
