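RDD deduplication
Date: 2023-03-09 00:01:11

a = [[1, 2, 3, 2, 3, 4], [3, 4, 5, 6, 7, 5, 3, 2]]
b = sc.parallelize(a)
d = b.flatMap(lambda x: x)  # flatten the nested lists into a single RDD of elements
e = d.distinct()            # drop duplicate elements
e.collect()
=> [1, 2, 3, 4, 5, 6, 7]  (distinct() does not guarantee element order; use sorted(e.collect()) for a stable result)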
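
To check what the flatMap plus distinct pipeline computes without a running SparkContext, the same logic can be sketched in plain Python. This is only a local analogue, not Spark code: the helper name flatten_and_dedupe is made up for illustration, and it keeps first-seen order, which Spark's distinct() does not promise.

```python
from itertools import chain


def flatten_and_dedupe(nested):
    """Local analogue of rdd.flatMap(lambda x: x).distinct() (hypothetical helper)."""
    # chain.from_iterable plays the role of flatMap(lambda x: x):
    # it yields every element of every inner list in turn.
    flat = chain.from_iterable(nested)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(flat))


a = [[1, 2, 3, 2, 3, 4], [3, 4, 5, 6, 7, 5, 3, 2]]
result = flatten_and_dedupe(a)
# Sort before printing so the comparison with Spark output does not depend on order.
print(sorted(result))
```

Comparing sorted(result) with sorted(e.collect()) is a quick way to confirm that the Spark job returns the expected set of values on a small sample.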