比较两个数组或arraylists,找到相似和不同的值

时间:2022-02-20 12:08:23

I have two arrays (or arraylists if it is easier) of strings. I need to compare these, find which only exist in the first array, which exist in both, and which only exist in the second array. These arrays are different lengths, and may be in different orders. If necessary, I suppose I could sort them...

我有两个数组(或更简单的arraylists)字符串。我需要比较这些,找到只存在于第一个数组中的数据,它们存在于两者中,并且只存在于第二个数组中。这些阵列的长度不同,可能的顺序不同。如有必要,我想我可以对它们进行排序......

I know I could hack this together, but I think this might have a fairly standard and efficient / "best" solution, and I am curious more than anything.

我知道我可以一起破解这个,但我认为这可能有一个相当标准和有效/“最佳”的解决方案,我比任何事情都更好奇。

I am using c# for this, but if you want to write your solution in another language, any help is welcome.

我正在使用c#,但如果你想用另一种语言编写解决方案,欢迎任何帮助。

Thanks for the help!

谢谢您的帮助!

2 个解决方案

#1


2  

var onlyinfirst = from s in list1 where !list2.Contains(s) select s;
var onlyinsecond = from s in list2 where !list1.Contains(s) select s;
var onboth = from s in list1 where list2.Contains(s) select s;

#2


6  

If the arrays are large then you'll want to use a data structure that is efficient for these operations; arrays are not.

如果数组很大,那么你将需要使用对这些操作有效的数据结构;数组不是。

The naive solution is O(n^2) in time if the arrays are of size n.

如果阵列的大小为n,则天真的解决方案是O(n ^ 2)。

If you sort the arrays in place then you can binary search them for the items; sorting will likely be O(n lg n) and searching n times at a cost of lg n per search will also be O(n lg n) in time.

如果你对数组进行排序,那么你可以二进制搜索它们的项目;排序可能是O(n lg n)并且每次搜索以ng n为代价进行n次搜索也将是O(n lg n)。

If you turn each array into a HashSet<T> first then you can do it in O(n) time and O(n) extra space.

如果先将每个数组转换为HashSet ,那么可以在O(n)时间和O(n)额外空间中进行。

#1


2  

var onlyinfirst = from s in list1 where !list2.Contains(s) select s;
var onlyinsecond = from s in list2 where !list1.Contains(s) select s;
var onboth = from s in list1 where list2.Contains(s) select s;

#2


6  

If the arrays are large then you'll want to use a data structure that is efficient for these operations; arrays are not.

如果数组很大,那么你将需要使用对这些操作有效的数据结构;数组不是。

The naive solution is O(n^2) in time if the arrays are of size n.

如果阵列的大小为n,则天真的解决方案是O(n ^ 2)。

If you sort the arrays in place then you can binary search them for the items; sorting will likely be O(n lg n) and searching n times at a cost of lg n per search will also be O(n lg n) in time.

如果你对数组进行排序,那么你可以二进制搜索它们的项目;排序可能是O(n lg n)并且每次搜索以ng n为代价进行n次搜索也将是O(n lg n)。

If you turn each array into a HashSet<T> first then you can do it in O(n) time and O(n) extra space.

如果先将每个数组转换为HashSet ,那么可以在O(n)时间和O(n)额外空间中进行。