Efficient way to get the difference between two arrays of objects?

Time: 2022-07-16 21:29:25

I have two arrays of objects:

var a = [  {'id': 20},   {'id': 15},   {'id': 10},   {'id': 17},   {'id': 23}  ];

var b = [ {'id': 90},   {'id': 15},    {'id': 17},   {'id': 23}  ];  

I'd like to get objects which are in a, but not in b. Results from this example would be:

{'id': 20} and {'id': 10}.

Because the arrays could be large, I need an efficient way to do this.

4 solutions

#1


19  

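// Return the elements of a whose id does not appear in b
function difference(a, b) {
    // Make hashtable of ids in B (no prototype, so an id such as
    // "toString" is not reported as present by accident)
    var bIds = Object.create(null);
    b.forEach(function(obj){
        bIds[obj.id] = obj;
    });

    // Return all elements in A, unless in B
    return a.filter(function(obj){
        return !(obj.id in bIds);
    });
}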

Very minor addendum: if the lists are very large and you wish to avoid the factor-of-2 extra memory, there are a few options.

First, you could store the objects in a hashmap in the first place instead of using lists, assuming the ids are unique: a = {20:{etc:...}, 15:{etc:...}, 10:{etc:...}, 17:{etc:...}, 23:{etc:...}}. I'd personally do this.

Secondly, JavaScript sorts lists in place, so sorting does not require a second copy of the list, e.g. a.sort((x,y)=>x.id-y.id). Sorting would be worse than the above because it's O(N log(N)). But if you had to sort anyway, there is an O(N) algorithm that works on two sorted lists: you consider both lists together, keeping a pointer/bookmark into each, and repeatedly examine the smaller of the two current elements, then advance the pointer of the list it came from. This is just like the merge step of merge sort, but with a little more care to detect identical items... and maybe pesky to code.

Thirdly, if the lists come from legacy code and you want to convert them to a hashmap without memory overhead, you can do so element by element, by repeatedly popping elements off the lists and inserting them into hashmaps.
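
The walk over two sorted lists mentioned in this answer might be sketched roughly as follows. This is only an illustration, not the answerer's code: the name sortedDifference is hypothetical, ids are assumed to be numbers, and both arrays are sorted in place first, so the overall cost is still dominated by the O(N log(N)) sort unless the data is already sorted.

```javascript
// Hypothetical helper: returns the elements of a whose id is not in b.
// Assumes numeric ids. Sorts both arrays in place by id before the linear walk.
function sortedDifference(a, b) {
    var byId = function (x, y) { return x.id - y.id; };
    a.sort(byId);
    b.sort(byId);

    var result = [];
    var i = 0; // bookmark into a
    var j = 0; // bookmark into b
    while (i < a.length) {
        if (j >= b.length || a[i].id < b[j].id) {
            // b is exhausted, or its smallest remaining id is larger: keep a[i]
            result.push(a[i]);
            i++;
        } else if (a[i].id > b[j].id) {
            // b[j] is smaller than anything left in a: skip it
            j++;
        } else {
            // same id in both lists: drop a[i], keep j in case a repeats the id
            i++;
        }
    }
    return result;
}

var a = [{'id': 20}, {'id': 15}, {'id': 10}, {'id': 17}, {'id': 23}];
var b = [{'id': 90}, {'id': 15}, {'id': 17}, {'id': 23}];
var diff = sortedDifference(a, b);
console.log(diff); // the objects with ids 10 and 20, in sorted order
```

Note that the result comes back in id order rather than in the original order of a, which is one reason the hashtable version is usually the simpler choice.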

#2


5  

With lodash 4.12.0 you can use _.differenceBy.

_.differenceBy(a, b, 'id');

#3


2  

A general way to do this would be:

  1. put all objects from b into a hashtable
  2. iterate over a; for each item, check whether it is in the hashtable, and keep it only if it is not

A lot of programming environments have set and/or HashSet implementations these days, which make it very simple to do this.
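
In JavaScript, for instance, the built-in Set could play the role of the hashtable. A minimal sketch, using the arrays from the question (the names bIdSet and onlyInA are made up for illustration):

```javascript
var a = [{'id': 20}, {'id': 15}, {'id': 10}, {'id': 17}, {'id': 23}];
var b = [{'id': 90}, {'id': 15}, {'id': 17}, {'id': 23}];

// Step 1: put the ids of all objects from b into a Set
var bIdSet = new Set(b.map(function (obj) { return obj.id; }));

// Step 2: iterate over a, keeping the items whose id is not in the Set
var onlyInA = a.filter(function (obj) { return !bIdSet.has(obj.id); });

console.log(onlyInA); // the objects with ids 20 and 10
```

Unlike plain object keys, a Set does not convert its entries to strings, so the number 15 and the string "15" would count as different ids here.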

In special cases, other ways might be more efficient. If, for example, your elements were byte-sized values, and a and b were both fairly big, then I would use a boolean array "flags" with 256 elements, all initialized to false. Then, for each element x of b, set flags[x] to true. Then iterate over a and, for each y in a, check whether flags[y] is set; the values whose flag is not set are the ones that are in a but not in b.
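
As a rough illustration of the flags idea, assuming every element is an integer in the range 0 to 255 (the sample values below reuse the ids from the question as plain numbers, and a Uint8Array stands in for the boolean array):

```javascript
// Assumption: all values are integers between 0 and 255
var a = [20, 15, 10, 17, 23];
var b = [90, 15, 17, 23];

// 256 flags, all initially 0 (false)
var flags = new Uint8Array(256);

// Mark every value that occurs in b
b.forEach(function (x) { flags[x] = 1; });

// Keep the values of a whose flag was never set
var onlyInA = a.filter(function (y) { return flags[y] === 0; });

console.log(onlyInA); // [ 20, 10 ]
```

The lookup table has a fixed size of 256 bytes no matter how large a and b are, which is what makes this attractive for such a small value range.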

#4


0  

If you're not averse to including a library, use underscore.js; it has a good intersection function: http://documentcloud.github.com/underscore/
