Let A
and B
be two sets. I'm looking for really fast or elegant ways to compute the set difference (A - B
or A \B
, depending on your preference) between them. The two sets are stored and manipulated as Javascript arrays, as the title says.
让A和B是两个集合。我正在寻找非常快速或优雅的方法来计算它们之间的集差(A - B或A \B,取决于您的喜好)。如标题所示,这两个集合被存储并作为Javascript数组进行操作。
Notes:
注:
- Gecko-specific tricks are okay
- Gecko-specific技巧好
- I'd prefer sticking to native functions (but I am open to a lightweight library if it's way faster)
- 我宁愿坚持使用本机函数(但如果速度更快的话,我可以使用轻量级库)
- I've seen, but not tested, JS.Set (see previous point)
- 我看过,但没有测试过,JS。集(见以前的点)
Edit: I noticed a comment about sets containing duplicate elements. When I say "set" I'm referring to the mathematical definition, which means (among other things) that they do not contain duplicate elements.
编辑:我注意到一个关于包含重复元素的集合的评论。当我说“set”时,我指的是数学定义,这意味着(除其他外)它们不包含重复的元素。
11 个解决方案
#1
137
if don't know if this is most effective, but perhaps the shortest
如果不知道这是否是最有效的,也许是最短的
A = [1, 2, 3, 4];
B = [1, 3, 4, 7];
diff = A.filter(function(x) { return B.indexOf(x) < 0 })
console.log(diff);
Updated to ES6:
ES6更新:
A = [1, 2, 3, 4];
B = [1, 3, 4, 7];
diff = A.filter(x => B.indexOf(x) < 0 );
console.log(diff);
#2
41
Well, 7 years later, with ES6's Set object it's quite easy (but still not as compact as pythons A - B), and reportedly faster than indexOf
for large arrays:
7年后,ES6的Set对象非常简单(但仍然没有python A - B那么紧凑),据说比大型数组的indexOf要快:
let a = new Set([1,2,3,4]);
let b = new Set([5,4,3,2]);
console.log(new Set([...a].filter(x => !b.has(x)))); //a\b => {1}
console.log(new Set([...b].filter(x => !a.has(x)))); //b\a => {5}
console.log(new Set([...a].filter(x => b.has(x)))); //a∩b => {2,3,4}
#3
16
You can use an object as a map to avoid linearly scanning B
for each element of A
as in user187291's answer:
您可以使用对象作为映射来避免对a的每个元素进行线性扫描B,如user187291的回答:
function setMinus(A, B) {
var map = {}, C = [];
for(var i = B.length; i--; )
map[B[i].toSource()] = null; // any other value would do
for(var i = A.length; i--; ) {
if(!map.hasOwnProperty(A[i].toSource()))
C.push(A[i]);
}
return C;
}
The non-standard toSource()
method is used to get unique property names; if all elements already have unique string representations (as is the case with numbers), you can speed up the code by dropping the toSource()
invocations.
非标准的toSource()方法用于获取唯一的属性名;如果所有元素都有唯一的字符串表示(就像数字一样),那么可以通过删除toSource()调用来加快代码的速度。
#4
9
The shortest, using jQuery, is:
最短的,使用jQuery,是:
var A = [1, 2, 3, 4];
var B = [1, 3, 4, 7];
var diff = $(A).not(B);
console.log(diff.toArray());
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
#5
5
I would hash the array B, then keep values from the array A not present in B:
我将数组B哈希,然后保留数组A中不存在的值:
function getHash(array){
// Hash an array into a set of properties
//
// params:
// array - (array) (!nil) the array to hash
//
// return: (object)
// hash object with one property set to true for each value in the array
var hash = {};
for (var i=0; i<array.length; i++){
hash[ array[i] ] = true;
}
return hash;
}
function getDifference(a, b){
// compute the difference a\b
//
// params:
// a - (array) (!nil) first array as a set of values (no duplicates)
// b - (array) (!nil) second array as a set of values (no duplicates)
//
// return: (array)
// the set of values (no duplicates) in array a and not in b,
// listed in the same order as in array a.
var hash = getHash(b);
var diff = [];
for (var i=0; i<a.length; i++){
var value = a[i];
if ( !hash[value]){
diff.push(value);
}
}
return diff;
}
#6
4
Incorporating the idea from Christoph and assuming a couple of non-standard iteration methods on arrays and objects/hashes (each
and friends), we can get set difference, union and intersection in linear time in about 20 lines total:
结合Christoph的想法,假设在数组和对象/散列(每个和朋友)上有一些非标准的迭代方法,我们可以在大约20行中得到设置差、合并和相交的线性时间:
var setOPs = {
minusAB : function (a, b) {
var h = {};
b.each(function (v) { h[v] = true; });
return a.filter(function (v) { return !h.hasOwnProperty(v); });
},
unionAB : function (a, b) {
var h = {}, f = function (v) { h[v] = true; };
a.each(f);
b.each(f);
return myUtils.keys(h);
},
intersectAB : function (a, b) {
var h = {};
a.each(function (v) { h[v] = 1; });
b.each(function (v) { h[v] = (h[v] || 0) + 1; });
var fnSel = function (v, count) { return count > 1; };
var fnVal = function (v, c) { return v; };
return myUtils.select(h, fnSel, fnVal);
}
};
This assumes that each
and filter
are defined for arrays, and that we have two utility methods:
假设每个和过滤器都是为数组定义的,并且我们有两个实用方法:
-
myUtils.keys(hash)
: returns an array with the keys of the hashmyUtils.keys(hash):返回一个具有散列的键的数组
-
myUtils.select(hash, fnSelector, fnEvaluator)
: returns an array with the results of callingfnEvaluator
on the key/value pairs for whichfnSelector
returns true.myUtils。select(hash, fnSelector, fnEvaluator):返回一个数组,其结果是在fnSelector返回的键/值对上调用fnEvaluator。
The select()
is loosely inspired by Common Lisp, and is merely filter()
and map()
rolled into one. (It would be better to have them defined on Object.prototype
, but doing so wrecks havoc with jQuery, so I settled for static utility methods.)
select()受到了Common Lisp的松散启发,只是将filter()和map()合并为一个。(最好把它们定义在对象上。原型,但这样做破坏了jQuery,所以我选择了静态实用工具方法。
Performance: Testing with
性能:测试
var a = [], b = [];
for (var i = 100000; i--; ) {
if (i % 2 !== 0) a.push(i);
if (i % 3 !== 0) b.push(i);
}
gives two sets with 50,000 and 66,666 elements. With these values A-B takes about 75ms, while union and intersection are about 150ms each. (Mac Safari 4.0, using Javascript Date for timing.)
给出了包含50,000和66,666个元素的两个集合。A-B的值约为75ms,而union和crossroads的值约为150ms。(Mac Safari 4.0,使用Javascript日期计时)
I think that's decent payoff for 20 lines of code.
我认为这对20行代码来说是个不错的回报。
#7
3
Using Underscore.js (Library for functional JS)
使用下划线。js(函数js库)
>>> var foo = [1,2,3]
>>> var bar = [1,2,4]
>>> _.difference(foo, bar);
[4]
#8
1
This works, but I think another one is much more shorter, and elegant too
这是可行的,但我认为另一个更短,也更优雅
A = [1, 'a', 'b', 12];
B = ['a', 3, 4, 'b'];
diff_set = {
ar : {},
diff : Array(),
remove_set : function(a) { ar = a; return this; },
remove: function (el) {
if(ar.indexOf(el)<0) this.diff.push(el);
}
}
A.forEach(diff_set.remove_set(B).remove,diff_set);
C = diff_set.diff;
#9
1
As for the fasted way, this isn't so elegant but I've run some tests to be sure. Loading one array as an object is far faster to process in large quantities:
至于固定方式,这并不是那么优雅,但我已经运行了一些测试来确定。将一个数组作为对象加载,处理大量数据要快得多:
var t, a, b, c, A;
// Fill some arrays to compare
a = Array(30000).fill(0).map(function(v,i) {
return i.toFixed();
});
b = Array(20000).fill(0).map(function(v,i) {
return (i*2).toFixed();
});
// Simple indexOf inside filter
t = Date.now();
c = b.filter(function(v) { return a.indexOf(v) < 0; });
console.log('completed indexOf in %j ms with result %j length', Date.now() - t, c.length);
// Load `a` as Object `A` first to avoid indexOf in filter
t = Date.now();
A = {};
a.forEach(function(v) { A[v] = true; });
c = b.filter(function(v) { return !a[v]; });
console.log('completed Object in %j ms with result %j length', Date.now() - t, c.length);
Results:
结果:
completed indexOf in 1219 ms with result 5000 length
completed Object in 8 ms with result 5000 length
However, this works with strings only. If you plan to compare numbered sets you'll want to map results with parseInt.
但是,这只适用于字符串。如果您计划比较编号集,您将希望使用parseInt映射结果。
#10
1
You can use this lightweight array-diff open source component.
您可以使用这个轻量级的array-diff开源组件。
Example:
例子:
diff([1,2,3], [1,2,3,4,5]) // => [4,5]
It works by concating the two arrays passed and filtering included vals, returning an array representing the difference between the two arrays:
它通过对所传递的两个数组进行处理,并过滤其中包含的vals,返回一个表示两个数组之间差异的数组:
function diff(firstArray: any[], secondArray: any[]): any[] {
return firstArray.concat(secondArray).filter((val) => {
return !(firstArray.includes(val) && secondArray.includes(val));
});
};
#11
1
Some simple functions, borrowing from @milan's answer:
一些简单的函数,借用@milan的回答:
const setDifference = (a, b) => new Set([...a].filter(x => !b.has(x)));
const setIntersection = (a, b) => new Set([...a].filter(x => b.has(x)));
const setUnion = (a, b) => new Set([...a, ...b]);
Usage:
用法:
const a = new Set([1, 2]);
const b = new Set([2, 3]);
setDifference(a, b); // Set { 1 }
setIntersection(a, b); // Set { 2 }
setUnion(a, b); // Set { 1, 2, 3 }
#1
137
if don't know if this is most effective, but perhaps the shortest
如果不知道这是否是最有效的,也许是最短的
A = [1, 2, 3, 4];
B = [1, 3, 4, 7];
diff = A.filter(function(x) { return B.indexOf(x) < 0 })
console.log(diff);
Updated to ES6:
ES6更新:
A = [1, 2, 3, 4];
B = [1, 3, 4, 7];
diff = A.filter(x => B.indexOf(x) < 0 );
console.log(diff);
#2
41
Well, 7 years later, with ES6's Set object it's quite easy (but still not as compact as pythons A - B), and reportedly faster than indexOf
for large arrays:
7年后,ES6的Set对象非常简单(但仍然没有python A - B那么紧凑),据说比大型数组的indexOf要快:
let a = new Set([1,2,3,4]);
let b = new Set([5,4,3,2]);
console.log(new Set([...a].filter(x => !b.has(x)))); //a\b => {1}
console.log(new Set([...b].filter(x => !a.has(x)))); //b\a => {5}
console.log(new Set([...a].filter(x => b.has(x)))); //a∩b => {2,3,4}
#3
16
You can use an object as a map to avoid linearly scanning B
for each element of A
as in user187291's answer:
您可以使用对象作为映射来避免对a的每个元素进行线性扫描B,如user187291的回答:
function setMinus(A, B) {
var map = {}, C = [];
for(var i = B.length; i--; )
map[B[i].toSource()] = null; // any other value would do
for(var i = A.length; i--; ) {
if(!map.hasOwnProperty(A[i].toSource()))
C.push(A[i]);
}
return C;
}
The non-standard toSource()
method is used to get unique property names; if all elements already have unique string representations (as is the case with numbers), you can speed up the code by dropping the toSource()
invocations.
非标准的toSource()方法用于获取唯一的属性名;如果所有元素都有唯一的字符串表示(就像数字一样),那么可以通过删除toSource()调用来加快代码的速度。
#4
9
The shortest, using jQuery, is:
最短的,使用jQuery,是:
var A = [1, 2, 3, 4];
var B = [1, 3, 4, 7];
var diff = $(A).not(B);
console.log(diff.toArray());
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
#5
5
I would hash the array B, then keep values from the array A not present in B:
我将数组B哈希,然后保留数组A中不存在的值:
function getHash(array){
// Hash an array into a set of properties
//
// params:
// array - (array) (!nil) the array to hash
//
// return: (object)
// hash object with one property set to true for each value in the array
var hash = {};
for (var i=0; i<array.length; i++){
hash[ array[i] ] = true;
}
return hash;
}
function getDifference(a, b){
// compute the difference a\b
//
// params:
// a - (array) (!nil) first array as a set of values (no duplicates)
// b - (array) (!nil) second array as a set of values (no duplicates)
//
// return: (array)
// the set of values (no duplicates) in array a and not in b,
// listed in the same order as in array a.
var hash = getHash(b);
var diff = [];
for (var i=0; i<a.length; i++){
var value = a[i];
if ( !hash[value]){
diff.push(value);
}
}
return diff;
}
#6
4
Incorporating the idea from Christoph and assuming a couple of non-standard iteration methods on arrays and objects/hashes (each
and friends), we can get set difference, union and intersection in linear time in about 20 lines total:
结合Christoph的想法,假设在数组和对象/散列(每个和朋友)上有一些非标准的迭代方法,我们可以在大约20行中得到设置差、合并和相交的线性时间:
var setOPs = {
minusAB : function (a, b) {
var h = {};
b.each(function (v) { h[v] = true; });
return a.filter(function (v) { return !h.hasOwnProperty(v); });
},
unionAB : function (a, b) {
var h = {}, f = function (v) { h[v] = true; };
a.each(f);
b.each(f);
return myUtils.keys(h);
},
intersectAB : function (a, b) {
var h = {};
a.each(function (v) { h[v] = 1; });
b.each(function (v) { h[v] = (h[v] || 0) + 1; });
var fnSel = function (v, count) { return count > 1; };
var fnVal = function (v, c) { return v; };
return myUtils.select(h, fnSel, fnVal);
}
};
This assumes that each
and filter
are defined for arrays, and that we have two utility methods:
假设每个和过滤器都是为数组定义的,并且我们有两个实用方法:
-
myUtils.keys(hash)
: returns an array with the keys of the hashmyUtils.keys(hash):返回一个具有散列的键的数组
-
myUtils.select(hash, fnSelector, fnEvaluator)
: returns an array with the results of callingfnEvaluator
on the key/value pairs for whichfnSelector
returns true.myUtils。select(hash, fnSelector, fnEvaluator):返回一个数组,其结果是在fnSelector返回的键/值对上调用fnEvaluator。
The select()
is loosely inspired by Common Lisp, and is merely filter()
and map()
rolled into one. (It would be better to have them defined on Object.prototype
, but doing so wrecks havoc with jQuery, so I settled for static utility methods.)
select()受到了Common Lisp的松散启发,只是将filter()和map()合并为一个。(最好把它们定义在对象上。原型,但这样做破坏了jQuery,所以我选择了静态实用工具方法。
Performance: Testing with
性能:测试
var a = [], b = [];
for (var i = 100000; i--; ) {
if (i % 2 !== 0) a.push(i);
if (i % 3 !== 0) b.push(i);
}
gives two sets with 50,000 and 66,666 elements. With these values A-B takes about 75ms, while union and intersection are about 150ms each. (Mac Safari 4.0, using Javascript Date for timing.)
给出了包含50,000和66,666个元素的两个集合。A-B的值约为75ms,而union和crossroads的值约为150ms。(Mac Safari 4.0,使用Javascript日期计时)
I think that's decent payoff for 20 lines of code.
我认为这对20行代码来说是个不错的回报。
#7
3
Using Underscore.js (Library for functional JS)
使用下划线。js(函数js库)
>>> var foo = [1,2,3]
>>> var bar = [1,2,4]
>>> _.difference(foo, bar);
[4]
#8
1
This works, but I think another one is much more shorter, and elegant too
这是可行的,但我认为另一个更短,也更优雅
A = [1, 'a', 'b', 12];
B = ['a', 3, 4, 'b'];
diff_set = {
ar : {},
diff : Array(),
remove_set : function(a) { ar = a; return this; },
remove: function (el) {
if(ar.indexOf(el)<0) this.diff.push(el);
}
}
A.forEach(diff_set.remove_set(B).remove,diff_set);
C = diff_set.diff;
#9
1
As for the fasted way, this isn't so elegant but I've run some tests to be sure. Loading one array as an object is far faster to process in large quantities:
至于固定方式,这并不是那么优雅,但我已经运行了一些测试来确定。将一个数组作为对象加载,处理大量数据要快得多:
var t, a, b, c, A;
// Fill some arrays to compare
a = Array(30000).fill(0).map(function(v,i) {
return i.toFixed();
});
b = Array(20000).fill(0).map(function(v,i) {
return (i*2).toFixed();
});
// Simple indexOf inside filter
t = Date.now();
c = b.filter(function(v) { return a.indexOf(v) < 0; });
console.log('completed indexOf in %j ms with result %j length', Date.now() - t, c.length);
// Load `a` as Object `A` first to avoid indexOf in filter
t = Date.now();
A = {};
a.forEach(function(v) { A[v] = true; });
c = b.filter(function(v) { return !a[v]; });
console.log('completed Object in %j ms with result %j length', Date.now() - t, c.length);
Results:
结果:
completed indexOf in 1219 ms with result 5000 length
completed Object in 8 ms with result 5000 length
However, this works with strings only. If you plan to compare numbered sets you'll want to map results with parseInt.
但是,这只适用于字符串。如果您计划比较编号集,您将希望使用parseInt映射结果。
#10
1
You can use this lightweight array-diff open source component.
您可以使用这个轻量级的array-diff开源组件。
Example:
例子:
diff([1,2,3], [1,2,3,4,5]) // => [4,5]
It works by concating the two arrays passed and filtering included vals, returning an array representing the difference between the two arrays:
它通过对所传递的两个数组进行处理,并过滤其中包含的vals,返回一个表示两个数组之间差异的数组:
function diff(firstArray: any[], secondArray: any[]): any[] {
return firstArray.concat(secondArray).filter((val) => {
return !(firstArray.includes(val) && secondArray.includes(val));
});
};
#11
1
Some simple functions, borrowing from @milan's answer:
一些简单的函数,借用@milan的回答:
const setDifference = (a, b) => new Set([...a].filter(x => !b.has(x)));
const setIntersection = (a, b) => new Set([...a].filter(x => b.has(x)));
const setUnion = (a, b) => new Set([...a, ...b]);
Usage:
用法:
const a = new Set([1, 2]);
const b = new Set([2, 3]);
setDifference(a, b); // Set { 1 }
setIntersection(a, b); // Set { 2 }
setUnion(a, b); // Set { 1, 2, 3 }