获取两个数组之间的差异(包括重复)

时间:2021-10-12 21:29:55

I see a lot of posts about how to get the difference and symmetric difference of an array in javascript, but I haven't found anything on how to find the difference, including duplicates.

我看到很多关于如何在javascript中获得数组的差异和对称差异的文章,但是我没有找到任何关于如何找到差异的文章,包括重复的。

For example:

例如:

let original = [1];
let updated = [1, 1, 2];

difference(updated, original);
// Expect: [1, 2]

Is there an elegant way to do this? I'm open to solutions using plain javascript or lodash.

有没有一种优雅的方式来做这件事?我对使用普通javascript或lodash的解决方案持开放态度。

Thanks!

谢谢!

UPDATE

更新

To clarify, an infinite number of duplicates should be supported. Another example:

为了澄清,应该支持无限数量的重复。另一个例子:

let original = [1, 1];
let updated = [1, 1, 1, 1, 1, 2];

difference(updated, original);
// Expect: [1, 1, 1, 2]

UPDATE 2

更新2

I realized that there may be some confusion on the original requirements. It is true that infinite duplicates should be supported, but the order should not affect the output.

我意识到最初的需求可能有些混乱。确实应该支持无限重复,但是顺序不应该影响输出。

Example:

例子:

let original = [1, 1, 2];
let updated = [1, 2, 1, 1, 1];

difference(updated, original);
// Expect: [1, 1]

6 个解决方案

#1


2  

I would suggest this solution, which avoids a time complexity of O(n²):

我建议这个解决方案,避免了时间复杂度O(n²):

function difference(a, b) {
    return [...b.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) - 1),
            a.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) + 1), new Map() ) 
    )].reduce( (acc, [v, count]) => acc.concat(Array(Math.abs(count)).fill(v)), [] );
}

let original = [1, 1];
let updated = [1, 1, 1, 1, 1, 2];
let res = difference(updated, original);

console.log(res);

Explanation

This solution creates a Map with a key for every distinct value of the first array (a), and as value the count of occurrences of each. Then b is added to that Map in the same way, except that the count of occurrences counts negative. If that count ends up being zero, then of course this key should not end up in the final result. In fact, the number of occurrences in the final result is the absolute value of the count in the Map for each of its keys.

该解决方案为第一个数组(a)的每个不同值创建一个带有键的映射,并作为每个数组出现次数的计数。然后以同样的方式将b添加到映射中,除了出现次数计数为负。如果计数最终为零,那么这个键当然不应该在最终结果中结束。实际上,最终结果中出现的次数是映射中每个键的计数的绝对值。

Details

The code starts with:

代码的开始:

new Map()

It is the initial value of the accumulator of the inner reduce. That reduce iterates over a and updates the count of the corresponding key in the Map. The final result of this reduce is thus a Map.

它是内减法的累加器的初值。这样可以减少对a的迭代,并更新映射中相应键的计数。因此,这种减少的最终结果就是一张地图。

This Map then becomes the initial accumulator value for the outer reduce. That reduce iterates over b and decreases the count in the Map.

然后该映射成为外部reduce的初始累加器值。这样就减少了对b的迭代,并减少了映射中的计数。

This updated Map is spread into an array with the spread operator. This array consists of 2-element sub-arrays, which are key/value pairs. Note that the value in this case is a count which could be positive, zero or negative.

这个更新后的映射通过扩展操作符被扩展到一个数组中。这个数组由2个元素的子数组组成,它们是键/值对。注意,这种情况下的值是一个计数,可以是正数、零或负数。

This array is then iterated with the final reduce. Each count is used to create an array of that many elements (in absolute value) of the corresponding value. All this is concatenated to one array, being the return value of the function.

然后用最终的reduce迭代这个数组。每个计数用于创建相应值的多个元素(绝对值)的数组。所有这些都连接到一个数组,作为函数的返回值。

Follow-up Question

In comments you explained you actually needed something different, where the role of both arrays is not the same. The first array should be returned, but with the elements from the second array removed from it.

在注释中,您解释了您实际上需要一些不同的东西,其中两个数组的角色都不相同。第一个数组应该返回,但是从第二个数组中的元素删除。

You could use this code for that:

你可以用这个代码:

function difference2(a, b) {
    return a.filter(function(v) {
        return !this.get(v) || !this.set(v, this.get(v) - 1);
    }, b.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) + 1), new Map() ));
}

let original = [1, 1, 2];
let updated = [1, 1];
let res = difference2(original, updated);

console.log(res);

#2


1  

So, I'd:

所以,我想:

  • Iterate the updated array, for each element check if it's present on original array, if it's present I remove it from original array (note: in the function below I copy the original object, so I don't affect it), else I push the element to the differences array. At the end, I return the differences array.
  • 迭代更新后的数组,对于每个元素检查它是否存在于原始数组中,如果它存在,我将它从原始数组中删除(注意:在下面的函数中,我复制原始对象,因此不影响它),否则我将该元素推到差异数组中。最后,返回差异数组。

This code is made to work on various browsers, thus I didn't use Array().indexOf and other newer methods of ECMAScript.

此代码用于在各种浏览器上工作,因此我没有使用Array()。ECMAScript的索引和其他更新的方法。

function difference(updated, original) {
  var i, l;
  /* copy original array */
  var degradation = [];
  for (var i = 0, ol = original.length; i < ol; ++i)
    degradation[i] = original[i]

  var diff = [];
  for (i = 0, l = Math.max(updated.length, ol); i < l; ++i) {
    var upd = updated[i];
    var index;
    var b, found;
    /* find updated item in degradation */
    for (b = 0, found = false; b < ol; ++b) {
      if (degradation[b] === upd) {
        /* remove item from degradation */
        delete degradation[b];
        found = true;
        break;
      }
    }
    if (!found)
      diff.push(upd);
  }
  return diff;
}

#3


1  

function count(n,arr) {
  return arr.filter(a=>a==n).length
}

function diffBetween(arr,arr2) {
  diff = [];
  new Set(arr.concat(arr2)).forEach(
  a => {
    for(x=0;x<Math.abs(count(a,arr)-count(a,arr2));x++)
      diff.push(a)
    }
  );
  return diff;
}

console.log(diffBetween([1],[1,1,2]));
console.log(diffBetween([1,1],[1,1,1,1,1,2]));
console.log(diffBetween([1,1,3,4],[1,2,3]));

How does this work for you?

这对你有什么用?

EDIT:

编辑:

function difference(a, b) { // trincot's code
    return [...b.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) - 1),
            a.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) + 1), new Map() ) 
    )].reduce( (acc, [v, count]) => acc.concat(Array(Math.abs(count)).fill(v)), [] );
}

function count(n,arr) { // My code
  return arr.filter(a=>a==n).length
}

function diffBetween(arr,arr2) { // My code
  diff = [];
  new Set(arr.concat(arr2)).forEach(
  a => {
    for(x=0;x<Math.abs(count(a,arr)-count(a,arr2));x++)
      diff.push(a)
    }
  );
  return diff;
}

in1 = [1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2];
in2 = [1,2,3,4,5,6,1,2,3,4,5,6,7,1,1,1,1,1,1,2,2,2,2,2,2,2];

start = (new Date).getTime();
a = difference(in1,in2);
end = (new Date).getTime();
console.log("trincot done",end-start,"msec");
start = (new Date).getTime();
a = diffBetween(in1,in2);
end = (new Date).getTime();
console.log("stardust done",end-start,"msec");

@trincot's solution above is consistently faster in my testing, so his is clearly superior with large enough datasets.

@trincot的解决方案在我的测试中始终是更快的,所以他的解决方案在足够大的数据集上明显是优越的。

#4


0  

    Array.prototype.Diff = function( secondArray ) {
    var mergedArray = this.concat( secondArray );
    var mergedString = mergedArray.toString();
    var finalArray = new Array();

    for( var i = 0; i < mergedArray.length; i++ ) {
        if(mergedString.match(mergedArray[i])) {
            finalArray.push(mergedArray[i]);
            mergedString = mergedString.replace(new RegExp(mergedArray[i], "g"), "");
        }
    }
    return finalArray;
}

var let = [ 1 ];
var updated = [ 1, 1, 2 ];

console.log(let.Diff(updated));

I like the prototype way. You can save the prototype function above in a JS file and import in any page that you want, the it's possible to use as an embedded function to the object (Array for this case).

我喜欢原型方式。您可以将上面的prototype函数保存到一个JS文件中,并将其导入到您想要的任何页面中,这样就可以将其作为对象的嵌入函数(本例中为数组)使用。

#5


0  

You can do it the following steps (O(n)).

您可以执行以下步骤(O(n))。

Let a and b are two arrays

假设a和b是两个数组

Step 1. create map hash_map of array a value as key and number occurrences of this key as value.

步骤1。创建map hash_map,将数组的值作为键值,并将此键的次数作为值。

Step 2. push all the elements of array b in result which are not in a using hash_map.

步骤2。将数组b的所有元素推入结果,使其不属于使用hash_map的数组。

Step 3. push all the elements of array a in result which are not in b using hash_map.

步骤3。使用hash_map将数组a中的所有元素推入结果,使其不属于b。

Here is complete code

这是完整的代码

function diff(a, b) {
    //Step 1 starts here
	var hash_map = a.reduce(function(map, key) {
		map[key] = map[key] ? (map[key]+1) : 1;
		return map;
	}, {});
    //Step 1 ends here
    //Step 2 starts here
	var result = b.filter(function(val) {
		if(hash_map[val]) {
			hash_map[val] = hash_map[val]-1;
			return false;
		}
		return true;
	});
    //Step 2 ends hers
    //Step 3 starts here
	Object.keys(hash_map).forEach(function(key) {
		while (hash_map[key]) {
			result.push(key);
			hash_map[key] = hash_map[key]-1;
		}
	});
    //Step 3 ends here
	return result;
}

console.log(diff([1],[1,1,2]));
console.log(diff([1,1,1],[1,1,1,1,1,2]));
console.log(diff([1,1,3,4],[1,2,3]));
console.log(diff([1,1,1,1,1,2], [1, 2, 1, 1, 3]));

#6


0  

You might do as follows;

你可以这样做;

var original = [1, 1, 1, 1, 2],
     updated = [1, 2, 1, 1, 3],
      result = (...a) => { var [shorter,longer] = [a[0],a[1]].sort((a,b) => a.length - b.length),
                                              s = shorter.slice();
                           return shorter.reduce((p,c) => { var fil = p.indexOf(c),
                                                                fis = s.indexOf(c);
                                                            fil !== -1 && (p.splice(fil,1),s.splice(fis,1));
                                                            return p;
                                                          },longer).concat(s);
                         };
console.log(result(updated,original));

#1


2  

I would suggest this solution, which avoids a time complexity of O(n²):

我建议这个解决方案,避免了时间复杂度O(n²):

function difference(a, b) {
    return [...b.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) - 1),
            a.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) + 1), new Map() ) 
    )].reduce( (acc, [v, count]) => acc.concat(Array(Math.abs(count)).fill(v)), [] );
}

let original = [1, 1];
let updated = [1, 1, 1, 1, 1, 2];
let res = difference(updated, original);

console.log(res);

Explanation

This solution creates a Map with a key for every distinct value of the first array (a), and as value the count of occurrences of each. Then b is added to that Map in the same way, except that the count of occurrences counts negative. If that count ends up being zero, then of course this key should not end up in the final result. In fact, the number of occurrences in the final result is the absolute value of the count in the Map for each of its keys.

该解决方案为第一个数组(a)的每个不同值创建一个带有键的映射,并作为每个数组出现次数的计数。然后以同样的方式将b添加到映射中,除了出现次数计数为负。如果计数最终为零,那么这个键当然不应该在最终结果中结束。实际上,最终结果中出现的次数是映射中每个键的计数的绝对值。

Details

The code starts with:

代码的开始:

new Map()

It is the initial value of the accumulator of the inner reduce. That reduce iterates over a and updates the count of the corresponding key in the Map. The final result of this reduce is thus a Map.

它是内减法的累加器的初值。这样可以减少对a的迭代,并更新映射中相应键的计数。因此,这种减少的最终结果就是一张地图。

This Map then becomes the initial accumulator value for the outer reduce. That reduce iterates over b and decreases the count in the Map.

然后该映射成为外部reduce的初始累加器值。这样就减少了对b的迭代,并减少了映射中的计数。

This updated Map is spread into an array with the spread operator. This array consists of 2-element sub-arrays, which are key/value pairs. Note that the value in this case is a count which could be positive, zero or negative.

这个更新后的映射通过扩展操作符被扩展到一个数组中。这个数组由2个元素的子数组组成,它们是键/值对。注意,这种情况下的值是一个计数,可以是正数、零或负数。

This array is then iterated with the final reduce. Each count is used to create an array of that many elements (in absolute value) of the corresponding value. All this is concatenated to one array, being the return value of the function.

然后用最终的reduce迭代这个数组。每个计数用于创建相应值的多个元素(绝对值)的数组。所有这些都连接到一个数组,作为函数的返回值。

Follow-up Question

In comments you explained you actually needed something different, where the role of both arrays is not the same. The first array should be returned, but with the elements from the second array removed from it.

在注释中,您解释了您实际上需要一些不同的东西,其中两个数组的角色都不相同。第一个数组应该返回,但是从第二个数组中的元素删除。

You could use this code for that:

你可以用这个代码:

function difference2(a, b) {
    return a.filter(function(v) {
        return !this.get(v) || !this.set(v, this.get(v) - 1);
    }, b.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) + 1), new Map() ));
}

let original = [1, 1, 2];
let updated = [1, 1];
let res = difference2(original, updated);

console.log(res);

#2


1  

So, I'd:

所以,我想:

  • Iterate the updated array, for each element check if it's present on original array, if it's present I remove it from original array (note: in the function below I copy the original object, so I don't affect it), else I push the element to the differences array. At the end, I return the differences array.
  • 迭代更新后的数组,对于每个元素检查它是否存在于原始数组中,如果它存在,我将它从原始数组中删除(注意:在下面的函数中,我复制原始对象,因此不影响它),否则我将该元素推到差异数组中。最后,返回差异数组。

This code is made to work on various browsers, thus I didn't use Array().indexOf and other newer methods of ECMAScript.

此代码用于在各种浏览器上工作,因此我没有使用Array()。ECMAScript的索引和其他更新的方法。

function difference(updated, original) {
  var i, l;
  /* copy original array */
  var degradation = [];
  for (var i = 0, ol = original.length; i < ol; ++i)
    degradation[i] = original[i]

  var diff = [];
  for (i = 0, l = Math.max(updated.length, ol); i < l; ++i) {
    var upd = updated[i];
    var index;
    var b, found;
    /* find updated item in degradation */
    for (b = 0, found = false; b < ol; ++b) {
      if (degradation[b] === upd) {
        /* remove item from degradation */
        delete degradation[b];
        found = true;
        break;
      }
    }
    if (!found)
      diff.push(upd);
  }
  return diff;
}

#3


1  

function count(n,arr) {
  return arr.filter(a=>a==n).length
}

function diffBetween(arr,arr2) {
  diff = [];
  new Set(arr.concat(arr2)).forEach(
  a => {
    for(x=0;x<Math.abs(count(a,arr)-count(a,arr2));x++)
      diff.push(a)
    }
  );
  return diff;
}

console.log(diffBetween([1],[1,1,2]));
console.log(diffBetween([1,1],[1,1,1,1,1,2]));
console.log(diffBetween([1,1,3,4],[1,2,3]));

How does this work for you?

这对你有什么用?

EDIT:

编辑:

function difference(a, b) { // trincot's code
    return [...b.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) - 1),
            a.reduce( (acc, v) => acc.set(v, (acc.get(v) || 0) + 1), new Map() ) 
    )].reduce( (acc, [v, count]) => acc.concat(Array(Math.abs(count)).fill(v)), [] );
}

function count(n,arr) { // My code
  return arr.filter(a=>a==n).length
}

function diffBetween(arr,arr2) { // My code
  diff = [];
  new Set(arr.concat(arr2)).forEach(
  a => {
    for(x=0;x<Math.abs(count(a,arr)-count(a,arr2));x++)
      diff.push(a)
    }
  );
  return diff;
}

in1 = [1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2];
in2 = [1,2,3,4,5,6,1,2,3,4,5,6,7,1,1,1,1,1,1,2,2,2,2,2,2,2];

start = (new Date).getTime();
a = difference(in1,in2);
end = (new Date).getTime();
console.log("trincot done",end-start,"msec");
start = (new Date).getTime();
a = diffBetween(in1,in2);
end = (new Date).getTime();
console.log("stardust done",end-start,"msec");

@trincot's solution above is consistently faster in my testing, so his is clearly superior with large enough datasets.

@trincot的解决方案在我的测试中始终是更快的,所以他的解决方案在足够大的数据集上明显是优越的。

#4


0  

    Array.prototype.Diff = function( secondArray ) {
    var mergedArray = this.concat( secondArray );
    var mergedString = mergedArray.toString();
    var finalArray = new Array();

    for( var i = 0; i < mergedArray.length; i++ ) {
        if(mergedString.match(mergedArray[i])) {
            finalArray.push(mergedArray[i]);
            mergedString = mergedString.replace(new RegExp(mergedArray[i], "g"), "");
        }
    }
    return finalArray;
}

var let = [ 1 ];
var updated = [ 1, 1, 2 ];

console.log(let.Diff(updated));

I like the prototype way. You can save the prototype function above in a JS file and import in any page that you want, the it's possible to use as an embedded function to the object (Array for this case).

我喜欢原型方式。您可以将上面的prototype函数保存到一个JS文件中,并将其导入到您想要的任何页面中,这样就可以将其作为对象的嵌入函数(本例中为数组)使用。

#5


0  

You can do it the following steps (O(n)).

您可以执行以下步骤(O(n))。

Let a and b are two arrays

假设a和b是两个数组

Step 1. create map hash_map of array a value as key and number occurrences of this key as value.

步骤1。创建map hash_map,将数组的值作为键值,并将此键的次数作为值。

Step 2. push all the elements of array b in result which are not in a using hash_map.

步骤2。将数组b的所有元素推入结果,使其不属于使用hash_map的数组。

Step 3. push all the elements of array a in result which are not in b using hash_map.

步骤3。使用hash_map将数组a中的所有元素推入结果,使其不属于b。

Here is complete code

这是完整的代码

function diff(a, b) {
    //Step 1 starts here
	var hash_map = a.reduce(function(map, key) {
		map[key] = map[key] ? (map[key]+1) : 1;
		return map;
	}, {});
    //Step 1 ends here
    //Step 2 starts here
	var result = b.filter(function(val) {
		if(hash_map[val]) {
			hash_map[val] = hash_map[val]-1;
			return false;
		}
		return true;
	});
    //Step 2 ends hers
    //Step 3 starts here
	Object.keys(hash_map).forEach(function(key) {
		while (hash_map[key]) {
			result.push(key);
			hash_map[key] = hash_map[key]-1;
		}
	});
    //Step 3 ends here
	return result;
}

console.log(diff([1],[1,1,2]));
console.log(diff([1,1,1],[1,1,1,1,1,2]));
console.log(diff([1,1,3,4],[1,2,3]));
console.log(diff([1,1,1,1,1,2], [1, 2, 1, 1, 3]));

#6


0  

You might do as follows;

你可以这样做;

var original = [1, 1, 1, 1, 2],
     updated = [1, 2, 1, 1, 3],
      result = (...a) => { var [shorter,longer] = [a[0],a[1]].sort((a,b) => a.length - b.length),
                                              s = shorter.slice();
                           return shorter.reduce((p,c) => { var fil = p.indexOf(c),
                                                                fis = s.indexOf(c);
                                                            fil !== -1 && (p.splice(fil,1),s.splice(fis,1));
                                                            return p;
                                                          },longer).concat(s);
                         };
console.log(result(updated,original));