查找数组中出现奇数次的所有元素

时间:2022-08-26 21:47:54

I came across following problem:

我遇到了以下问题:

'Find all the elements in an array which occur odd number of times'.

'找到数组中出现奇数次的所有元素'。

My thoughts about this are:

我对此的看法是:

  1. Use HashMap: Add values in the array as keys in the HashMap. The value corresponding to each key will be number of times that key is encountered.

    使用HashMap:在数组中添加值作为HashMap中的键。对应于每个键的值将是遇到键的次数。

  2. Sort the array using Quick sort in O(N log N) and then traverse through the array to check which elements occur odd number of times.

    使用O(N log N)中的快速排序对数组进行排序,然后遍历数组以检查哪些元素出现奇数次。

What do you think, is there any other approach to this? If no, then which of these 2 approaches is better?

您怎么看?有没有其他办法解决这个问题?如果不是,那么这两种方法中的哪一种更好?

Thanks in advance!

提前致谢!

3 个解决方案

#1


5  

"Better" depends on context. Using the hash map or hash set will be faster and has the advantage of not modifying the original array, but it requires O(N) extra memory. Sorting and counting takes longer and modifies the array, but requires no extra memory.

“更好”取决于背景。使用散列映射或散列集会更快,并且具有不修改原始数组的优点,但它需要额外的O(N)内存。排序和计数需要更长时间并修改数组,但不需要额外的内存。

Which solution you choose depends on whether you can afford the extra memory.

您选择哪种解决方案取决于您是否能够承受额外的内存。

#2


15  

You can modify your first approach to use a hash set instead of a hash map.

您可以修改第一种方法来使用哈希集而不是哈希映射。

Create an initially empty hash set. Go through the elements of your array. For each element, check the hash set: if the current array element is not in the set, add it; otherwise, remove it.

创建一个最初为空的哈希集。浏览数组的元素。对于每个元素,检查哈希集:如果当前数组元素不在集合中,则添加它;否则,删除它。

When you reach the end of the array, your hash set will contain every object that occurs in your array an odd number of times.

当您到达数组的末尾时,您的哈希集将包含数组中出现的每个对象奇数次。

Since accessing elements in a hash set is O(1), this algorithm has O(N) time complexity.

由于访问散列集中的元素是O(1),因此该算法具有O(N)时间复杂度。

#3


0  

Basically this can be done as an exercise in rebalancing efficiency: you go through your list of numbers, and for each number you already have in your set, you delete it again. That way your comparison set is about half the size of the possible max. Of course, that does not change O(n lg n), and you get an allocation/deallocation on each step which is likely quite more expensive.

基本上这可以作为重新平衡效率的练习来完成:您浏览数字列表,对于您在集合中已有的每个数字,您可以再次删除它。这样你的比较集大约是可能最大值的一半。当然,这不会改变O(n lg n),并且每个步骤都会得到一个分配/解除分配,这可能要贵得多。

So it might be interesting to use a mixed strategy: quicksort is pretty fast, but you basically want to coalesce two entries whenever they are equal, and at least (n-1)/2 entries are equal. Quicksort does not really accommodate the removal of elements while sorting.

所以使用混合策略可能会很有趣:quicksort非常快,但是你基本上想要在两个条目相等时合并两个条目,并且至少(n-1)/ 2个条目是相等的。 Quicksort在排序时并不能真正适应元素的删除。

So with an array-based approach (to cut down on allocation costs), I'd rather use some heap-based approach. Whenever you chance upon equal elements, you remove one. Heap sort is pretty competitive with quicksort, and being able to prune the tree in advance is going to turn out helpful.

因此,使用基于数组的方法(降低分配成本),我宁愿使用一些基于堆的方法。每当你冒险使用相同的元素时,你就删除一个元素。堆排序与quicksort非常有竞争力,并且能够提前修剪树将会变得有用。

#1


5  

"Better" depends on context. Using the hash map or hash set will be faster and has the advantage of not modifying the original array, but it requires O(N) extra memory. Sorting and counting takes longer and modifies the array, but requires no extra memory.

“更好”取决于背景。使用散列映射或散列集会更快,并且具有不修改原始数组的优点,但它需要额外的O(N)内存。排序和计数需要更长时间并修改数组,但不需要额外的内存。

Which solution you choose depends on whether you can afford the extra memory.

您选择哪种解决方案取决于您是否能够承受额外的内存。

#2


15  

You can modify your first approach to use a hash set instead of a hash map.

您可以修改第一种方法来使用哈希集而不是哈希映射。

Create an initially empty hash set. Go through the elements of your array. For each element, check the hash set: if the current array element is not in the set, add it; otherwise, remove it.

创建一个最初为空的哈希集。浏览数组的元素。对于每个元素,检查哈希集:如果当前数组元素不在集合中,则添加它;否则,删除它。

When you reach the end of the array, your hash set will contain every object that occurs in your array an odd number of times.

当您到达数组的末尾时,您的哈希集将包含数组中出现的每个对象奇数次。

Since accessing elements in a hash set is O(1), this algorithm has O(N) time complexity.

由于访问散列集中的元素是O(1),因此该算法具有O(N)时间复杂度。

#3


0  

Basically this can be done as an exercise in rebalancing efficiency: you go through your list of numbers, and for each number you already have in your set, you delete it again. That way your comparison set is about half the size of the possible max. Of course, that does not change O(n lg n), and you get an allocation/deallocation on each step which is likely quite more expensive.

基本上这可以作为重新平衡效率的练习来完成:您浏览数字列表,对于您在集合中已有的每个数字,您可以再次删除它。这样你的比较集大约是可能最大值的一半。当然,这不会改变O(n lg n),并且每个步骤都会得到一个分配/解除分配,这可能要贵得多。

So it might be interesting to use a mixed strategy: quicksort is pretty fast, but you basically want to coalesce two entries whenever they are equal, and at least (n-1)/2 entries are equal. Quicksort does not really accommodate the removal of elements while sorting.

所以使用混合策略可能会很有趣:quicksort非常快,但是你基本上想要在两个条目相等时合并两个条目,并且至少(n-1)/ 2个条目是相等的。 Quicksort在排序时并不能真正适应元素的删除。

So with an array-based approach (to cut down on allocation costs), I'd rather use some heap-based approach. Whenever you chance upon equal elements, you remove one. Heap sort is pretty competitive with quicksort, and being able to prune the tree in advance is going to turn out helpful.

因此,使用基于数组的方法(降低分配成本),我宁愿使用一些基于堆的方法。每当你冒险使用相同的元素时,你就删除一个元素。堆排序与quicksort非常有竞争力,并且能够提前修剪树将会变得有用。