Given an array a of length N, which is a list of integers, I want to extract the duplicate values, with a separate list for each value containing the positions of its duplicates. In pseudo-math:
val -> M = { i | a[i] == val }, for every val with |M| > 1
Example (N=11):
a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
should give the following lists:
3 -> [1, 6, 7]
1 -> [2, 5]
10 -> [9, 10]
I added the python tag since I'm currently programming in that language (numpy and scipy are available), but I'm more interested in a general algorithm for doing this. Code examples are fine, though.
One idea, which I have not fleshed out yet: construct a list of tuples, pairing each entry of a with its index: (i, a[i]). Sort the list with the second entry as the key, then check consecutive entries for which the second entry is the same.
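The sort-based idea above could be sketched roughly as follows. This is only one possible reading of it: the function name duplicate_positions_sorted is made up for illustration, and itertools.groupby is used here to collect the runs of equal consecutive values after sorting. Because of the sort, it runs in O(N log N) time.

```python
from itertools import groupby
from operator import itemgetter


def duplicate_positions_sorted(a):
    """Return {value: [indices]} for every value occurring more than once in a.

    Hypothetical helper illustrating the sort-then-scan idea.
    """
    # Pair each entry with its index, then sort by the value. Python's sort
    # is stable, so indices of equal values stay in increasing order.
    pairs = sorted(enumerate(a), key=itemgetter(1))

    result = {}
    # After sorting, equal values are adjacent, so groupby yields one run
    # per distinct value.
    for value, run in groupby(pairs, key=itemgetter(1)):
        indices = [i for i, _ in run]
        if len(indices) > 1:
            result[value] = indices
    return result


a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
print(duplicate_positions_sorted(a))
# {1: [2, 5], 3: [1, 6, 7], 10: [9, 10]}
```

The keys come out in sorted order here simply because the runs are visited in sorted order.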
3 Answers
#1
3
The idea is to create a dictionary mapping each value to the list of positions where it appears.
This can be done in a simple way with setdefault. It can also be done using defaultdict.
>>> a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
>>> dup={}
>>> for i,x in enumerate(a):
...     dup.setdefault(x,[]).append(i)
...
>>> dup
{0: [0], 1: [2, 5], 2: [8], 3: [1, 6, 7], 6: [3], 8: [4], 10: [9, 10]}
Then, the actual duplicates can be extracted with a dict comprehension that filters out values appearing only once.
>>> {i:x for i,x in dup.items() if len(x)>1}  # use dup.iteritems() on Python 2
{1: [2, 5], 10: [9, 10], 3: [1, 6, 7]}
#2
4
Here's an implementation using a Python dictionary (actually a defaultdict, for convenience):
from collections import defaultdict

a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]

# Map each value to the list of indices where it occurs.
d = defaultdict(list)
for k, item in enumerate(a):
    d[item].append(k)

# Keep only the values that occur more than once.
finalD = {key: value for key, value in d.items() if len(value) > 1}
print(finalD)
# {3: [1, 6, 7], 1: [2, 5], 10: [9, 10]} on Python 3.7+; key order may differ on older versions
#3
1
Populate a dictionary whose keys are the integer values and whose values are the lists of positions where each key occurs. Then go through that dictionary and remove every key/value pair with only one position. You will be left with the values that are duplicated.
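A minimal sketch of these two steps, assuming the input is a plain Python list. The function name duplicate_positions is hypothetical; the removal step iterates over a copy of the keys, because deleting from a dictionary while iterating over it directly raises a RuntimeError.

```python
def duplicate_positions(a):
    """Return {value: [indices]} for every value occurring more than once in a.

    Hypothetical helper illustrating the populate-then-remove approach.
    """
    # Step 1: map every value to the list of positions where it occurs.
    positions = {}
    for index, value in enumerate(a):
        positions.setdefault(value, []).append(index)

    # Step 2: remove the entries that have only one position.
    # list(positions) takes a snapshot of the keys so deletion is safe.
    for value in list(positions):
        if len(positions[value]) == 1:
            del positions[value]
    return positions


a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
result = duplicate_positions(a)
print(result[3], result[1], result[10])
# [1, 6, 7] [2, 5] [9, 10]
```

Both steps are a single pass, so the expected running time is linear in N.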