What is the most Pythonic way to find the element in a list that differs from the other elements?

Time: 2021-10-03 22:54:17

Suppose we have a list of unknown size in which one element differs from all the others, but we don't know that element's index. The list contains only numbers and is fetched from a remote server; its length and the index of the different element change on every fetch. What is the most Pythonic way to find that different element? I tried this, but I'm not sure it's the best solution.

a = 1
different_element = None
my_list = fetch_list()

b = my_list[0] - a

for elem in my_list[1::]:
    if elem - a != b:
        different_element = elem

print(different_element)

4 solutions

#1


2  

This is a great use for numpy

Given a uniform list with a single different number in it:

>>> li=[1]*100+[200]+[1]*250

If the uniform value is known (in this case the uniform value is 1 and the unknown value is 200), you can use boolean indexing on an array to get the different value:

>>> import numpy as np
>>> a=np.array(li)
>>> a[a!=1]
array([200])

If the uniform value is not known, you can use np.unique with return_counts=True to count each distinct value; the value whose count is 1 is the different element:

>>> np.unique(a, return_counts=True)
(array([  1, 200]), array([350,   1]))

For a pure Python solution, use a generator expression with next to get the first value that differs from the uniform value:

>>> next(e for e in li if e!=1)
200

Or, you can use dropwhile from itertools:

>>> from itertools import dropwhile
>>> next(dropwhile(lambda e: e==1, li))
200

If you do not know what the uniform value is, use a Counter on a slice big enough to reveal it (three elements suffice when exactly one element differs):

>>> from collections import Counter
>>> uniform=Counter(li[0:3]).most_common()[0][0]
>>> uniform
1
>>> next(e for e in li if e!=uniform)
200

In these cases, next will short-circuit at the first value that satisfies the condition.
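
The two steps above (derive the uniform value from the first three elements, then let next stop at the first mismatch) can be combined into one helper. This is only a sketch: the name find_odd_one_out is hypothetical, and it assumes the list has at least three elements, all equal except at most one.

```python
def find_odd_one_out(values):
    """Return the element that differs from the rest, or None if all are equal.

    Assumption: at least three elements, all identical except at most one.
    """
    if len(values) < 3:
        raise ValueError("need at least three elements to tell which one differs")
    first, second, third = values[:3]
    # At least two of the first three elements are equal; that value is the uniform one.
    uniform = first if first in (second, third) else second
    # next() stops iterating at the first mismatch; None is the default when there is none.
    return next((v for v in values if v != uniform), None)


print(find_odd_one_out([1] * 100 + [200] + [1] * 250))  # 200
print(find_odd_one_out([7, 1, 1, 1]))                   # 7
```

Passing a default to next avoids a StopIteration when the list turns out to be completely uniform.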

#2


2  

Would this work for you?

In [6]: my_list = [1,1,1,2,1,1,1]
In [7]: different = [ii for ii in set(my_list) if my_list.count(ii) == 1]
In [8]: different
Out[8]: [2]

#3


2  

You can use Counter from the collections module:

from collections import Counter

a = [1,2,3,4,3,4,1]
b = Counter(a)  # Counter({1: 2, 2: 1, 3: 2, 4: 2})
elem = list(b.keys())[list(b.values()).index(1)]  # getting elem which is key with value that equals 1
print(a.index(elem))

Another possible solution, which just computes elem differently:

a = [1,2,3,4,3,4,1]
b = Counter(a)  # Counter({1: 2, 2: 1, 3: 2, 4: 2})
elem = (k for k, v in b.items() if v == 1)
print(a.index(next(elem)))
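
Both variants above scan the list a second time with a.index and raise an exception when no value occurs exactly once. A possible variation, shown here only as a sketch with a hypothetical helper name find_unique, walks the list once after counting and returns None when there is no unique element:

```python
from collections import Counter


def find_unique(values):
    """Return (index, value) of the first element that occurs exactly once, or None."""
    counts = Counter(values)  # one pass to count every value
    # Second pass in list order; next() stops at the first value whose count is 1.
    return next(((i, v) for i, v in enumerate(values) if counts[v] == 1), None)


print(find_unique([1, 2, 3, 4, 3, 4, 1]))  # (1, 2)
print(find_unique([1, 1, 2, 2]))           # None
```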

UPDATE

Time consumption:

As @Jblasco mentioned, Jblasco's method is not a really efficient one, and I was curious to measure it.

The initial data is a list of 200-400 elements with only one unique value. The code below generates that list. At the end of the snippet are the first 100 elements, which show that it contains one unique value:

import random
from itertools import chain
f = lambda x: [x]*random.randint(2,4)
a=list(chain.from_iterable(f(random.randint(0,100)) for _ in range(100)))
a[random.randint(1, 100)] = 101
print(a[:100])
# [5, 5, 5, 84, 84, 84, 46, 46, 46, 46, 6, 6, 6, 68, 68, 68, 68, 38,
# 38, 38, 44, 44, 61, 61, 15, 15, 15, 15, 36, 36, 36, 36, 73, 73, 73, 
# 28, 28, 28, 28, 6, 6, 93, 93, 74, 74, 74, 74, 12, 12, 72, 72, 22, 
# 22, 22, 22, 78, 78, 17, 17, 17, 93, 93, 93, 12, 12, 12, 23, 23, 23, 
# 23, 52, 52, 88, 88, 79, 79, 42, 42, 34, 34, 47, 47, 1, 1, 1, 1, 71,
# 71, 1, 1, 45, 45, 101, 45, 39, 39, 50, 50, 50, 50]

This is the code that produces the results; I chose 3 repeats of 10000 executions each:

from timeit import repeat


s = """\
import random
from itertools import chain
f = lambda x: [x]*random.randint(2,4)
a=list(chain.from_iterable(f(random.randint(0,100)) for _ in range(100)))
a[random.randint(1, 100)] = 101
"""

print('my 1st method:', repeat(stmt="""from collections import Counter
b=Counter(a)
elem = (k for k, v in b.items() if v == 1)
a.index(next(elem))""",
             setup=s, number=10000, repeat=3))

print('my 2nd method:', repeat(stmt="""from collections import Counter
b = Counter(a)
elem = list(b.keys())[list(b.values()).index(1)]
a.index(elem)""",
             setup=s, number=10000, repeat=3))

print('@Jblasco method:', repeat(stmt="""different = [ii for ii in set(a) if a.count(ii) == 1]
different""", setup=s, number=10000, repeat=3))

# my 1st method: [0.303596693000145, 0.27322746600111714, 0.2701447969993751]
# my 2nd method: [0.2715420649983571, 0.28590541199810104, 0.2821485950007627]
# @Jblasco method: [3.2133491599997797, 3.488262927003234, 2.884892332000163]
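
The gap in the timings is consistent with how much work each approach does: a.count(ii) scans the whole list once per distinct value, while Counter counts everything in a single pass. The sketch below re-creates the comparison with plain functions; the function names, the fixed seed, the sample data, and the repeat count are all assumptions chosen for illustration, and the absolute numbers will differ from machine to machine.

```python
import random
from collections import Counter
from timeit import timeit


def unique_with_counter(values):
    counts = Counter(values)  # one pass over the list
    return [v for v, c in counts.items() if c == 1]


def unique_with_count(values):
    # values.count(v) is a full scan, repeated once per distinct value
    return [v for v in set(values) if values.count(v) == 1]


random.seed(0)  # fixed seed so the sample data is reproducible
data = [random.randint(0, 100) for _ in range(300)]
data[random.randrange(len(data))] = 101  # 101 is outside 0..100, so it occurs exactly once

# Both approaches must agree on which values occur exactly once.
assert sorted(unique_with_counter(data)) == sorted(unique_with_count(data))

for func in (unique_with_counter, unique_with_count):
    seconds = timeit(lambda: func(data), number=200)
    print(f"{func.__name__}: {seconds:.4f} s")
```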

#4


1  

I would try something like this:

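newList = list(set(my_list))  # collapse the list to its distinct values
print(newList.pop())  # prints one distinct value; set order is arbitrary, so it may be either one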

This assumes there is only one different value and the rest are all the same. There's a little ambiguity in your question, which makes it difficult to answer, but that's the best I could think of.
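
Because a set has no defined order, popping from it does not by itself say which of the two distinct values is the odd one. Under the same assumption (exactly two distinct values, one of which occurs once), a sketch like the following picks the rarer value explicitly; the helper name odd_value is hypothetical.

```python
def odd_value(values):
    """Return the value that occurs least often, assuming exactly two distinct values."""
    distinct = set(values)
    if len(values) < 3 or len(distinct) != 2:
        raise ValueError("expected at least three elements and exactly two distinct values")
    # Only two candidates remain, so two count() scans are enough.
    return min(distinct, key=values.count)


print(odd_value([1, 1, 1, 2, 1, 1, 1]))  # 2
```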
