Finding elements in two lists within a certain range

Time: 2021-05-29 21:25:32

I have two lists, L1 and L2, formatted like this:

L1 = ['12:55:35.87', '12:55:35.70', ...]
L2 = ['12:55:35.53', '12:55:35.30', ...]

I am trying to find pairs across the two lists that start with the same 4 characters (i.e. xx:x) and then return the indexes of each pair in its own list.

So far I have:

for pair1 in L1:
    for pair2 in L2:
        if pair1[:4] in pair2:
            print(L1.index(pair1))

This doesn't seem to return the correct indexes, and it obviously doesn't return the index from the second list. Any help would be greatly appreciated.

6 solutions

#1


7  

Here's how to make your code work. Keep in mind that this is a naive solution; there are faster ways to solve this if your lists are big. The runtime here is O(n^2), but the problem can be solved in linear time.

for i,pair1 in enumerate(L1):
    for j,pair2 in enumerate(L2):
        if pair1[:4] == pair2[:4]:
            print("list1: %s , list2: %s" % (i,j))
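For context, here is a small sketch of why the original loop printed misleading indexes. `list.index` returns the position of the first equal element, so duplicate values always report the same index, and `in` tests for a substring anywhere in the string rather than only at the start. The sample values below are made up for illustration and are not from the question.

```python
# Hypothetical sample data, chosen to expose both problems.
L1 = ['12:55:35.87', '12:55:35.87']   # the same value twice
L2 = ['00:12:55.30']                  # contains '12:5', but not at the start

# Problem 1: list.index() always reports the first equal element.
first_hits = [L1.index(value) for value in L1]

# Problem 2: `in` matches the prefix anywhere inside the other string.
substring_match = L1[0][:4] in L2[0]
prefix_match = L1[0][:4] == L2[0][:4]

print(first_hits)        # [0, 0]
print(substring_match)   # True
print(prefix_match)      # False
```

Using enumerate and comparing the two slices directly, as in the loop above, avoids both issues.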

Update: for future visitors, here is a solution that runs in linear time on average:

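from collections import defaultdict

# Map each 4-character prefix to the indexes in L1 that start with it.
l1_map = defaultdict(list)  # the factory must be a callable: list, not []

for i, val in enumerate(L1):
    prefix = val[:4]
    l1_map[prefix].append(i)

for j, val in enumerate(L2):
    prefix = val[:4]
    # .get() avoids adding empty entries for prefixes missing from L1
    for l1 in l1_map.get(prefix, []):
        print("list1: %s , list2: %s" % (l1, j))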
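Since the question asks for the indexes to be returned rather than printed, the same idea could be wrapped in a function. This is only a sketch: the name `matching_index_pairs` and the `prefix_len` parameter are made up for illustration, and the sample lists are borrowed from another answer on this page.

```python
from collections import defaultdict

def matching_index_pairs(first, second, prefix_len=4):
    """Return (i, j) pairs where first[i] and second[j] share a prefix."""
    # Group the indexes of `first` by prefix in one pass over the list.
    by_prefix = defaultdict(list)
    for i, value in enumerate(first):
        by_prefix[value[:prefix_len]].append(i)

    pairs = []
    for j, value in enumerate(second):
        # .get() avoids creating empty entries for unseen prefixes.
        for i in by_prefix.get(value[:prefix_len], []):
            pairs.append((i, j))
    return pairs

L1 = ['12:55:35.87', '12:55:35.70', 'spam']
L2 = ['12:55:35.53', 'eggs', '12:55:35.30']
pairs = matching_index_pairs(L1, L2)
print(pairs)  # [(0, 0), (1, 0), (0, 2), (1, 2)]
```

The pairs come out grouped by the index in the second list, not the first. Note also that if most entries share one prefix, the number of matching pairs itself grows quadratically, so no approach can list them all in linear time in that case.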

#2


3  

Because the OP's lists seem to have lots of repeated "first 4 characters", I would do something like the following:

indices = {}
for i, entry in enumerate(L1):
    indices.setdefault(entry[:4], [])
    indices[entry[:4]].append("L1-{}".format(i))
    if L2[i][:4] in indices:
        indices[L2[i][:4]].append("L2-{}".format(i))

Then you can access your repeated entries as:

for key in indices:
    print(key, indices[key])

This is better than O(n^2), since building the dictionary takes a single pass over the lists.

Edit: as someone pointed out in the comments, this assumes that the two lists have the same length.

In case they don't, assume L2 is longer than L1; then, after running the code above, you can do:

# start=i+1 keeps j aligned with the real position in L2
for j, entry in enumerate(L2[i+1:], start=i+1):
    indices.setdefault(entry[:4], [])
    indices[entry[:4]].append("L2-{}".format(j))

If L2 is shorter than L1, just swap the variable names in the code shown.
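As an alternative that does not depend on the list lengths at all, each list could be indexed in its own loop. This is a sketch of a different arrangement and not the code from this answer; the function name `group_by_prefix` and the third L2 value are assumptions made for illustration.

```python
from collections import defaultdict

def group_by_prefix(first, second, prefix_len=4):
    """Map each shared prefix to the L1 and L2 indexes that start with it."""
    groups = defaultdict(lambda: {"L1": [], "L2": []})
    for i, entry in enumerate(first):
        groups[entry[:prefix_len]]["L1"].append(i)
    for j, entry in enumerate(second):
        groups[entry[:prefix_len]]["L2"].append(j)
    # Keep only prefixes that occur in both lists.
    return {p: g for p, g in groups.items() if g["L1"] and g["L2"]}

L1 = ['12:55:35.87', '12:55:35.70']
L2 = ['12:55:35.53', '12:45:35.30', '12:55:35.30']
shared = group_by_prefix(L1, L2)
print(shared)  # {'12:5': {'L1': [0, 1], 'L2': [0, 2]}}
```

Each list is traversed once, so the lists may have different lengths and an L2 entry is recorded even when its prefix appears later in L1.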

#3


2  

You can use itertools.product to loop over the Cartesian product of the two lists.

from itertools import product

L1 = ['12:55:35.87', '12:55:35.70']
L2 = ['12:55:35.53', '12:45:35.30']

res = [(i, j) for (i, x), (j, y) in 
       product(enumerate(L1), enumerate(L2)) 
       if x[:4] == y[:4]]

# [(0, 0), (1, 0)]

#4


1  

Use the range() or enumerate() function in the for loops to get the loop index.

For example, using the range() function:

for x in range(len(L1)):
   for y in range(len(L2)):
       if L1[x][:4] == L2[y][:4]:
           print(x, y)

#5


1  

enumerate is great for things like this.

indexes = []
for index1, pair1 in enumerate(L1):
    pair1_slice = pair1[:4] 
    for index2, pair2 in enumerate(L2):        
        if pair1_slice == pair2[:4]:
            indexes.append([index1, index2])
            print(index1, index2)

#6


1  

I think the enumerate function is what you're looking for!

L1 = ['12:55:35.87', '12:55:35.70', 'spam']
L2 = ['12:55:35.53', 'eggs', '12:55:35.30']

idxs = []

for idx1, pair1 in enumerate(L1):
    for idx2, pair2 in enumerate(L2):
        if pair1[:4] == pair2[:4]:
            idxs.append((idx1, idx2))

print(idxs)

Output

[(0, 0), (0, 2), (1, 0), (1, 2)]
