I have two lists, formatted like this:
L1 = ['12:55:35.87', '12:55:35.70', ...]
L2 = ['12:55:35.53', '12:55:35.30', ...]
I am trying to find pairs across the two lists that start with the same 4 characters (i.e. xx:x), and then return the indexes of each pair in both lists.
So far I have:
for pair1 in L1:
    for pair2 in L2:
        if pair1[:4] in pair2:
            print(L1.index(pair1))
This doesn't seem to return the correct indexes and it obviously doesn't return the index of the second list. Any help would be greatly appreciated.
6 Answers
#1
7
Here's how to make your code work. Keep in mind this is a naive solution; there are faster ways to solve this if your lists are big. The runtime here is O(n^2), but this could be solved in linear time.
for i, pair1 in enumerate(L1):
    for j, pair2 in enumerate(L2):
        if pair1[:4] == pair2[:4]:
            print("list1: %s , list2: %s" % (i, j))
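To see why the original attempt can report the wrong indexes: `pair1[:4] in pair2` is a substring test that matches anywhere in the string, and `list.index` only ever returns the first position of a value. A small illustration, using made-up values that are not from the question:

```python
# Made-up data: the same timestamp appears twice in the list.
times = ['12:55:35.87', '12:55:35.87']

# list.index returns the FIRST matching position, so the second
# element is reported as index 0 rather than 1.
first_pos = times.index(times[1])
print(first_pos)

# `in` checks for a substring anywhere, not only at the start.
substring_hit = '5:35' in '12:55:35.53'
prefix_hit = '12:55:35.53'.startswith('5:35')
print(substring_hit, prefix_hit)
```

Comparing slices (`pair1[:4] == pair2[:4]`) and taking the indexes from `enumerate` avoids both problems.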
Update: for future visitors, here's a solution that runs in linear time on average (plus the time needed to print the matching pairs):
from collections import defaultdict

# defaultdict needs a callable factory, so pass list rather than []
l1_map = defaultdict(list)
for i, val in enumerate(L1):
    prefix = val[:4]
    l1_map[prefix].append(i)

for j, val in enumerate(L2):
    prefix = val[:4]
    for l1 in l1_map[prefix]:
        print("list1: %s , list2: %s" % (l1, j))
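As a self-contained sketch of the same prefix-map idea, the version below collects the index pairs into a list instead of printing them. The function name `matching_index_pairs` and the `prefix_len` parameter are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict


def matching_index_pairs(L1, L2, prefix_len=4):
    """Return (i, j) pairs where L1[i] and L2[j] share the same prefix."""
    # Map each prefix to the list of indexes in L1 where it occurs.
    l1_map = defaultdict(list)
    for i, val in enumerate(L1):
        l1_map[val[:prefix_len]].append(i)

    pairs = []
    for j, val in enumerate(L2):
        # .get avoids inserting empty lists for prefixes seen only in L2.
        for i in l1_map.get(val[:prefix_len], ()):
            pairs.append((i, j))
    return pairs


L1 = ['12:55:35.87', '12:55:35.70', 'spam']
L2 = ['12:55:35.53', 'eggs', '12:55:35.30']
print(matching_index_pairs(L1, L2))
```

The pairs come out grouped by the index in L2, because the second loop walks L2 in order. Sort the result if you need them ordered by the L1 index.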
#2
3
Because the OP's lists seem to have lots of repeated "first 4 characters", I would do something like the following:
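indices = {}
for i, entry in enumerate(L1):
    indices.setdefault(entry[:4], [])
    indices[entry[:4]].append("L1-{}".format(i))
    # L2[i] assumes L2 has at least as many entries as L1.
    # An L2 entry is recorded only if its prefix was already seen in L1.
    if L2[i][:4] in indices:
        indices[L2[i][:4]].append("L2-{}".format(i))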
Then you can access your repeated entries as:
for key in indices:
    print(key, indices[key])
This runs in better than O(n^2) time.
edit: as someone pointed out in the comments, this assumes that the lists have the same length.
In case they don't, assume L2 is longer than L1; then, after doing the above, you can do:
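# i still holds the last index of L1 from the loop above.
# start=i+1 keeps j equal to the entry's real position in L2.
for j, entry in enumerate(L2[i+1:], start=i+1):
    indices.setdefault(entry[:4], [])
    indices[entry[:4]].append("L2-{}".format(j))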
If L2 is shorter than L1, just swap the variable names in the code shown.
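If the lengths may differ either way, one possible alternative is to index each list in its own pass, so no assumption about which list is longer is needed. This is only a sketch of that variation, not the code from the answer above; the function name `group_by_prefix` and the `prefix_len` parameter are made up for illustration.

```python
def group_by_prefix(L1, L2, prefix_len=4):
    """Group labelled indexes from both lists under their shared prefix."""
    indices = {}
    # First pass: record every L1 entry under its prefix.
    for i, entry in enumerate(L1):
        indices.setdefault(entry[:prefix_len], []).append("L1-{}".format(i))
    # Second pass: record every L2 entry, whatever the length of L1.
    for j, entry in enumerate(L2):
        indices.setdefault(entry[:prefix_len], []).append("L2-{}".format(j))
    return indices


L1 = ['12:55:35.87', '12:55:35.70']
L2 = ['12:55:35.53', '12:45:35.30', '12:55:35.30']
groups = group_by_prefix(L1, L2)
for key, labels in groups.items():
    print(key, labels)
```

Unlike the loop above, this also records prefixes that occur in only one of the lists, so filter on the labels if you only want prefixes present in both.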
#3
2
You can use itertools.product to loop over the Cartesian product.
from itertools import product

L1 = ['12:55:35.87', '12:55:35.70']
L2 = ['12:55:35.53', '12:45:35.30']

res = [(i, j) for (i, x), (j, y) in
       product(enumerate(L1), enumerate(L2))
       if x[:4] == y[:4]]
# [(0, 0), (1, 0)]
#4
1
Use the range() or enumerate() function in the for-loops to get the loop index.
For example, using the range() function:
for x in range(len(L1)):
    for y in range(len(L2)):
        if L1[x][:4] == L2[y][:4]:
            print(x, y)
#5
1
enumerate is great for things like this.
indexes = []
for index1, pair1 in enumerate(L1):
    pair1_slice = pair1[:4]
    for index2, pair2 in enumerate(L2):
        if pair1_slice == pair2[:4]:
            indexes.append([index1, index2])
            print(index1, index2)
#6
1
I think the enumerate function is what you're looking for!
L1 = ['12:55:35.87', '12:55:35.70', 'spam']
L2 = ['12:55:35.53', 'eggs', '12:55:35.30']

idxs = []
for idx1, pair1 in enumerate(L1):
    for idx2, pair2 in enumerate(L2):
        if pair1[:4] == pair2[:4]:
            idxs.append((idx1, idx2))

print(idxs)
Output
[(0, 0), (0, 2), (1, 0), (1, 2)]