What is the best way to interleave two or more lists in Python?

Time: 2021-07-04 14:00:54

Suppose I have a list:

l=['a','b','c']

And its suffix list:

l2 = ['a_1', 'b_2', 'c_3']

I'd like the desired output to be:

out_l = ['a','a_1','b','b_2','c','c_3']

The result is the interleaved version of the two lists above.

I can write a regular for loop to get this done, but I'm wondering if there's a more Pythonic way (e.g., using a list comprehension or lambda) to do it.

I've tried something like this:

list(map(lambda x: x[1]+'_'+str(x[0]+1), enumerate(l)))
# this only returns ['a_1', 'b_2', 'c_3']

Furthermore, what changes would be needed for the general case, i.e. for 2 or more lists where l2 is not necessarily derived from l?

7 solutions

#1


48  

yield

You can use a generator for an elegant solution. At each iteration, yield twice: once with the original element, and once with the element plus its suffix.

The generator will need to be exhausted; that can be done by wrapping the call in list().

def transform(l):
    for i, x in enumerate(l, 1):
        yield x
        yield f'{x}_{i}'  # '{}_{}'.format(x, i)

You can also re-write this using the yield from syntax for generator delegation:

def transform(l):
    for i, x in enumerate(l, 1):
        yield from (x, f'{x}_{i}')  # (x, '{}_{}'.format(x, i))

out_l = list(transform(l))
print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

If you're on a Python version older than 3.6, replace f'{x}_{i}' with '{}_{}'.format(x, i).

Generalising
Consider a general scenario where you have N lists of the form:

l1 = [v11, v12, ...]
l2 = [v21, v22, ...]
l3 = [v31, v32, ...]
...

which you would like to interleave. These lists are not necessarily derived from each other.

To interleave these N lists, iterate over the tuples produced by zip and yield each element in turn:

def transformN(*args):
    for vals in zip(*args):
        yield from vals

out_l = list(transformN(l1, l2, l3, ...))
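
As a concrete check of the generator above, here is a minimal sketch with three made-up lists (the names letters, digits and flags are illustrative only, not from the question). It also shows that zip stops at the shortest input, so lists of unequal length are silently truncated.

```python
def transformN(*args):
    # zip groups the i-th items of every list; yield from flattens each group
    for vals in zip(*args):
        yield from vals

# hypothetical sample data
letters = ['a', 'b', 'c']
digits = [1, 2, 3]
flags = [True, False, True]

out_l = list(transformN(letters, digits, flags))
print(out_l)  # ['a', 1, True, 'b', 2, False, 'c', 3, True]

# zip stops at the shortest input, so the trailing 'c' is dropped here
short = list(transformN(letters, [1, 2]))
print(short)  # ['a', 1, 'b', 2]
```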

Sliced list.__setitem__

I'd recommend this from the perspective of performance. First preallocate a list of the final length, then assign the items to their positions using slice assignment. l goes into the even indexes, and l' (l with the suffixes added) goes into the odd indexes.

out_l = [None] * (len(l) * 2)
out_l[::2] = l
out_l[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]  # ['{}_{}'.format(x, i) ...]

print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

This is consistently the fastest from my timings (below).

Generalising
To handle N lists, iteratively assign to slices.

list_of_lists = [l1, l2, ...]

out_l = [None] * (len(list_of_lists[0]) * len(list_of_lists))
for i, l in enumerate(list_of_lists):
    out_l[i::len(list_of_lists)] = l  # the stride is the number of lists, not 2
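
A minimal runnable sketch of slice assignment for N lists, assuming all lists have the same length. The helper name interleave_slices is hypothetical. The stride equals the number of lists, so list number i lands at positions i, i + N, i + 2N and so on.

```python
def interleave_slices(list_of_lists):
    # hypothetical helper; assumes every list has the same length
    n = len(list_of_lists)
    out = [None] * (len(list_of_lists[0]) * n)
    for i, lst in enumerate(list_of_lists):
        # list number i fills positions i, i + n, i + 2n, ...
        out[i::n] = lst
    return out

result = interleave_slices([['a', 'b'], [1, 2], ['x', 'y']])
print(result)  # ['a', 1, 'x', 'b', 2, 'y']
```

If the lists differ in length, the extended slice assignment raises ValueError rather than truncating, which may or may not be what you want.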

zip + chain.from_iterable

A functional approach, similar to @chrisz's solution. Construct pairs using zip and then flatten them using itertools.chain.

from itertools import chain
# ['{}_{}'.format(x, i) ...]
out_l = list(chain.from_iterable(zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)]))) 

print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

itertools.chain is widely regarded as the Pythonic way to flatten a list of iterables.

Generalising
This is the simplest solution to generalise, and I suspect the most efficient for multiple lists when N is large.

list_of_lists = [l1, l2, ...]
out_l = list(chain.from_iterable(zip(*list_of_lists)))
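
A short sketch of the generalised form with concrete data. The second half is an extension that is not in the original answer: if the lists may differ in length and you want to keep the leftovers, one option is itertools.zip_longest with a unique sentinel that is filtered out afterwards. The sample lists and the _MISSING name are made up for illustration.

```python
from itertools import chain, zip_longest

l1 = ['a', 'b', 'c']
l2 = ['a_1', 'b_2', 'c_3']
l3 = [10, 20, 30]

out_l = list(chain.from_iterable(zip(l1, l2, l3)))
print(out_l)  # ['a', 'a_1', 10, 'b', 'b_2', 20, 'c', 'c_3', 30]

# Assumption: unequal lengths should keep the leftover items.
# Pad with a unique sentinel, then drop it after flattening.
_MISSING = object()
ragged = [['a', 'b', 'c'], [1]]
padded = zip_longest(*ragged, fillvalue=_MISSING)
kept = [v for v in chain.from_iterable(padded) if v is not _MISSING]
print(kept)  # ['a', 1, 'b', 'c']
```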

Performance

Let's take a look at some perf-tests for the simple case of two lists (one list with its suffix). General cases will not be tested since the results vary widely with the data.

from timeit import timeit

import pandas as pd
import matplotlib.pyplot as plt

res = pd.DataFrame(
       index=['ajax1234', 'cs0', 'cs1', 'cs2', 'cs3', 'chrisz', 'sruthiV'],
       columns=[10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000],
       dtype=float
)

for f in res.index: 
    for c in res.columns:
        l = ['a', 'b', 'c', 'd'] * c
        stmt = '{}(l)'.format(f)
        setp = 'from __main__ import l, {}'.format(f)
        res.at[f, c] = timeit(stmt, setp, number=50)

ax = res.div(res.min()).T.plot(loglog=True) 
ax.set_xlabel("N"); 
ax.set_ylabel("time (relative)");

plt.show()

[Figure: time relative to the fastest function versus N, log-log scale]

Functions

def ajax1234(l):
    return [
        i for b in [[a, '{}_{}'.format(a, i)] 
        for i, a in enumerate(l, start=1)] 
        for i in b
    ]

def cs0(l):
    # this is in Ajax1234's answer, but it is my suggestion
    return [j for i, a in enumerate(l, 1) for j in [a, '{}_{}'.format(a, i)]]

def cs1(l):
    def _cs1(l):
        for i, x in enumerate(l, 1):
            yield x
            yield f'{x}_{i}'

    return list(_cs1(l))

def cs2(l):
    out_l = [None] * (len(l) * 2)
    out_l[::2] = l
    out_l[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]

    return out_l

def cs3(l):
    return list(chain.from_iterable(
        zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)]))
    )

def chrisz(l):
    return [
        val 
        for pair in zip(l, [f'{k}_{j+1}' for j, k in enumerate(l)]) 
        for val in pair
    ]

def sruthiV(l):
    return [ 
        l[int(i / 2)] + "_" + str(int(i / 2) + 1) if i % 2 != 0 else l[int(i/2)] 
        for i in range(0,2*len(l))
    ]
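
If pandas and matplotlib are not at hand, a rough comparison can be done with timeit alone. The sketch below is not the benchmark used for the plot: it copies three of the functions above (with str.format so it also runs before Python 3.6), checks that they agree, and prints raw timings for one arbitrarily chosen input size. Absolute numbers will differ between machines.

```python
from itertools import chain
from timeit import timeit

def cs0(l):
    return [j for i, a in enumerate(l, 1) for j in [a, '{}_{}'.format(a, i)]]

def cs2(l):
    out_l = [None] * (len(l) * 2)
    out_l[::2] = l
    out_l[1::2] = ['{}_{}'.format(x, i) for i, x in enumerate(l, 1)]
    return out_l

def cs3(l):
    return list(chain.from_iterable(
        zip(l, ['{}_{}'.format(x, i) for i, x in enumerate(l, 1)])
    ))

data = ['a', 'b', 'c', 'd'] * 1000  # sample size chosen arbitrarily
candidates = [cs0, cs2, cs3]

# timings are only comparable if every candidate returns the same list
expected = cs0(data)
assert all(f(data) == expected for f in candidates)

for f in candidates:
    elapsed = timeit(lambda: f(data), number=50)
    print('{}: {:.4f} s'.format(f.__name__, elapsed))
```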

Software

System—Mac OS X High Sierra—2.4 GHz Intel Core i7
Python—3.6.0
IPython—6.2.1

#2


7  

You can use a list comprehension like so:

l=['a','b','c']
new_l = [i for b in [[a, '{}_{}'.format(a, i)] for i, a in enumerate(l, start=1)] for i in b]

Output:

['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

Alternatively, a shorter method:

[j for i, a in enumerate(l, 1) for j in [a, '{}_{}'.format(a, i)]]

#3


4  

You could use zip:

[val for pair in zip(l, [f'{k}_{j+1}' for j, k in enumerate(l)]) for val in pair]

Output:

['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

#4


2  

Here's my simple implementation:

l=['a','b','c']
# generate new list with the indices of the original list
new_list=l + ['{0}_{1}'.format(i, (l.index(i) + 1)) for i in l]
# sort the new list in ascending order
new_list.sort()
print(new_list)
# Should display ['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
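
One caveat worth checking before relying on this approach: it depends on sorting and on list.index, so it only reproduces the interleaved order when the input is already sorted and has no duplicates. The sketch below wraps the same logic in a hypothetical helper, sort_interleave, to show both failure cases.

```python
def sort_interleave(l):
    # same idea as the snippet above, wrapped in a hypothetical helper
    new_list = l + ['{0}_{1}'.format(x, l.index(x) + 1) for x in l]
    new_list.sort()
    return new_list

print(sort_interleave(['a', 'b', 'c']))  # ['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

# unsorted input: the original order is lost
print(sort_interleave(['b', 'a']))       # ['a', 'a_2', 'b', 'b_1']

# duplicates: index() always finds the first 'a', so both suffixes are _1
print(sort_interleave(['a', 'a']))       # ['a', 'a', 'a_1', 'a_1']
```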

#5


0  

(Edited)

Using a list comprehension:

[l[i // 2] + "_" + str(i // 2 + 1) if i % 2 else l[i // 2] for i in range(2 * len(l))]

# l=['b', 'a', 'd', 'c']
# output : ['b', 'b_1', 'a', 'a_2', 'd', 'd_3', 'c', 'c_4']
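
The same index arithmetic extends to N separate lists. The sketch below is an illustration rather than part of the original answer, and the helper name interleave_by_index is hypothetical. It assumes equal-length lists: output position i takes item i // N from list number i % N.

```python
def interleave_by_index(lists):
    # hypothetical helper; assumes all lists have the same length
    n = len(lists)
    m = len(lists[0])
    # position i holds item i // n of list number i % n
    return [lists[i % n][i // n] for i in range(n * m)]

l = ['b', 'a', 'd', 'c']
suffixed = ['b_1', 'a_2', 'd_3', 'c_4']
result = interleave_by_index([l, suffixed])
print(result)  # ['b', 'b_1', 'a', 'a_2', 'd', 'd_3', 'c', 'c_4']
```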

#6


0  

If you wanted to return [["a","a_1"],["b","b_2"],["c","c_3"]] you could write:

new_l=[[x,"{}_{}".format(x,i+1)] for i,x in enumerate(l)]

This isn't what you want; you want ["a","a_1"]+["b","b_2"]+["c","c_3"] instead. This can be built from the result of the operation above using sum(); since you're summing lists, you need to pass the empty list as the start value to avoid an error. That gives:

new_l=sum(([x,"{}_{}".format(x,i+1)] for i,x in enumerate(l)),[])

I don't know how this compares speed-wise (probably not well), but I find it easier to understand what's going on than the other list-comprehension based answers.
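
On the speed question: sum() with a list start value creates a new list at every addition, so the total work grows roughly quadratically with the number of pairs, whereas itertools.chain.from_iterable flattens in a single pass. A minimal sketch showing that the two give the same result (the variable names are illustrative):

```python
from itertools import chain

l = ['a', 'b', 'c']
pairs = [[x, '{}_{}'.format(x, i)] for i, x in enumerate(l, 1)]

via_sum = sum(pairs, [])                      # new list built at every addition
via_chain = list(chain.from_iterable(pairs))  # single pass over the pairs

print(via_sum)  # ['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
```

For a handful of elements the difference is negligible; it only starts to matter for long inputs.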

#7


0  

A very simple solution:

out_l=[]
for i,x in enumerate(l,1):
    out_l.extend([x,f"{x}_{i}"])
