Elegant Python code for Integer Partition

Time: 2022-08-29 21:23:44

I tried to write code to solve the standard Integer Partition problem (Wikipedia). The code I wrote was a mess. I need an elegant solution to solve the problem, because I want to improve my coding style. This is not a homework question.


9 Answers

#1 (32 votes)

While this answer is fine, I'd recommend skovorodkin's answer below:


>>> def partition(number):
...     answer = set()
...     answer.add((number, ))
...     for x in range(1, number):
...         for y in partition(number - x):
...             answer.add(tuple(sorted((x, ) + y)))
...     return answer
... 
>>> partition(4)
set([(1, 3), (2, 2), (1, 1, 2), (1, 1, 1, 1), (4,)])

If you want all permutations (i.e. both (1, 3) and (3, 1)), change answer.add(tuple(sorted((x, ) + y))) to answer.add((x, ) + y).
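
For reference, here is a rough sketch of that permutation-keeping variant (the name partition_ordered is mine, not from the original answer):

def partition_ordered(number):
    # Same recursion, but without sorting, so (1, 3) and (3, 1) stay distinct.
    answer = set()
    answer.add((number,))
    for x in range(1, number):
        for y in partition_ordered(number - x):
            answer.add((x,) + y)
    return answer

# partition_ordered(4) contains (4,), (1, 3), (3, 1), (2, 2),
# (1, 1, 2), (1, 2, 1), (2, 1, 1) and (1, 1, 1, 1).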

#2 (25 votes)

A smaller and faster function than Nolen's:

def partitions(n, I=1):
    yield (n,)
    for i in range(I, n//2 + 1):
        for p in partitions(n-i, i):
            yield (i,) + p
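
As a quick sanity check (my own illustration, not part of the original answer), the generator yields each partition exactly once, as a tuple with non-decreasing parts:

>>> list(partitions(4))
[(4,), (1, 3), (1, 1, 2), (1, 1, 1, 1), (2, 2)]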

Let's compare them:


In [10]: %timeit -n 10 r0 = nolen(20)
1.37 s ± 28.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [11]: %timeit -n 10 r1 = list(partitions(20))
979 µs ± 82.9 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [13]: sorted(map(sorted, r0)) == sorted(map(sorted, r1))
Out[14]: True

Looks like it's 1370 times faster for n = 20.


Anyway, it's still far from accel_asc:


def accel_asc(n):
    a = [0 for i in range(n + 1)]
    k = 1
    y = n - 1
    while k != 0:
        x = a[k - 1] + 1
        k -= 1
        while 2 * x <= y:
            a[k] = x
            y -= x
            k += 1
        l = k + 1
        while x <= y:
            a[k] = x
            a[l] = y
            yield a[:k + 2]
            x += 1
            y -= 1
        a[k] = x + y
        y = x + y - 1
        yield a[:k + 1]
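
For illustration (my own addition, not from the original answer), accel_asc yields each partition as a list with parts in ascending order:

>>> list(accel_asc(4))
[[1, 1, 1, 1], [1, 1, 2], [1, 3], [2, 2], [4]]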

partitions is not only slower than accel_asc, it also requires much more memory (though it is apparently much easier to remember):

In [18]: %timeit -n 5 r2 = list(accel_asc(50))
114 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)

In [19]: %timeit -n 5 r3 = list(partitions(50))
527 ms ± 8.86 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)

In [24]: sorted(map(sorted, r2)) == sorted(map(sorted, r3))
Out[24]: True

You can find other versions on ActiveState: Generator For Integer Partitions (Python Recipe).



I use Python 3.6.1 and IPython 6.0.0.


#3 (7 votes)

I've compared the solutions with perfplot (a little project of mine for such purposes) and found that Nolen's top-voted answer is also the slowest.

Both answers supplied by skovorodkin are much faster. (Note the log-scale.)


(perfplot benchmark plot: runtime vs. n for the four implementations, log-scale y-axis)


To generate the plot:

import perfplot
import collections


def nolen(number):
    answer = set()
    answer.add((number, ))
    for x in range(1, number):
        for y in nolen(number - x):
            answer.add(tuple(sorted((x, ) + y)))
    return answer


def skovorodkin(n):
    return set(skovorodkin_yield(n))


def skovorodkin_yield(n, I=1):
    yield (n,)
    for i in range(I, n//2 + 1):
        for p in skovorodkin_yield(n-i, i):
            yield (i,) + p


def accel_asc(n):
    return set(accel_asc_yield(n))


def accel_asc_yield(n):
    a = [0 for i in range(n + 1)]
    k = 1
    y = n - 1
    while k != 0:
        x = a[k - 1] + 1
        k -= 1
        while 2 * x <= y:
            a[k] = x
            y -= x
            k += 1
        l = k + 1
        while x <= y:
            a[k] = x
            a[l] = y
            yield tuple(a[:k + 2])
            x += 1
            y -= 1
        a[k] = x + y
        y = x + y - 1
        yield tuple(a[:k + 1])


def mct(n):
    partitions_of = []
    partitions_of.append([()])
    partitions_of.append([(1,)])
    for num in range(2, n+1):
        ptitions = set()
        for i in range(num):
            for partition in partitions_of[i]:
                ptitions.add(tuple(sorted((num - i, ) + partition)))
        partitions_of.append(list(ptitions))
    return partitions_of[n]


perfplot.show(
    setup=lambda n: n,
    kernels=[
        nolen,
        mct,
        skovorodkin,
        accel_asc,
        ],
    n_range=range(1, 17),
    logy=True,
    # https://*.com/a/7829388/353337
    equality_check=lambda a, b:
        collections.Counter(set(a)) == collections.Counter(set(b)),
    xlabel='n'
    )

#4 (3 votes)

Much quicker than the accepted answer, and not bad looking either. The accepted answer does a lot of the same work multiple times because it recomputes the partitions of smaller integers over and over. For example, when n=22 the difference is 12.7 seconds versus 0.0467 seconds.

def partitions_dp(n):
    partitions_of = []
    partitions_of.append([()])
    partitions_of.append([(1,)])
    for num in range(2, n+1):
        ptitions = set()
        for i in range(num):
            for partition in partitions_of[i]:
                ptitions.add(tuple(sorted((num - i, ) + partition)))
        partitions_of.append(list(ptitions))
    return partitions_of[n]

The code is essentially the same except we save the partitions of smaller integers so we don't have to calculate them again and again.

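The same saving can also be obtained by memoizing the recursive version directly; here is a rough sketch of that idea (mine, not from the original answer) using functools.lru_cache:

from functools import lru_cache

@lru_cache(maxsize=None)
def partition_memo(n):
    # Partitions of n as sorted tuples; results for smaller n are cached,
    # so each sub-problem is computed only once.
    answer = {(n,)}
    for x in range(1, n):
        for y in partition_memo(n - x):
            answer.add(tuple(sorted((x,) + y)))
    return frozenset(answer)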

#5 (3 votes)

In case anyone is interested: I needed to solve a similar problem, namely partitioning an integer n into d nonnegative parts where order matters (i.e. permutations are included). For this, there's a simple recursive solution (see here):

def partition(n, d, depth=0):
    if d == depth:
        return [[]]
    return [
        item + [i]
        for i in range(n+1)
        for item in partition(n-i, d, depth=depth+1)
        ]


# extend with n-sum(entries)
n = 5
d = 3
lst = [[n-sum(p)] + p for p in partition(n, d-1)]

print(lst)

Output:


[
    [5, 0, 0], [4, 1, 0], [3, 2, 0], [2, 3, 0], [1, 4, 0],
    [0, 5, 0], [4, 0, 1], [3, 1, 1], [2, 2, 1], [1, 3, 1],
    [0, 4, 1], [3, 0, 2], [2, 1, 2], [1, 2, 2], [0, 3, 2],
    [2, 0, 3], [1, 1, 3], [0, 2, 3], [1, 0, 4], [0, 1, 4],
    [0, 0, 5]
]

#6 (1 vote)

I don't know if my code is the most elegant, but I've had to solve this many times for research purposes. If you modify the sub_nums variable, you can restrict which numbers are used in the partition.

def make_partitions(number):
    out = []
    tmp = []
    sub_nums = range(1, number + 1)
    # Seed the work list with each allowed starting number.
    for num in sub_nums:
        if num <= number:
            tmp.append([num])
    # Grow each candidate until it sums to the target.
    for elm in tmp:
        sum_elm = sum(elm)
        if sum_elm == number:
            out.append(elm)
        else:
            for num in sub_nums:
                if sum_elm + num <= number:
                    L = [i for i in elm]
                    L.append(num)
                    tmp.append(L)
    return out
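
For example, here is a sketch of the same search with the allowed parts passed in explicitly (my own adaptation, not part of the original answer), which is what the sub_nums remark above is getting at:

def make_partitions_from(number, sub_nums):
    # Same breadth-first search as above, but only parts listed in sub_nums are used.
    out = []
    tmp = [[num] for num in sub_nums if num <= number]
    for elm in tmp:
        sum_elm = sum(elm)
        if sum_elm == number:
            out.append(elm)
        else:
            for num in sub_nums:
                if sum_elm + num <= number:
                    tmp.append(elm + [num])
    return out

# make_partitions_from(6, [2, 3]) -> [[3, 3], [2, 2, 2]]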

#7 (1 vote)

# -*- coding: utf-8 -*-
import timeit

ncache = 0
cache = {}


def partition(number):
    global cache, ncache
    answer = {(number,), }
    if number in cache:
        ncache += 1
        return cache[number]
    if number == 1:
        cache[number] = answer
        return answer
    for x in range(1, number):
        for y in partition(number - x):
            answer.add(tuple(sorted((x, ) + y)))
    cache[number] = answer
    return answer


print('To 5:')
for r in sorted(partition(5))[::-1]:
    print('\t' + ' + '.join(str(i) for i in r))

print(
    'Time: {}\nCache used:{}'.format(
        timeit.timeit(
            "print('To 30: {} possibilities'.format(len(partition(30))))",
            setup="from __main__ import partition",
            number=1
        ), ncache
    )
)

or https://gist.github.com/sxslex/dd15b13b28c40e695f1e227a200d1646


#8 (1 vote)

I'm a bit late to the game, but I can offer a contribution which might qualify as more elegant in a few senses:


def partitions(n, m = None):
  """Partition n with a maximum part size of m. Yield non-increasing
  lists in decreasing lexicographic order. The default for m is
  effectively n, so the second argument is not needed to create the
  generator unless you do want to limit part sizes.
  """
  if m is None or m >= n: yield [n]
  for f in range(n-1 if (m is None or m >= n) else m, 0, -1):
    for p in partitions(n-f, f): yield [f] + p

Only 3 lines of code. Yields the partitions in decreasing lexicographic order. Optionally allows imposing a maximum part size.
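
A quick illustration of both the default and the max-part-size behaviour (mine, not from the original answer):

>>> list(partitions(4))
[[4], [3, 1], [2, 2], [2, 1, 1], [1, 1, 1, 1]]
>>> list(partitions(6, 2))   # parts no larger than 2
[[2, 2, 2], [2, 2, 1, 1], [2, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]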

I also have a variation on the above for partitions with a given number of parts:


def sized_partitions(n, k, m = None):
  """Partition n into k parts with a max part of m.
  Yield non-increasing lists.  m not needed to create generator.
  """
  if k == 1:
    yield [n]
    return
  for f in range(n-k+1 if (m is None or m > n-k+1) else m, (n-1)//k, -1): 
    for p in sized_partitions(n-f, k-1, f): yield [f] + p
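
And a quick illustration of sized_partitions (again mine, not from the original answer):

>>> list(sized_partitions(8, 3))
[[6, 1, 1], [5, 2, 1], [4, 3, 1], [4, 2, 2], [3, 3, 2]]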

After composing the above, I ran across a solution I had created almost 5 years ago, but which I had forgotten about. Besides a maximum part size, this one offers the additional feature that you can impose a maximum length (as opposed to a specific length). FWIW:


def partitions(sum, max_val=100000, max_len=100000):
    """ generator of partitions of sum with limits on values and length """
    # Yields lists in decreasing lexicographical order. 
    # To get any length, omit 3rd arg.
    # To get all partitions, omit 2nd and 3rd args. 

    if sum <= max_val:       # Can start with a singleton.
        yield [sum]

    # Must have first*max_len >= sum; i.e. first >= sum/max_len.
    for first in range(min(sum-1, max_val), max(0, (sum-1)//max_len), -1):
        for p in partitions(sum-first, first, max_len-1):
            yield [first]+p
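
A short illustration of the value and length limits (my own addition, not from the original answer):

>>> list(partitions(5, 3))                 # parts no larger than 3
[[3, 2], [3, 1, 1], [2, 2, 1], [2, 1, 1, 1], [1, 1, 1, 1, 1]]
>>> list(partitions(5, max_len=2))         # at most 2 parts
[[5], [4, 1], [3, 2]]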

#9 (0 votes)

F(x, n) = ⋃_{i ≥ n} { {i} ∪ g : g ∈ F(x - i, i) }

Just implement this recursion. F(x, n) is the set of all multisets that sum to x and whose elements are all greater than or equal to n; the base case is F(0, n) = {∅}.
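
A minimal sketch of that recursion in Python (my own translation of the formula, not from the original answer), returning each partition as a non-decreasing tuple:

def F(x, n=1):
    # All partitions of x whose parts are all >= n, as sorted tuples.
    if x == 0:
        return {()}
    return {(i,) + g for i in range(n, x + 1) for g in F(x - i, i)}

# F(4) -> {(4,), (1, 3), (2, 2), (1, 1, 2), (1, 1, 1, 1)}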
