枚举具有N个元素的1d数组的所有k分区?

时间:2022-08-26 21:47:36

This seems like a simple request, but google is not my friend because "partition" scores a bunch of hits in database and filesystem space.

这似乎是一个简单的请求,但谷歌不是我的朋友,因为“分区”在数据库和文件系统空间中得分很多。

I need to enumerate all partitions of an array of N values (N is constant) into k sub-arrays. The sub-arrays are just that - a starting index and ending index. The overall order of the original array will be preserved.

我需要将N个值(N是常数)的数组的所有分区枚举成k个子数组。子数组就是 - 起始索引和结束索引。将保留原始数组的整体顺序。

For example, with N=4 and k=2:

例如,N = 4且k = 2:

[ | a b c d ] (0, 4)
[ a | b c d ] (1, 3)
[ a b | c d ] (2, 2)
[ a b c | d ] (3, 1)
[ a b c d | ] (4, 0)

And with k=3:

并且k = 3:

[ | | a b c d ] (0, 0, 4)
[ | a | b c d ] (0, 1, 3)
  :
[ a | b | c d ] (1, 1, 2)
[ a | b c | d ] (1, 2, 1)
  :
[ a b c d | | ] (4, 0, 0)

I'm pretty sure this isn't an original problem (and no, it's not homework), but I'd like to do it for every k <= N, and it'd be great if the later passes (as k grows) took advantage of earlier results.

我很确定这不是一个原始问题(不,它不是家庭作业),但我想为每个k <= N做这个,如果后来通过它会很好(因为k增长)利用了早期的结果。

If you've got a link, please share.

如果您有链接,请分享。

2 个解决方案

#1


7  

In order to re-use the prior results (for lesser values of k), you can do recursion.

为了重用先前的结果(对于较小的k值),可以进行递归。

Think of such partitioning as a list of ending indexes (starting index for any partition is just the ending index of the last partition or 0 for the first one).

将此类分区视为结束索引列表(任何分区的起始索引只是最后一个分区的结束索引,或者是第一个分区的结束索引)。

So, your set of partitionings are just a set of all arrays of k non-decreasing integers between 0 and N.

因此,您的分区集只是0到N之间的所有k个非递减整数数组的集合。

If k is bounded, you can do this via k nested loops

如果k是有界的,你可以通过k个嵌套循环来做到这一点

for (i[0]=0; i[0] < N; i[0]++) {
    for (i[1]=i[0]; i[1] < N; i[1]++) {
    ...
            for (i[10]=i[9]; i[10] < N; i[10]++) {
                push i[0]==>i[10] onto the list of partitionings.
            }
    ...
    }
}

If k is unbounded, you can do it recursively.

如果k是*的,你可以递归地进行。

A set of k partitions between indexes S and E is obtained by:

索引S和E之间的一组k个分区通过以下方式获得:

  • Looping the "end of first partition" EFP between S and E. For each value:

    在S和E之间循环“第一个分区的结尾”EFP。对于每个值:

    • Recursively find a list of k-1 partitions between EFP and S

      递归地找到EFP和S之间的k-1分区列表

    • For each vector in that list, pre-pend "EFP" to that vector.

      对于该列表中的每个向量,将“EFP”预先挂起到该向量。

    • resulting vector of length k is added to the list of results.

      将结果长度为k的矢量添加到结果列表中。

Please note that my answer produces lists of end-points of each slice. If you (as your example shows) want a list of LENGTHS of each slice, you need to obtain lengths by subtracting the last slice end from current slice end.

请注意,我的答案会生成每个切片的终点列表。如果您(如您的示例所示)想要每个切片的LENGTHS列表,则需要通过从当前切片结尾减去最后切片结束来获取长度。

#2


0  

Each partition can be described by the k-1 indexes separating the parts. Since order is preserved, these indices must be non-decreasing. That is, there is a direct correspondence between subsets of size k-1 and the partitions you seek.

每个分区可以通过分隔各部分的k-1索引来描述。由于保留了订单,因此这些指数必须不减少。也就是说,大小为k-1的子集与您寻找的分区之间存在直接对应关系。

For iterating over all subsets of size k-1, you can check out the question:

为了迭代大小为k-1的所有子集,您可以查看问题:

How to iteratively generate k elements subsets from a set of size n in java?

如何从java中的一组大小n迭代生成k个元素子集?

The only wrinkle is that if empty parts are allowed, several cut-points can coincide, but a subset can contain each index at most once. You'll have to adjust the algorithm slightly by replacing:

唯一的缺点是如果允许空白部分,几个切点可以重合,但是一个子集最多可以包含每个索引一次。你必须通过替换来稍微调整算法:

        processLargerSubsets(set, subset, subsetSize + 1, j + 1);

by

        processLargerSubsets(set, subset, subsetSize + 1, j);

#1


7  

In order to re-use the prior results (for lesser values of k), you can do recursion.

为了重用先前的结果(对于较小的k值),可以进行递归。

Think of such partitioning as a list of ending indexes (starting index for any partition is just the ending index of the last partition or 0 for the first one).

将此类分区视为结束索引列表(任何分区的起始索引只是最后一个分区的结束索引,或者是第一个分区的结束索引)。

So, your set of partitionings are just a set of all arrays of k non-decreasing integers between 0 and N.

因此,您的分区集只是0到N之间的所有k个非递减整数数组的集合。

If k is bounded, you can do this via k nested loops

如果k是有界的,你可以通过k个嵌套循环来做到这一点

for (i[0]=0; i[0] < N; i[0]++) {
    for (i[1]=i[0]; i[1] < N; i[1]++) {
    ...
            for (i[10]=i[9]; i[10] < N; i[10]++) {
                push i[0]==>i[10] onto the list of partitionings.
            }
    ...
    }
}

If k is unbounded, you can do it recursively.

如果k是*的,你可以递归地进行。

A set of k partitions between indexes S and E is obtained by:

索引S和E之间的一组k个分区通过以下方式获得:

  • Looping the "end of first partition" EFP between S and E. For each value:

    在S和E之间循环“第一个分区的结尾”EFP。对于每个值:

    • Recursively find a list of k-1 partitions between EFP and S

      递归地找到EFP和S之间的k-1分区列表

    • For each vector in that list, pre-pend "EFP" to that vector.

      对于该列表中的每个向量,将“EFP”预先挂起到该向量。

    • resulting vector of length k is added to the list of results.

      将结果长度为k的矢量添加到结果列表中。

Please note that my answer produces lists of end-points of each slice. If you (as your example shows) want a list of LENGTHS of each slice, you need to obtain lengths by subtracting the last slice end from current slice end.

请注意,我的答案会生成每个切片的终点列表。如果您(如您的示例所示)想要每个切片的LENGTHS列表,则需要通过从当前切片结尾减去最后切片结束来获取长度。

#2


0  

Each partition can be described by the k-1 indexes separating the parts. Since order is preserved, these indices must be non-decreasing. That is, there is a direct correspondence between subsets of size k-1 and the partitions you seek.

每个分区可以通过分隔各部分的k-1索引来描述。由于保留了订单,因此这些指数必须不减少。也就是说,大小为k-1的子集与您寻找的分区之间存在直接对应关系。

For iterating over all subsets of size k-1, you can check out the question:

为了迭代大小为k-1的所有子集,您可以查看问题:

How to iteratively generate k elements subsets from a set of size n in java?

如何从java中的一组大小n迭代生成k个元素子集?

The only wrinkle is that if empty parts are allowed, several cut-points can coincide, but a subset can contain each index at most once. You'll have to adjust the algorithm slightly by replacing:

唯一的缺点是如果允许空白部分,几个切点可以重合,但是一个子集最多可以包含每个索引一次。你必须通过替换来稍微调整算法:

        processLargerSubsets(set, subset, subsetSize + 1, j + 1);

by

        processLargerSubsets(set, subset, subsetSize + 1, j);