NumPy MemoryError when generating all possible combinations

Time: 2021-12-30 00:57:10

I have been using _list = np.unique(np.stack(np.meshgrid(*_load), -1).reshape(-1, len(_load)), axis=0) to generate a list of all possible combinations. This worked fine on a list of lists that looks like this:

[[1, 2, 3], [8, 4, 5, 6, 7], [8, 4, 5, 6, 7], [8, 4, 5, 6, 7], [8, 4, 5, 6, 7], [9, 10, 11, 12, 13, 14], [9, 10, 11, 12, 13, 14], [9, 10, 11, 12, 13, 14], [9, 10, 11, 12, 13, 14], [9, 10, 11, 12, 13, 14]]

However, if I want to find all possible combinations of something like

[[3, 4, 69, 134, 39, 42, 46, 15, 99, 20, 120, 123, 93], [130, 5, 7, 139, 14, 143, 33, 48, 50, 51, 52, 53, 55, 58, 60, 62, 67, 84, 85, 87, 91, 105, 106, 107, 111, 121, 127], [130, 5, 7, 139, 14, 143, 33, 48, 50, 51, 52, 53, 55, 58, 60, 62, 67, 84, 85, 87, 91, 105, 106, 107, 111, 121, 127], [1, 132, 133, 135, 138, 11, 12, 142, 16, 147, 24, 25, 27, 28, 29, 30, 31, 35, 36, 40, 47, 54, 57, 63, 66, 68, 70, 71, 72, 140, 76, 81, 83, 88, 90, 92, 144, 98, 100, 103, 109, 110, 112, 114, 118, 122], [1, 132, 133, 135, 138, 11, 12, 142, 16, 147, 24, 25, 27, 28, 29, 30, 31, 35, 36, 40, 47, 54, 57, 63, 66, 68, 70, 71, 72, 140, 76, 81, 83, 88, 90, 92, 144, 98, 100, 103, 109, 110, 112, 114, 118, 122], [128, 129, 2, 131, 6, 8, 9, 10, 13, 141, 17, 18, 19, 21, 22, 23, 26, 32, 34, 37, 38, 41, 43, 44, 45, 49, 137, 56, 59, 61, 64, 65, 73, 74, 75, 77, 78, 79, 80, 82, 86, 89, 94, 95, 96, 97, 101, 102, 145, 104, 108, 146, 113, 115, 116, 117, 119, 136, 124, 125, 126], [128, 129, 2, 131, 6, 8, 9, 10, 13, 141, 17, 18, 19, 21, 22, 23, 26, 32, 34, 37, 38, 41, 43, 44, 45, 49, 137, 56, 59, 61, 64, 65, 73, 74, 75, 77, 78, 79, 80, 82, 86, 89, 94, 95, 96, 97, 101, 102, 145, 104, 108, 146, 113, 115, 116, 117, 119, 136, 124, 125, 126], [128, 129, 2, 131, 6, 8, 9, 10, 13, 141, 17, 18, 19, 21, 22, 23, 26, 32, 34, 37, 38, 41, 43, 44, 45, 49, 137, 56, 59, 61, 64, 65, 73, 74, 75, 77, 78, 79, 80, 82, 86, 89, 94, 95, 96, 97, 101, 102, 145, 104, 108, 146, 113, 115, 116, 117, 119, 136, 124, 125, 126], [128, 129, 2, 131, 6, 8, 9, 10, 13, 141, 17, 18, 19, 21, 22, 23, 26, 32, 34, 37, 38, 41, 43, 44, 45, 49, 137, 56, 59, 61, 64, 65, 73, 74, 75, 77, 78, 79, 80, 82, 86, 89, 94, 95, 96, 97, 101, 102, 145, 104, 108, 146, 113, 115, 116, 117, 119, 136, 124, 125, 126]]

I get a MemoryError in Python. Obviously I need to change my approach; any ideas? I was thinking I would end up needing to write the intermediate results to a file, but I don't know how to get these built-ins to do that.

1 solution

#1


Your algorithm is fine, but your result is unrepresentable.

Note in particular that your np.unique is useless: each _load[i] contains no duplicates, so every row of the result is already distinct. The number of integers in the result is therefore the product of the lengths of the lists times the number of lists:

>>> np.prod([len(i) for i in second_example], dtype=np.int64) * 9
2498897217529908

Assuming optimistically that each integer is a uint8, that's 2.2 PiB (Pebibytes), which far exceeds current RAM configurations.
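
The arithmetic behind that figure can be reproduced with a short sketch. The helper name estimate_result_size and the stand-in ranges are illustrative assumptions, not part of the original post; only the list lengths (13, 27, 27, 46, 46, 61, 61, 61, 61) are taken from the second example.

```python
import math

def estimate_result_size(lists, bytes_per_item=1):
    """Return (rows, items, bytes) for the full Cartesian product of `lists`."""
    rows = math.prod(len(lst) for lst in lists)   # one row per combination
    items = rows * len(lists)                     # each row holds one entry per list
    return rows, items, items * bytes_per_item

# Stand-in inputs with the same lengths as the second example (hypothetical data)
lengths = [13, 27, 27, 46, 46, 61, 61, 61, 61]
stand_in = [range(n) for n in lengths]

rows, items, n_bytes = estimate_result_size(stand_in)  # 1 byte per item, i.e. uint8
print(items)                      # 2498897217529908
print(round(n_bytes / 2**50, 1))  # 2.2 (PiB)
```

With the default 64-bit integers that NumPy typically uses on most platforms, the same estimate would be eight times larger.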

Even if you don't try to put the whole result in memory at once, merely iterating over it is going to take a long time: assuming a generous 4 GHz processor and a single clock cycle per output integer, you're looking at longer than a week (about 7.2 days) to finish.
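
If only part of the product is ever needed, the combinations can be produced lazily rather than materialised. The following is a minimal sketch, assuming the standard-library itertools.product as a stand-in for the meshgrid expression; iter_batches, batch_size and the small input are hypothetical names and data introduced here. It keeps memory bounded but does not shorten the full run, as the last lines estimate.

```python
import itertools
import numpy as np

def iter_batches(lists, batch_size=100_000, dtype=np.uint8):
    """Yield the Cartesian product of `lists` as arrays of at most `batch_size` rows."""
    combos = itertools.product(*lists)           # lazy: nothing is materialised yet
    while True:
        batch = list(itertools.islice(combos, batch_size))
        if not batch:
            return
        # uint8 is enough for values up to 255, as in the question's data
        yield np.array(batch, dtype=dtype)

small = [[1, 2, 3], [8, 4], [9, 10]]             # 3 * 2 * 2 = 12 combinations
batches = list(iter_batches(small, batch_size=5))
print([b.shape for b in batches])   # [(5, 3), (5, 3), (2, 3)]

# Rough time for the big case at 4 GHz, one clock cycle per output integer
seconds = 2498897217529908 / 4e9
print(round(seconds / 86400, 1))    # 7.2 (days)
```

Each yielded batch could be processed or written to disk before the next one is generated, but the total volume of output stays the same.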
