How much memory does a numpy array take? Is RAM a limiting factor?

Time: 2022-05-31 21:22:20

I'm using numpy to create a cube array with sides of length 100, thus containing 1 million entries total. For each of the million entries, I am inserting a 100x100 matrix whose entries are comprised of randomly generated numbers. I am using the following code to do so:

import random
from numpy import *

cube = arange(1000000).reshape(100,100,100)

for element in cube.flat:
    matrix = arange(10000).reshape(100,100)
    for entry in matrix.flat:
        entry = random.random()*100
    element = matrix

I was expecting this to take a while, but with 10 billion random numbers being generated, I'm not sure my computer can even handle it. How much memory would such an array take up? Would RAM be a limiting factor, i.e. if my computer doesn't have enough RAM, could it fail to actually generate the array?

Also, if there is a more efficient way to implement this code, I would appreciate tips :)

2 answers

#1


21  

A couple points:

  • The size in memory of a numpy array is easy to calculate. It's simply the number of elements times the size of each element, plus a small constant overhead. For example, if your cube.dtype is int64 and it has 1,000,000 elements, it will require 1,000,000 * 64 / 8 = 8,000,000 bytes (8 MB).
  • However, as @Gabe notes, 100 * 100 * 1,000,000 doubles at 8 bytes each will require about 80 GB.
  • This will not cause anything to "break" per se, but operations will be ridiculously slow because of all the swapping your computer will need to do.
  • Your loops will not do what you expect. Instead of replacing the element in cube, element = matrix simply rebinds the element variable, leaving the cube unchanged. The same goes for entry = random.random() * 100.
  • Instead, see: http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#modifying-array-values
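
To make the two main points above concrete, here is a minimal sketch rather than anything from the original answer. It estimates an array's size as element count times item size and checks that against `ndarray.nbytes` on a small array, then shows that rebinding a loop variable leaves the array unchanged while indexed assignment writes into it. The helper name `estimated_nbytes` is hypothetical, introduced only for illustration.

```python
import numpy as np

def estimated_nbytes(shape, dtype):
    """Element count times bytes per element (ignores the small fixed overhead)."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize

# The 100x100x100 int64 cube from the question.
cube_bytes = estimated_nbytes((100, 100, 100), np.int64)

# A 100x100 float64 matrix for each of the 1,000,000 cube entries.
full_bytes = estimated_nbytes((100, 100, 100, 100, 100), np.float64)

# Cross-check the formula against a real array that is small enough to allocate.
small = np.zeros((100, 100, 100), dtype=np.int64)
assert small.nbytes == cube_bytes

# Rebinding the loop variable only changes the local name, not the array.
a = np.zeros(5)
for entry in a.flat:
    entry = 1.0
unchanged = a.copy()

# Assigning through an index writes into the array itself.
b = np.zeros(5)
for i in range(b.size):
    b.flat[i] = 1.0

print(cube_bytes, full_bytes, unchanged.sum(), b.sum())
```

The size estimate needs no allocation at all, so it can be run before deciding whether the full array is feasible on a given machine.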

#2


2  

For the "inner" part of your loop, look at the numpy.random module:

import numpy as np
matrix = np.random.random((100,100))*100
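
The same idea can probably be extended to the whole structure: instead of nesting a matrix inside each cube entry, allocate one 5-dimensional array and fill it in a single call. The sketch below is only an illustration under stated assumptions. It uses a reduced cube side `n = 4` (a hypothetical value, since the question's `n = 100` would need about 80 GB) and the `numpy.random.default_rng` generator rather than the `np.random.random` call shown above.

```python
import numpy as np

n, m = 4, 100  # assumed reduced cube side; the question uses n = 100
rng = np.random.default_rng(0)

# One call fills the whole (n, n, n, m, m) array with floats in [0, 100).
data = rng.random((n, n, n, m, m)) * 100

# data[i, j, k] is the 100x100 matrix stored at cube position (i, j, k).
one_matrix = data[0, 0, 0]

# If single precision is acceptable, float32 halves the memory footprint.
data32 = rng.random((n, n, n, m, m), dtype=np.float32) * 100

print(data.shape, data.nbytes, data32.nbytes)
```

Whether float32 is acceptable depends on how the numbers are used later. At the full size it would still need about 40 GB.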
