How to normalize a 4D numpy array?

Time: 2022-02-13 21:35:10

I have a three-dimensional numpy array of image data (CIFAR-10 dataset). The array has the shape shown below:

import numpy as np
a = np.random.rand(32, 32, 3)

Before I do any deep learning, I want to normalize the data to get better results. With a 1D array, I know we can do min-max normalization like this:

v = np.random.rand(6)
(v - v.min())/(v.max() - v.min())

Out[68]:
array([ 0.89502294,  0.        ,  1.        ,  0.65069468,  0.63657915,
        0.08932196])

However, when it comes to a 3D array, I am totally lost. Specifically, I have the following questions:

  1. Along which axis do we take the min and max?
  2. How do we implement this with the 3D array?

I appreciate your help!

EDIT: It turns out I need to work with a 4D NumPy array with shape (202, 32, 32, 3), so the first dimension is the index of the image, and the last 3 dimensions are the actual image. It would be great if someone could provide me with the code to normalize such a 4D array. Thanks!

EDIT 2: Thanks to @Eric's code below, I've figured it out:

x_min = x.min(axis=(1, 2), keepdims=True)
x_max = x.max(axis=(1, 2), keepdims=True)

x = (x - x_min)/(x_max-x_min)
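
For reference, here is a minimal sketch of how the choice of `axis` changes what gets normalized in the 4D case. The random array `x`, the helper `min_max`, and the names `per_image_channel`, `per_image`, and `per_channel` are illustrative assumptions of mine, not part of the code above; the (N, H, W, C) layout is taken from the shape given in the first EDIT.

```python
import numpy as np

# Stand-in for the real data: 202 RGB images of 32x32 pixels (assumed layout N, H, W, C).
rng = np.random.default_rng(0)
x = rng.random((202, 32, 32, 3))

def min_max(arr, axis):
    """Min-max scale arr to [0, 1] over the given axes (hypothetical helper)."""
    lo = arr.min(axis=axis, keepdims=True)
    hi = arr.max(axis=axis, keepdims=True)
    return (arr - lo) / (hi - lo)

# One min/max per image and per channel (what the snippet above does).
per_image_channel = min_max(x, axis=(1, 2))
# One min/max per image, shared across its three channels.
per_image = min_max(x, axis=(1, 2, 3))
# One min/max per channel, shared across the whole dataset.
per_channel = min_max(x, axis=(0, 1, 2))

print(per_image_channel.shape, per_image.shape, per_channel.shape)
```

Which variant is appropriate depends on the data and the model. Note that if an image or channel is constant, `hi - lo` is zero and the division produces nan, so that case would need separate handling.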

2 answers

#1


10  

Assuming you're working with image data of shape (W, H, 3), you should probably normalize over each channel (axis=2) separately, as mentioned in the other answer.

You can do this with:

# keepdims makes the result shape (1, 1, 3) instead of (3,). This doesn't matter here, but
# would matter if you wanted to normalize over a different axis.
v_min = v.min(axis=(0, 1), keepdims=True)
v_max = v.max(axis=(0, 1), keepdims=True)
(v - v_min)/(v_max - v_min)
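
To make the remark about `keepdims` concrete, here is a small sketch of a case where it does matter. The random array `v` and the per-row reduction are assumptions chosen purely for illustration (normalizing each image row separately is not something I am recommending).

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.random((32, 32, 3))  # assumed (rows, cols, channels) layout

# Per-channel: reducing over axes 0 and 1 leaves shape (3,), which
# broadcasts against (32, 32, 3) with or without keepdims.
flat_min = v.min(axis=(0, 1))
kept_min = v.min(axis=(0, 1), keepdims=True)
same = np.array_equal(v - flat_min, v - kept_min)

# Per-row: reducing over axes 1 and 2 leaves shape (32,). Without keepdims
# this lines up with the last axis (length 3) and fails to broadcast.
try:
    v - v.min(axis=(1, 2))
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# With keepdims the result has shape (32, 1, 1) and broadcasts as intended.
row_min = v.min(axis=(1, 2), keepdims=True)
row_max = v.max(axis=(1, 2), keepdims=True)
v_rows = (v - row_min) / (row_max - row_min)

print(flat_min.shape, kept_min.shape, same, broadcast_ok, v_rows.shape)
```

Passing `keepdims=True` everywhere keeps the code correct regardless of which axes are reduced, which is why it is a reasonable default.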

#2


2  

  1. Along which axis do we take the min and max?

To answer this we probably need more information about your data, but in general, when working with 3-channel images for example, we would normalize using the per-channel min and max. This means that we would perform the normalization 3 times, once per channel. Here's an example:

    import numpy

    img = numpy.random.randint(0, 100, size=(10, 10, 3))  # Generating some random numbers
    img = img.astype(numpy.float32)  # converting array of ints to floats
    img_a = img[:, :, 0]
    img_b = img[:, :, 1]
    img_c = img[:, :, 2]  # Extracting single channels from 3 channel image
    # The above code could also be replaced with cv2.split(img) << which will return 3 numpy arrays (using opencv)

    # normalizing per channel data:
    img_a = (img_a - numpy.min(img_a)) / (numpy.max(img_a) - numpy.min(img_a))
    img_b = (img_b - numpy.min(img_b)) / (numpy.max(img_b) - numpy.min(img_b))
    img_c = (img_c - numpy.min(img_c)) / (numpy.max(img_c) - numpy.min(img_c))

    # putting the 3 channels back together:
    img_norm = numpy.empty((10, 10, 3), dtype=numpy.float32)
    img_norm[:, :, 0] = img_a
    img_norm[:, :, 1] = img_b
    img_norm[:, :, 2] = img_c
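
The three per-channel steps above can also be written as a single reduction over the two spatial axes. The sketch below is one way to check that both give the same result; the names `img_loop` and `img_vec` and the seeded generator are my own additions for illustration.

```python
import numpy

rng = numpy.random.default_rng(0)
img = rng.integers(0, 100, size=(10, 10, 3)).astype(numpy.float32)

# Channel by channel, as in the example above.
img_loop = numpy.empty_like(img)
for c in range(img.shape[2]):
    channel = img[:, :, c]
    img_loop[:, :, c] = (channel - channel.min()) / (channel.max() - channel.min())

# The same computation in one step: reduce over the two spatial axes.
c_min = img.min(axis=(0, 1), keepdims=True)
c_max = img.max(axis=(0, 1), keepdims=True)
img_vec = (img - c_min) / (c_max - c_min)

print(numpy.allclose(img_loop, img_vec))
```

The vectorized form avoids splitting and reassembling the channels and works unchanged for any number of channels.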

Edit: It just occurred to me that once you have the one channel data (32x32 image for instance) you can simply use:

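from sklearn.preprocessing import normalize
# Note: by default normalize() rescales each row to unit L2 norm;
# it does not perform min-max scaling.
img_a_norm = normalize(img_a)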
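
One caveat worth checking: as far as I know, sklearn's `normalize` defaults to `norm='l2'` and `axis=1`, i.e. it rescales each row to unit Euclidean length rather than mapping values to [0, 1]. The sketch below reproduces that row-wise scaling with plain NumPy (it does not call sklearn) and compares it with min-max scaling; `img_a` here is a made-up 32x32 channel.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical single channel; values start at 1 so no row has zero norm.
img_a = rng.integers(1, 100, size=(32, 32)).astype(np.float32)

# Row-wise L2 scaling: every row ends up with Euclidean norm 1.
row_norms = np.linalg.norm(img_a, axis=1, keepdims=True)
img_a_l2 = img_a / row_norms

# Min-max scaling over the whole channel: values span exactly [0, 1].
img_a_minmax = (img_a - img_a.min()) / (img_a.max() - img_a.min())

print(img_a_l2.min(), img_a_l2.max())
print(img_a_minmax.min(), img_a_minmax.max())
```

The two results have different ranges, so if min-max scaling is what you want, the explicit formula is the safer choice.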

  2. How do we work with the 3D array?

Well, this is a rather broad question. If you need functions like array-wise min and max, I would use the NumPy versions. Indexing, for instance, uses one index or slice per axis, separated by commas, as you can see in my example above. Also, please refer to NumPy's documentation of ndarray @ https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html to learn more. They really have an amazing set of tools for n-dimensional arrays.
