I'm trying to view a 32x32 pixel RGB image in CIFAR-10 format. It's a numpy array where pixel values (uint8) are arranged as follows: "The first 1024 bytes are the red channel values, the next 1024 the green, and the final 1024 the blue. The values are stored in row-major order, so the first 32 bytes are the red channel values of the first row of the image."
Thus, the original image shape is:
numpy.shape(image)
(3072L,)
I reshape it like this:
im = numpy.reshape(image, (32,32,3))
However, when I try
imshow(im)
in the IPython console, I see 3 by 3 tiles of the original image:
I expected to see a single image of a car instead. I saw this question here, but I'm not sure what they are doing there or whether it's relevant to my situation.
2 Solutions
#1
8
Try changing the order. By default, it is C-contiguous (which is in fact row-major), but for matplotlib, you'll want the red channel values in [:,:,0]. That means you should read that data in Fortran order so that it first fills the "columns" (in this 3D context).
im = numpy.reshape(c, (32,32,3), order='F')
#2
12
I know it's been a while since the question was posted, but I want to correct Oliver's answer. If you reshape in Fortran order, the image comes out inverted (mirrored) and rotated by 90 degrees CCW; in other words, rows and columns are swapped.
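A quick way to check this claim is to build a synthetic image, flatten it the way the CIFAR-10 description says, and compare. This is only a sketch: `truth`, `flat`, and `im_f` are made-up names, and the random array stands in for a real CIFAR-10 row.

```python
import numpy as np

# Synthetic 32x32 RGB image standing in for one CIFAR-10 record (assumption).
truth = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Flatten channel-first, each channel row-major: R plane, then G, then B.
flat = truth.transpose(2, 0, 1).reshape(-1)

# The Fortran-order reshape from the first answer.
im_f = np.reshape(flat, (32, 32, 3), order='F')

# Each channel comes out with rows and columns swapped ...
transposed = np.array_equal(im_f, truth.transpose(1, 0, 2))
# ... which equals mirroring left-right, then rotating 90 degrees CCW.
mirrored_rotated = np.array_equal(im_f, np.rot90(np.fliplr(truth)))
print(transposed, mirrored_rotated)
```

So the colors are right with `order='F'`, but the picture itself is transposed.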
You can still train on this data of course if you format all of your images this way. But to prevent you from going insane, you should do the following:
im = c.reshape(3,32,32).transpose(1,2,0)
What you are doing is first reshaping the flat array in the default (C, row-major) order, which puts the RGB channel in the first dimension and the rows and columns in the other two. Then you are permuting the dimensions so that the first dimension of the reshaped array (RGB, axis 0) becomes the third, while the rows and columns (axes 1 and 2) each shift forward by one position to become axes 0 and 1.
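The index bookkeeping can be verified on a dummy vector. This is a minimal sketch, assuming a flat 3072-element uint8 array laid out as described in the question; the names `flat` and `planes` are just for illustration.

```python
import numpy as np

# Hypothetical flat CIFAR-10-style vector; the value at index n is n % 256,
# which makes positions easy to trace.
flat = (np.arange(3072) % 256).astype(np.uint8)

planes = flat.reshape(3, 32, 32)   # (channel, row, column) in C order
im = planes.transpose(1, 2, 0)     # (row, column, channel)

# Channel k of pixel (row r, column c) sits at flat[1024*k + 32*r + c].
r, c, k = 5, 7, 2
assert im.shape == (32, 32, 3)
assert im[r, c, k] == flat[1024 * k + 32 * r + c]

# The first 32 bytes are the red values of the first image row.
assert np.array_equal(im[0, :, 0], flat[:32])
```

With a real record in place of `flat`, passing `im` to `imshow` should then show a single image rather than tiles.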
Hope that this helped.