How to get the output shape of a deconvolution layer using tf.nn.conv2d_transpose in TensorFlow

According to this paper, the output shape is N + H - 1, where N is the input height or width and H is the kernel height or width. This is obviously the inverse process of convolution. This tutorial gives a formula for the output shape of a convolution: (W−F+2P)/S+1, where W is the input size, F the filter size, P the padding size and S the stride. But in TensorFlow, there are test cases like:

  strides = [1, 2, 2, 1]

  # Input, output: [batch, height, width, depth]
  x_shape = [2, 6, 4, 3]
  y_shape = [2, 12, 8, 2]

  # Filter: [kernel_height, kernel_width, output_depth, input_depth]
  f_shape = [3, 3, 2, 3]

So we use y_shape, f_shape and x_shape with the formula (W−F+2P)/S+1 to calculate the padding size P. From (12 - 3 + 2P) / 2 + 1 = 6, we get P = 0.5, which is not an integer. How does deconvolution work in TensorFlow?
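
To make the arithmetic concrete, here is a minimal sketch that solves the tutorial's formula for P using the shapes from the test case above. The helper name `symmetric_padding` is made up for this illustration and is not part of TensorFlow.

```python
from fractions import Fraction


def symmetric_padding(w, f, s, out):
    """Solve (w - f + 2p) / s + 1 == out for p (hypothetical helper)."""
    # Rearranged: p = ((out - 1) * s - w + f) / 2
    return Fraction((out - 1) * s - w + f, 2)


# Height: W = 12 (from y_shape), F = 3, S = 2, expected output = 6 (from x_shape)
p_height = symmetric_padding(12, 3, 2, 6)
# Width: W = 8 (from y_shape), F = 3, S = 2, expected output = 4 (from x_shape)
p_width = symmetric_padding(8, 3, 2, 4)

print(p_height, p_width)  # 1/2 1/2
```

Both dimensions give P = 1/2, so no symmetric integer padding satisfies the formula for these shapes.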

3 Answers

#1

The formula for the output size from the tutorial assumes that the padding P is the same before and after the image (left & right or top & bottom). Then, the number of places in which you put the kernel is: W (size of the image) - F (size of the kernel) + P (additional padding before) + P (additional padding after).

But TensorFlow also handles the situation where you need to pad more pixels on one side than on the other so that the kernels fit correctly. You can read more about the padding strategies ("SAME" and "VALID") in the docs. The test you're talking about uses the "SAME" method.

#2

This discussion is really helpful. I'd just like to add some information: with padding='SAME', the bottom and right sides can get one additional padded pixel. According to the TensorFlow documentation, the test case below

strides = [1, 2, 2, 1]
# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 12, 8, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]

is using padding='SAME'. We can interpret padding='SAME' as:

(W−F+pad_along_height)/S+1 = out_height,
(W−F+pad_along_width)/S+1 = out_width.

So (12 - 3 + pad_along_height) / 2 + 1 = 6, which gives pad_along_height = 1. Then pad_top = pad_along_height / 2 = 1 / 2 = 0 (integer division), and pad_bottom = pad_along_height - pad_top = 1.
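
The same calculation can be sketched in plain Python. This follows the 'SAME' padding rule as I understand it from the TensorFlow documentation; the helper name `same_padding` is hypothetical and only mirrors the arithmetic, it does not call TensorFlow.

```python
import math


def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) for padding='SAME'."""
    out_size = math.ceil(in_size / stride)
    # Total padding needed so that out_size filter positions fit.
    pad_total = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = pad_total // 2         # top or left (integer division)
    pad_after = pad_total - pad_before  # bottom or right gets the extra pixel
    return out_size, pad_before, pad_after


height = same_padding(12, 3, 2)
width = same_padding(8, 3, 2)

print(height, width)  # (6, 0, 1) (4, 0, 1)
```

For both the 12-pixel height and the 8-pixel width, the single padded pixel ends up on the bottom or right side.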

As for padding='VALID', as the name suggests, no padding is added: the filter is applied only at positions where it fits entirely inside the original input, so no values outside the input region are needed. For example, in the test case below,

strides = [1, 2, 2, 1]

# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 13, 9, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]

The output shape of conv2d is

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
           = ceil(float(13 - 3 + 1) / float(2)) = ceil(11/2) = 6
           = (W−F)/S + 1.

Because (W−F)/S+1 = (13-3)/2+1 = 6 is an integer, we don't need to add zero pixels around the border of the image, and pad_top and pad_left from the padding section of the TensorFlow documentation are both 0.
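
As a quick check of the 'VALID' case, the sketch below evaluates the conv2d output-size expression quoted above for both spatial dimensions of the 13×9 tensor. The function name `valid_output_size` is an assumption made for this example.

```python
import math


def valid_output_size(in_size, filter_size, stride):
    """Output size of a convolution with padding='VALID' (no padding)."""
    return math.ceil((in_size - filter_size + 1) / stride)


out_height = valid_output_size(13, 3, 2)
out_width = valid_output_size(9, 3, 2)

print(out_height, out_width)  # 6 4
```

The result matches the height and width in x_shape = [2, 6, 4, 3].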

#3

For deconvolution,

output_size = strides * (input_size-1) + kernel_size - 2*padding

Here strides, input_size, kernel_size and padding are integers, and padding is zero for 'VALID'.
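
A small sketch of this formula, applied to the shapes from the test cases in this thread. The function name `deconv_output_size` is hypothetical; it only evaluates the expression and is not a TensorFlow API.

```python
def deconv_output_size(input_size, kernel_size, stride, padding=0):
    """Evaluate stride * (input_size - 1) + kernel_size - 2 * padding."""
    return stride * (input_size - 1) + kernel_size - 2 * padding


# x_shape = [2, 6, 4, 3], f_shape = [3, 3, 2, 3], strides = [1, 2, 2, 1]
out_height = deconv_output_size(6, 3, 2)
out_width = deconv_output_size(4, 3, 2)

print(out_height, out_width)  # 13 9
```

With padding = 0 this reproduces y_shape = [2, 13, 9, 2] from the 'VALID' test case. The 12×8 output of the 'SAME' test case would correspond to padding = 0.5 in this formula, which is the non-integer value found in the question, so the integer-padding form does not cover that case directly.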
