
Date: 2022-06-29 07:17:28

I want to train a simple convolutional auto-encoder using Theano, which has been working great. However, I don't see how one can reverse the conv2d command when subsampling (stride) is used. Is there an efficient way to "invert" the convolution command when stride is used, like in the image below?



For example, I want to change the following ...


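import theano.tensor as T
from theano.tensor.nnet.conv import conv2d

x = T.tensor4('x')
# W and Wprime are the encoder and decoder filter banks, assumed to be defined elsewhere
y = T.tanh(conv2d(x, W, border_mode='valid', subsample=(1, 1)))
z = conv2d(y, Wprime, border_mode='full', subsample=(1, 1))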

... into the situation where subsample = (2,2). The first layer will work just as expected. However, the second layer will effectively "do a convolution with stride 1, then throw away half of the outputs". This is clearly a different operation from the one I'm looking for: z won't even have the same number of neurons as x. What should the second conv2d command be to "reconstruct" the original x?

1 Solution



I deduce from this that you intend to have tied weights, i.e. if the first operation were a matrix multiplication with W, then the output would be generated with W.T, the adjoint matrix. In your case you would thus be looking for the adjoint of the convolution operator followed by subsampling.


(EDIT: I deduced wrongly, you can use any filter whatsoever to 'deconvolve', as long as you get the shapes right. Talking about the adjoint is still informative, though. You will be able to relax the assumption afterwards.)


Since the convolution and subsampling operators are linear operators, let's denote them by C and S respectively and observe that convolution + subsampling of an image x would be


S C x

and that the adjoint operation on y (which lives in the same space as S C x) would be


C.T S.T y

Now, S.T is nothing other than upsampling to the original image size by adding zeros around all entries of y until the right size is obtained.
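The claim that S.T is zero-filling upsampling can be checked numerically. The following is a minimal NumPy sketch rather than Theano code, and the helper names `subsample` and `subsample_adjoint` are made up for illustration. It tests the defining property of an adjoint, namely that the inner product of S x with y equals the inner product of x with S.T y.

```python
import numpy as np

def subsample(x, stride=2):
    """S: keep every `stride`-th row and column of a 2-D array."""
    return x[::stride, ::stride]

def subsample_adjoint(y, out_shape, stride=2):
    """S.T: scatter the entries of y onto a zero grid of the original shape."""
    up = np.zeros(out_shape, dtype=y.dtype)
    up[::stride, ::stride] = y
    return up

rng = np.random.RandomState(0)
x = rng.randn(5, 6)                   # odd number of rows on purpose
y = rng.randn(*subsample(x).shape)    # lives in the same space as S x

lhs = np.sum(subsample(x) * y)                    # <S x, y>
rhs = np.sum(x * subsample_adjoint(y, x.shape))   # <x, S.T y>
print(abs(lhs - rhs))                 # should be zero up to rounding
```

Note that the target shape is passed in explicitly: with 5 input rows, S.T y has 5 rows, whereas upsampling by a plain factor of 2 would give 6, with an extra trailing row of zeros.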


From your post, you seem to be aware of the adjoint of the convolution operator of stride (1, 1) - it is the convolution with reversed filters and reversed border_mode, i.e. with filters.dimshuffle(1, 0, 2, 3)[:, :, ::-1, ::-1] and switch from border_mode='valid' to border_mode='full'.
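The stride-1 statement can be illustrated in one dimension with a single channel. This is a sketch using `np.convolve` under that simplifying assumption; in the single-channel case there are no input and output channel axes to swap, so the `dimshuffle(1, 0, 2, 3)` step has no counterpart here and only the filter reversal and the switch from 'valid' to 'full' remain.

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(10)    # input signal
w = rng.randn(4)     # filter

Cx = np.convolve(x, w, mode='valid')         # C x, length 10 - 4 + 1 = 7
y = rng.randn(Cx.size)                       # lives in the same space as C x

CTy = np.convolve(y, w[::-1], mode='full')   # C.T y, length 7 + 4 - 1 = 10

# Adjoint test: <C x, y> should equal <x, C.T y> up to rounding
print(abs(np.dot(Cx, y) - np.dot(x, CTy)))
```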


Concatenate upsampling and this reverse filter convolution and you obtain the adjoint you seek.


Note: There may be ways of exploiting the gradient T.grad or T.jacobian to obtain this automatically, but I am never sure how this is done exactly.
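As a sketch of why the gradient route should work: for any linear map, the gradient of the scalar inner product with a fixed y, taken with respect to x, is the adjoint applied to y. Applied to the operators above:

```latex
\langle S C x,\, y \rangle = \langle x,\, C^{\top} S^{\top} y \rangle
\quad\Longrightarrow\quad
\nabla_x \, \langle S C x,\, y \rangle = C^{\top} S^{\top} y
```

In Theano this would presumably amount to something like T.grad(T.sum(conv_out * y), x), where conv_out is the symbolic strided convolution of x and y is a symbolic tensor of the same shape as conv_out. The names conv_out, x and y are placeholders here, and this expression is an untested assumption.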


EDIT: There, I wrote it down :)


import theano
import theano.tensor as T
import numpy as np

filters = theano.shared(np.random.randn(4, 3, 6, 5).astype('float32'))

inp1 = T.tensor4(dtype='float32')

subsampled_convolution = T.nnet.conv2d(inp1, filters, border_mode='valid', subsample=(2, 2))

inp2 = T.tensor4(dtype='float32')
shp = inp2.shape
upsample = T.zeros((shp[0], shp[1], shp[2] * 2, shp[3] * 2), dtype=inp2.dtype)
upsample = T.set_subtensor(upsample[:, :, ::2, ::2], inp2)
upsampled_convolution = T.nnet.conv2d(upsample,
     filters.dimshuffle(1, 0, 2, 3)[:, :, ::-1, ::-1], border_mode='full')

f1 = theano.function([inp1], subsampled_convolution)
f2 = theano.function([inp2], upsampled_convolution)

x = np.random.randn(1, 3, 10, 10).astype(np.float32)
f1x = f1(x)
y = np.random.randn(*f1x.shape).astype(np.float32)
f2y = f2(y)

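# If f2 is the adjoint of f1, then <f1(x), y> must equal <x, f2(y)>
p1 = np.dot(f1x.ravel(), y.ravel())
# f2y has one extra trailing row from the factor-2 upsampling; crop it to x's shape
p2 = np.dot(x.ravel(), f2y[:, :, :-1].ravel())

print(p1 - p2)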

p1 being equal to p2 (up to floating-point error) corroborates that f2 is the adjoint of f1.



