I am trying to code up an implementation of a variational autoencoder; however, I am facing some difficulties regarding the loss function:
def vae_loss(sigma, mu):
    def loss(y_true, y_pred):
        recon = K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)
        kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
        return recon + kl
    return loss
The binary crossentropy part works fine, but whenever I return only the divergence term kl for testing, I get the following error: ValueError: "Tried to convert 'x' to a tensor and failed. Error: None values not supported."
I would welcome any hints as to what I have done wrong. You will find my entire code below. Thank you for your time!
import numpy as np
from keras import Model
from keras.layers import Input, Dense, Lambda
import keras.backend as K
from keras.datasets import mnist
from matplotlib import pyplot as plt
class VAE(object):

    def __init__(self, n_latent, batch_size):
        self.encoder, self.encoder_input, self.mu, self.sigma = self.create_encoder(n_latent, batch_size)
        self.decoder, self.decoder_input, self.decoder_output = self.create_decoder(n_latent, batch_size)
        pipeline = self.decoder(self.encoder.outputs[0])

        def vae_loss(sigma, mu):
            def loss(y_true, y_pred):
                recon = K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)
                kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
                return recon + kl
            return loss

        self.VAE = Model(self.encoder_input, pipeline)
        self.VAE.compile(optimizer="adadelta", loss=vae_loss(self.sigma, self.mu))

    def create_encoder(self, n_latent, batch_size):
        input_layer = Input(shape=(784,))
        #net = Dense(512, activation="relu")(input_layer)
        mu = Dense(n_latent, activation="linear")(input_layer)
        print(mu)
        sigma = Dense(n_latent, activation="linear")(input_layer)

        def sample_z(args):
            mu, log_sigma = args
            eps = K.random_normal(shape=(K.shape(input_layer)[0], n_latent), mean=0., stddev=1.)
            K.print_tensor(K.shape(eps))
            return mu + K.exp(log_sigma / 2) * eps

        sample_z = Lambda(sample_z)([mu, sigma])
        model = Model(inputs=input_layer, outputs=[sample_z, mu, sigma])
        return model, input_layer, mu, sigma

    def create_decoder(self, n_latent, batch_size):
        input_layer = Input(shape=(n_latent,))
        #net = Dense(512, activation="relu")(input_layer)
        reconstruct = Dense(784, activation="linear")(input_layer)
        model = Model(inputs=input_layer, outputs=reconstruct)
        return model, input_layer, reconstruct
1 Solution
#1
I am going to assume the error appears when you are "testing"/debugging your training phase, during backpropagation (let me know if I am wrong).
If so, the problem is that you are asking Keras to optimize your whole network (model.VAE.fit(...)) while using a loss (kl) covering only the encoder part. The gradients for the decoder stay undefined (without a loss like recon covering it), causing the optimization error.
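For concreteness, here is the failing variant I assume you tested (a hypothetical sketch reusing the sigma and mu tensors from your question; vae_loss_kl_only is just an illustrative name). Since the inner function never touches y_pred, nothing connects the loss to the decoder's weights:

def vae_loss_kl_only(sigma, mu):
    def loss(y_true, y_pred):
        # y_pred is never referenced, so no gradient path reaches the decoder
        kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
        return kl
    return loss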
For your debugging purpose, the error would disappear if you compile and fit only the encoder with this amputated loss (kl), or if you come up with a dummy (differentiable) loss that also covers the decoder (e.g. K.sum(y_pred - y_pred, axis=-1) + kl).
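A minimal sketch of that dummy-loss workaround (again assuming your sigma and mu; vae_loss_debug is a hypothetical name). The K.sum(y_pred - y_pred, axis=-1) term is identically zero, but it keeps y_pred in the computation graph, so Keras can propagate (zero) gradients to the decoder and the error goes away:

def vae_loss_debug(sigma, mu):
    def loss(y_true, y_pred):
        # identically zero, but differentiable with respect to y_pred
        dummy_recon = K.sum(y_pred - y_pred, axis=-1)
        kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
        return dummy_recon + kl
    return loss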