
Time: 2022-01-24 01:41:40

I am trying to implement a variational autoencoder, but I am having trouble with the loss function:


def vae_loss(sigma, mu):
    def loss(y_true, y_pred):
        recon = K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)
        kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
        return recon + kl
    return loss

The binary crossentropy part works fine, but whenever I return only the divergence term kl for testing I get the following error: ValueError: "Tried to convert 'x' to a tensor and failed. Error: None values not supported.".
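As a sanity check on the formula itself, the KL term can be evaluated outside Keras. The following is a minimal NumPy sketch, assuming that `sigma` in the code above holds the log-variance (which is what `K.exp(log_sigma / 2)` in `sample_z` suggests). The helper name `kl_to_standard_normal` is hypothetical and not part of the original code.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, exp(log_var)) || N(0, 1)), summed over the last axis."""
    return 0.5 * np.sum(np.exp(log_var) + np.square(mu) - 1.0 - log_var, axis=-1)

# Two samples with two latent dimensions each (illustrative values).
mu = np.array([[0.0, 0.0], [1.0, -1.0]])
log_var = np.zeros((2, 2))

kl = kl_to_standard_normal(mu, log_var)
print(kl)  # the first sample matches the prior exactly, so its KL term is 0
```

If the numbers look reasonable here, the expression itself is probably not what triggers the error.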


I am looking forward to possible hints as to what I have done wrong. You will find my entire code below. Thank you for your time!


import numpy as np
from keras import Model
from keras.layers import Input, Dense, Lambda
import keras.backend as K
from keras.datasets import mnist
from matplotlib import pyplot as plt

class VAE(object):

    def __init__(self, n_latent, batch_size):

        self.encoder, self.encoder_input, self.mu, self.sigma = self.create_encoder(n_latent, batch_size)
        self.decoder, self.decoder_input, self.decoder_output = self.create_decoder(n_latent, batch_size)
        pipeline = self.decoder(self.encoder.outputs[0])

        def vae_loss(sigma, mu):
            def loss(y_true, y_pred):
                recon = K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)
                kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
                return recon + kl
            return loss

        self.VAE = Model(self.encoder_input, pipeline)
        self.VAE.compile(optimizer="adadelta", loss=vae_loss(self.sigma, self.mu))

    def create_encoder(self, n_latent, batch_size):

        input_layer = Input(shape=(784,))
        #net = Dense(512, activation="relu")(input_layer)
        mu = Dense(n_latent, activation="linear")(input_layer)
        sigma = Dense(n_latent, activation="linear")(input_layer)

        def sample_z(args):
            mu, log_sigma = args
            eps = K.random_normal(shape=(K.shape(input_layer)[0], n_latent), mean=0., stddev=1.)
            return mu + K.exp(log_sigma / 2) * eps

        sample_z = Lambda(sample_z)([mu, sigma])

        model = Model(inputs=input_layer, outputs=[sample_z, mu, sigma])
        return model, input_layer, mu, sigma

    def create_decoder(self, n_latent, batch_size):

        input_layer = Input(shape=(n_latent,))
        #net = Dense(512, activation="relu")(input_layer)
        reconstruct = Dense(784, activation="linear")(input_layer)

        model = Model(inputs=input_layer, outputs=reconstruct)
        return model, input_layer, reconstruct

1 Solution



I am going to assume the error appears while you are testing/debugging the training phase, during backpropagation (let me know if I am wrong).


If so, the problem is that you are asking Keras to optimize your whole network while using a loss (kl) that covers only the encoder. Without a term like recon that depends on the decoder output, the gradients for the decoder weights are undefined (None), which causes the error during optimization.
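The effect can be reproduced in isolation. The sketch below is not the original model: it assumes TensorFlow 2 with eager execution and uses hypothetical weight matrices (`w_mu`, `w_log_var`, `w_dec`) as stand-ins for the encoder and decoder layers. It only illustrates that a variable the loss does not depend on gets a gradient of `None`.

```python
import tensorflow as tf

# Hypothetical stand-ins for the encoder and decoder weights.
w_mu = tf.Variable(tf.random.normal((4, 2)))
w_log_var = tf.Variable(tf.random.normal((4, 2)))
w_dec = tf.Variable(tf.random.normal((2, 4)))

x = tf.random.normal((3, 4))  # a small batch of fake inputs

with tf.GradientTape() as tape:
    mu = x @ w_mu
    log_var = x @ w_log_var
    z = mu              # sampling noise omitted to keep the sketch short
    y_pred = z @ w_dec  # decoder output is computed but never used by the loss
    kl = 0.5 * tf.reduce_sum(tf.exp(log_var) + tf.square(mu) - 1.0 - log_var, axis=-1)
    loss = tf.reduce_mean(kl)

grads = tape.gradient(loss, [w_mu, w_log_var, w_dec])
print([g is None for g in grads])  # only the decoder weight has no gradient
```

An optimizer that tries to apply such a `None` gradient to the decoder weights is the likely source of the "None values not supported" message.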


For debugging purposes, the error should disappear if you compile and fit only the encoder with this truncated loss (kl), or if you use a dummy (differentiable) loss that also covers the decoder (e.g. K.sum(y_pred - y_pred, axis=-1) + kl).
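The dummy-term idea can be checked with a small standalone sketch. It assumes TensorFlow 2 with eager execution; the weights `w_mu` and `w_dec` are hypothetical stand-ins, and the KL term is simplified by fixing the log-variance to 0. Because `y_pred - y_pred` keeps the decoder in the computation graph, its gradient becomes a tensor of zeros instead of `None`.

```python
import tensorflow as tf

# Hypothetical stand-ins for one encoder weight and one decoder weight.
w_mu = tf.Variable(tf.random.normal((4, 2)))
w_dec = tf.Variable(tf.random.normal((2, 4)))

x = tf.random.normal((3, 4))  # a small batch of fake inputs

with tf.GradientTape() as tape:
    mu = x @ w_mu
    y_pred = mu @ w_dec
    # Simplified KL term: with the log-variance fixed to 0 only the mu part remains.
    kl = 0.5 * tf.reduce_sum(tf.square(mu), axis=-1)
    # Always evaluates to 0, but connects the decoder weights to the loss.
    dummy = tf.reduce_sum(y_pred - y_pred, axis=-1)
    loss = tf.reduce_mean(dummy + kl)

grad_enc, grad_dec = tape.gradient(loss, [w_mu, w_dec])
print(grad_dec is None)                  # the decoder gradient is now defined
print(tf.reduce_max(tf.abs(grad_dec)))   # and it is zero everywhere
```

The decoder weights then simply do not move during this debugging run, which is the intended behavior when only the KL term is under test.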
