Using Y_True as input to an intermediate layer

Date: 2021-02-17 01:41:07

I am trying to implement a structure that looks like the following block diagram. I could implement it from scratch, but when I try to do it in Keras I run into some difficulties. Any help would be appreciated. Specifically, I have two questions about implementing it in Keras.

1) How can I have my actual output as a separate input layer, as shown in the block diagram? As each input is fed into the network, I want the corresponding gold-standard output to appear in the Y_true section of the diagram.
2) If I want to back-propagate the cost from the cost section, is it possible to go backward only along the vertical path and not along the path that has the copy of the third layer? (See the sketch just below.)
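
For reference, a minimal sketch of both points (layer sizes, names and the cost formula below are placeholders, since the block diagram itself is not reproduced here): y_true enters the graph as an ordinary second Input, and the branch that carries the copy of the third layer is wrapped in K.stop_gradient so the cost only back-propagates along the vertical path.

from keras.models import Model
from keras.layers import Dense, Input, Lambda
from keras import backend as K

# (1) y_true is declared as a second Input and fed alongside x.
x_in = Input(shape=(100,), name='x')          # placeholder sizes
y_true_in = Input(shape=(1,), name='y_true')

layer1 = Dense(1024)(x_in)
layer2 = Dense(128)(layer1)
layer3 = Dense(1)(layer2)

# (2) The copied branch is wrapped in K.stop_gradient, so any cost built on
# top of it sends no gradients back along this path.
layer3_copy = Lambda(lambda t: K.stop_gradient(t))(layer3)

# Placeholder cost node; the real combination comes from the block diagram.
cost = Lambda(lambda ts: K.square(ts[0] + ts[1] - ts[2]))(
    [layer3, layer3_copy, y_true_in])

model = Model(inputs=[x_in, y_true_in], outputs=cost)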

2 solutions

#1



Please try this. The main idea is to create a model with two outputs, one for y_pred and one for the loss. When compiling that model, use a list of loss functions; we only care about the second loss.


from keras.models import Model
from keras.layers import Dense, Input
from keras.layers.merge import _Merge
from keras import backend as K
import numpy as np

class CustomMerge(_Merge):
    def _merge_function(self, inputs):
        output = inputs[0]
        for i in range(1, len(inputs)):
            output += inputs[i]
        return output

class CustomLoss(_Merge):

    def _merge_function(self, inputs):
        output = inputs[0]
        for i in range(1, len(inputs)):
            output -= inputs[i]
        return output


input = Input(name= 'input', shape=[100])
y_true = Input(name = 'y_true', shape=[1])
layer1 = Dense(1024)(input)
layer2 = Dense(128)(layer1)
layer3 = Dense(1)(layer2)

y_pred = CustomMerge()([layer3, y_true]) # do whatever you want to calculate y_pred
loss = CustomLoss()([layer3, y_pred]) # do whatever you want to calculate loss

model = Model(inputs=[input, y_true], outputs = [y_pred, loss])
losses = [
            lambda y_true, y_pred: K.zeros([1]),  # don't care about this loss
            lambda y_true, y_pred: K.mean(K.square(y_pred), axis=-1),  # the loss we actually care about; it uses only y_pred, regardless of y_true
        ]
model.compile(loss=losses, optimizer='adam')
model.summary()

batch_size = 32

X, Y = get_batch(batch_size)  # get_batch stands in for your own data-loading function
L = np.zeros(batch_size)

model.train_on_batch([X, Y], [Y, L])
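
One practical note on top of this answer: because y_true is wired in as a model input, something still has to be supplied in its place at prediction time, and only the first output (y_pred) is of interest then. A hypothetical call:

# y_true is a model input, so a dummy array (e.g. zeros) must be fed at inference;
# we then keep only the first output (y_pred).
y_dummy = np.zeros((batch_size, 1))
y_pred_out, _ = model.predict([X, y_dummy])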

#2



I experimented with a custom loss function; it's possible, but it's a little more complicated than usual (and I have no idea if training will succeed...):


import keras.backend as K

def customLoss(yTrue,yPred): 

    #starting with tensors shaped like (batch,5,3)

    #let's find the predicted class to compare - this example works with categorical classification (only one true class per element in a sequence)   
    trueMax = K.argmax(yTrue,axis=-1)
    predMax = K.argmax(yPred,axis=-1)
                #at this point, shapes become (batch,5)

    #let's find the different results (cast the boolean comparison to float so it can be summed):
    neq = K.cast(K.not_equal(trueMax, predMax), K.floatx())

    #now we sum the different results. The ones with sum=0 are true
    neqsum = K.sum(neq, axis=-1)
                #shape now is only (batch)

    #to avoid false values being greater than 1, we do another comparison (cast to float again):
    trueFalse = K.cast(K.equal(neqsum, 0), K.floatx())

    #we adjust from values between 0 and 1 to values between -1 and 1:
    adj = (2*trueFalse) - 1

    #now it's time to create Loss1 and Loss2 (which I don't know)   
    #they are different from regular losses, because you must keep the batch size so you can multiply the result with "adj":

    l1 = K.sum(K.sum(K.square(yTrue - yPred), axis=-1), axis=-1)  # placeholder: substitute your Loss1, kept per sample
    l2 = K.sum(K.sum(K.abs(yTrue - yPred), axis=-1), axis=-1)     # placeholder: substitute your Loss2, kept per sample
              #these two must also be shaped like (batch)

    #then apply your formula:
    res = ((1-adj)*l1) + ((adj-1)*l2)
               #this step could perhaps be replaced by the K.switch function    
               #it would be probably much more efficient, but I'd have to learn how to use it first   

    #and finally, sum over the batch dimension, or use a mean value or anything similar
    return K.sum(res) #or K.mean(res)
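
As a side note on the K.switch idea mentioned in the comments above, a hypothetical version of the res computation could look like this, assuming l1, l2 and neqsum are the per-sample tensors built in customLoss (all shaped (batch,)): for fully correct sequences the original formula gives 0, otherwise it gives 2*(l1 - l2).

# Hypothetical K.switch version of the res computation above.
fully_correct = K.equal(neqsum, 0)            # boolean, shape (batch,)
res = K.switch(fully_correct,
               K.zeros_like(l1),              # (1-adj)*l1 + (adj-1)*l2 == 0 when the whole sequence is correct
               2.0 * (l1 - l2))               # == 2*(l1 - l2) when any position is wrong
loss_value = K.sum(res)                       # or K.mean(res), as above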

A test (shapes are a little different, but keep the same number of dimensions):


def tprint(t):
    # .eval() assumes a backend where tensors can be evaluated directly
    # (Theano, or TF 1.x graph mode with a default session)
    print(K.shape(t).eval())
    print(t.eval())
    print("\n")

x = np.array([[[.2,.7,.1],[.6,.3,.1],[.3,.3,.4],[.6,.3,.1],[.3,.6,.1]],[[.5,.2,.3],[.3,.6,.1],[.2,.7,.1],[.7,.15,.15],[.5,.2,.3]]])
y = np.array([[[0.,1.,0.],[1.,0.,0.],[0.,0.,1.],[1.,0.,0.],[0.,1.,0.]],[[0.,1.,0.],[0.,0.,1.],[0.,1.,0.],[1.,0.,0.],[1.,0.,0.]]])


x = K.variable(x)
y = K.variable(y)

xM = K.argmax(x,axis=-1)
yM = K.argmax(y,axis=-1)

neq = K.cast(K.not_equal(xM,yM), K.floatx())

neqsum = K.sum(neq,axis=-1,keepdims=False)
trueFalse = K.cast(K.equal(neqsum,0), K.floatx())
adj = (2*trueFalse) - 1

l1 = 3 * K.sum(K.sum(y,axis=-1),axis=-1)
l2 = 7 * K.sum(K.sum(y,axis=-1),axis=-1)

res = ((1-adj)*l1) +((adj-1)*l2)
sumres = K.sum(res) #or K.mean, or something similar
tprint(xM)
tprint(yM)
tprint(neq)
tprint(neqsum)
tprint(trueFalse)
tprint(adj)
tprint(l1)
tprint(l2)
tprint(res)
