I have three networks, call them V, V_target, and Actor, and I'm trying to achieve the following setup:
- V and Actor share certain layers.
- V_target is an exact duplicate of V.
For those familiar with deep RL, I'm using this within an actor-critic algorithm with shared layers between the value and policy networks, plus a target network V_target. I tried the following:
def shared(...):
    # define some variables, e.g.
    W = tf.get_variable('W', ...)

def Actor(...):
    with tf.variable_scope("shared"):
        shared_out = shared(...)
    ... actor-specific layers ...

def V(...):
    with tf.variable_scope("shared", reuse=True):
        shared_out = shared(...)
    ... V-specific layers ...

with tf.variable_scope("Policy"):
    actor_out = Actor(...)
with tf.variable_scope("V_main"):
    V_out = V(...)
with tf.variable_scope("V_target"):
    V_target = V(...)
As expected, this doesn't work, because the outermost variable_scope prevents sharing between Policy and V_main: the variable W is named Policy/shared/W under the first scope but V_main/shared/W under the second.
Why not use tf.name_scope("Policy") and tf.name_scope("V_main")? If I do that, the shared variables can be defined, but then I don't have a good way to retrieve the variables under V_main and V_target. Specifically, because tf.name_scope does not prepend anything to names created by tf.get_variable, I cannot use tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'V_main') and tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'V_target') to get both sets of variables for the so-called "target updates".
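To make the name_scope limitation concrete, here is a minimal sketch (using the TF1-style API via tf.compat.v1; the variable name and shape are illustrative) showing that tf.get_variable ignores tf.name_scope, so filtering the trainable collection by the scope name finds nothing:

```python
import tensorflow.compat.v1 as tf  # TF1-style API, as in the question

with tf.Graph().as_default():
    with tf.name_scope("V_main"):
        w = tf.get_variable("W", shape=[2, 2])

    # name_scope does not prefix names from get_variable:
    print(w.name)  # -> W:0, not V_main/W:0

    # ...so filtering the trainable collection by "V_main" returns nothing.
    v_main_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "V_main")
    print(len(v_main_vars))  # -> 0
```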
Is there any clever way around this?
1 Solution
#1
I suggest you do the trick described in this question: How to create variable outside of current scope in Tensorflow?
You can clear the current variable scope by providing an instance of an existing scope.
So you simply need to define tf.variable_scope("shared") once, remember the reference to that scope instance, and use it inside all the other variable scopes (with reuse=True). The W variable will then be created in the shared scope, no matter what the outer scope is.
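A minimal sketch of this trick (TF1-style API via tf.compat.v1; the layer shapes and the per-network variable name v_w are hypothetical, and reuse=tf.AUTO_REUSE is used so the first call creates shared/W and later calls reuse it):

```python
import tensorflow.compat.v1 as tf  # TF1-style API, as in the question

with tf.Graph().as_default():
    # Capture the VariableScope instance once.
    with tf.variable_scope("shared") as shared_scope:
        pass

    def shared_layers(x):
        # Entering the scope *instance* (not the string) ignores the outer
        # scope, so the variable is always "shared/W" regardless of caller.
        with tf.variable_scope(shared_scope, reuse=tf.AUTO_REUSE):
            W = tf.get_variable("W", shape=[3, 3])
        return tf.matmul(x, W)

    x = tf.placeholder(tf.float32, [None, 3])

    with tf.variable_scope("Policy"):
        actor_out = shared_layers(x)   # creates shared/W
    with tf.variable_scope("V_main"):
        v_out = shared_layers(x)       # reuses shared/W
        v_w = tf.get_variable("v_w", shape=[3, 1])
    with tf.variable_scope("V_target"):
        vt_out = shared_layers(x)      # reuses shared/W
        vt_w = tf.get_variable("v_w", shape=[3, 1])

    # Scope-based collection lookups now work for the target updates:
    main_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "V_main")
    target_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "V_target")
    print([v.name for v in main_vars])    # -> ['V_main/v_w:0']
    print([v.name for v in target_vars])  # -> ['V_target/v_w:0']
```

Note that shared/W lands in neither collection, which is exactly what you want: the target update copies only the V-specific weights, and the shared weights can be fetched separately with scope "shared".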