Let's begin with a short introduction to variable sharing. It is a mechanism in TensorFlow that allows variables to be shared between different parts of the code without passing references to the variable around. The method tf.get_variable can be used with the name of the variable as an argument to either create a new variable with that name or retrieve the one that was created before. This is different from using the tf.Variable constructor, which creates a new variable every time it is called (and potentially adds a suffix to the variable name if a variable with that name already exists). It is for the purpose of the variable sharing mechanism that a separate type of scope (variable scope) was introduced.
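The difference between "get or create" and "always create" can be sketched with a small pure-Python toy. This is only an illustration of the naming behaviour described above, not TensorFlow's implementation: ToyVariable, get_variable and make_variable are hypothetical stand-ins. Note also that in real TensorFlow, retrieving an existing variable with tf.get_variable requires reuse to be enabled on the enclosing variable scope; the toy leaves that check out.

```python
# Toy model only: mimics the naming behaviour, not TensorFlow's actual code.

_store = {}       # name -> variable, consulted by get_variable
_used_names = {}  # name -> number of times the name was requested


class ToyVariable:
    """Hypothetical stand-in for a variable; it only carries a name."""

    def __init__(self, name):
        self.name = name


def get_variable(name):
    """Return the variable stored under `name`, creating it on first use."""
    if name not in _store:
        _store[name] = ToyVariable(name)
    return _store[name]


def make_variable(name):
    """Always create a new variable; add a suffix if the name is taken."""
    count = _used_names.get(name, 0)
    _used_names[name] = count + 1
    unique_name = name if count == 0 else f"{name}_{count}"
    return ToyVariable(unique_name)


shared_a = get_variable("weights")
shared_b = get_variable("weights")
fresh_a = make_variable("bias")
fresh_b = make_variable("bias")

print(shared_a is shared_b)        # True: the same object is returned
print(fresh_a.name, fresh_b.name)  # bias bias_1
```

The first pair shares one object, which is the behaviour variable sharing relies on; the second pair shows two distinct objects whose names are made unique with a suffix.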
As a result, we end up having two different types of scopes:
- name scope, created using tf.name_scope or tf.op_scope
- variable scope, created using tf.variable_scope or tf.variable_op_scope
Both scopes have the same effect on all operations as well as on variables created using tf.Variable, i.e. the scope is added as a prefix to the operation or variable name.
However, name scope is ignored by tf.get_variable. We can see that in the following example:
    import tensorflow as tf

    with tf.name_scope("my_scope"):
        v1 = tf.get_variable("var1", [1], dtype=tf.float32)
        v2 = tf.Variable(1, name="var2", dtype=tf.float32)
        a = tf.add(v1, v2)

    print(v1.name)  # var1:0
    print(v2.name)  # my_scope/var2:0
    print(a.name)   # my_scope/Add:0
The only way to place a variable accessed using tf.get_variable in a scope is to use a variable scope, as in the following example:
    with tf.variable_scope("my_scope"):
        v1 = tf.get_variable("var1", [1], dtype=tf.float32)
        v2 = tf.Variable(1, name="var2", dtype=tf.float32)
        a = tf.add(v1, v2)

    print(v1.name)  # my_scope/var1:0
    print(v2.name)  # my_scope/var2:0
    print(a.name)   # my_scope/Add:0
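One way to picture the two examples is as two separate prefix stacks: one consulted when naming operations (and variables built directly with the constructor), the other consulted by tf.get_variable. The sketch below is a pure-Python toy under that assumption, not TensorFlow's code; name_scope, variable_scope, op_name and get_variable_name here are hypothetical helpers that only compute names.

```python
from contextlib import contextmanager

# Toy model only: two independent prefix stacks.
_name_stack = []  # prefixes for operations and directly constructed variables
_var_stack = []   # prefixes for variables obtained through get_variable


@contextmanager
def name_scope(name):
    """Push a prefix that affects operation names only."""
    _name_stack.append(name)
    try:
        yield
    finally:
        _name_stack.pop()


@contextmanager
def variable_scope(name):
    """Push a prefix on both stacks, so get_variable names are affected too."""
    _var_stack.append(name)
    _name_stack.append(name)
    try:
        yield
    finally:
        _name_stack.pop()
        _var_stack.pop()


def op_name(name):
    """Name an operation (or a directly constructed variable) would receive."""
    return "/".join(_name_stack + [name])


def get_variable_name(name):
    """Name a variable obtained through get_variable would receive."""
    return "/".join(_var_stack + [name])


with name_scope("my_scope"):
    print(get_variable_name("var1"))  # var1
    print(op_name("Add"))             # my_scope/Add

with variable_scope("my_scope"):
    print(get_variable_name("var1"))  # my_scope/var1
    print(op_name("Add"))             # my_scope/Add
```

The toy omits the ":0" output suffix and the uniquifying of repeated scope names, since neither matters for the prefix rule being illustrated.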
Finally, let's look at the difference between the different methods for creating scopes. We can group them in two categories:
- tf.name_scope(name) (for name scope) and tf.variable_scope(name_or_scope, ...) (for variable scope) create a scope with the name specified as argument
- tf.op_scope(values, name, default_name=None) (for name scope) and tf.variable_op_scope(values, name_or_scope, default_name=None, ...) (for variable scope) create a scope, just like the functions above, but besides the scope name, they accept an argument default_name which is used instead of name when name is None. Moreover, they accept a list of tensors (values) in order to check that all the tensors are from the same, default graph. This is useful when creating new operations; for example, see the implementation of tf.histogram_summary.
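The two extra behaviours of the second category (falling back to default_name, and checking that the input tensors share a graph) can be sketched as a small pure-Python toy. Everything here is a hypothetical stand-in: ToyGraph, ToyTensor and resolve_scope_name are not TensorFlow APIs. The toy only checks that all values come from one graph, and raising an error when both names are None is an assumption of this sketch.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph object."""


class ToyTensor:
    """Hypothetical stand-in for a tensor that remembers its graph."""

    def __init__(self, graph):
        self.graph = graph


def resolve_scope_name(values, name, default_name=None):
    """Pick the scope name after checking that all values share one graph."""
    graphs = {id(value.graph) for value in values}
    if len(graphs) > 1:
        raise ValueError("values come from different graphs")
    if name is None and default_name is None:
        # Assumption of this toy: at least one of the two names is required.
        raise ValueError("either name or default_name must be given")
    return default_name if name is None else name


graph = ToyGraph()
inputs = [ToyTensor(graph), ToyTensor(graph)]

fallback = resolve_scope_name(inputs, None, default_name="HistogramSummary")
explicit = resolve_scope_name(inputs, "my_summary", default_name="HistogramSummary")
print(fallback)  # HistogramSummary
print(explicit)  # my_summary
```

A function that defines a new operation can use this pattern to give the operation a sensible default scope name while still letting the caller override it.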
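In short: variable sharing is a way to access variables across different regions of code without passing references, and variable scope exists to support it. Inside their blocks, both name_scope and variable_scope add the scope name as a prefix to operations and to explicitly created variables (such as a and v2 in the examples above); in this respect they behave almost identically. They differ in how they affect tf.get_variable() within the scope: variable_scope adds the prefix to the variable name, while name_scope does not.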