+1 vote

I want different learning rates in different layers, just like we can do in Caffe. I want to speed up training for newly added layers without distorting the pre-trained ones. For example, I have a pre-trained model with 6 conv layers and I want to add a new conv layer. The first 6 layers should have a learning rate of 0.00002 and the last one a rate of 0.002. How can I do this?

0 votes

Just use two optimizers, one per variable group:

var_list1 = [variables from first 6 layers]
var_list2 = [the rest of the variables]

train_op1 = tf.train.GradientDescentOptimizer(0.00002).minimize(loss, var_list=var_list1)
train_op2 = tf.train.GradientDescentOptimizer(0.002).minimize(loss, var_list=var_list2)
train_op = tf.group(train_op1, train_op2)
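The placeholder lists above still have to be filled with actual variable objects. One way to do that (an assumption on my part, not part of the original answer) is to partition the result of tf.trainable_variables() by variable-name prefix, assuming your layers live under scopes named conv1 through conv7. The selection logic itself is plain Python, sketched here with stand-in objects so it runs without TensorFlow:

```python
# Hedged sketch: select variables for each optimizer by name prefix.
# In TF 1.x you would start from tf.trainable_variables(); FakeVar is a
# stand-in exposing only the .name attribute, which is all the logic needs.

class FakeVar:
    """Stand-in for a tf.Variable; only .name matters for the split."""
    def __init__(self, name):
        self.name = name

def split_by_prefix(variables, pretrained_prefixes):
    """Partition variables into (pretrained, new) lists by name prefix."""
    pretrained, new = [], []
    for v in variables:
        if any(v.name.startswith(p) for p in pretrained_prefixes):
            pretrained.append(v)
        else:
            new.append(v)
    return pretrained, new

# Assumed scope names conv1/ ... conv7/ (hypothetical; adjust to your model).
all_vars = [FakeVar("conv%d/kernel:0" % i) for i in range(1, 8)]
var_list1, var_list2 = split_by_prefix(
    all_vars, ["conv%d/" % i for i in range(1, 7)])
# var_list1 holds the 6 pre-trained layers' variables, var_list2 the new one's.
```

With real TensorFlow variables the same code works unchanged after replacing all_vars with tf.trainable_variables(), since tf.Variable exposes the same .name attribute.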

0 votes

@Krishna, you can achieve this with two optimizers, but that approach has one major disadvantage: each call to minimize() computes tf.gradients(.) internally, so the gradients are computed twice and execution is slower. You can avoid this by calling tf.gradients(.) once yourself, splitting the resulting list in two, and passing the corresponding slice to each optimizer via apply_gradients().

Use tf.trainable_variables() to get all trainable variables, then select the ones you need for each list:

variable_list1 = [variables from first 6 layers]
variable_list2 = [the rest of the variables]

op1 = tf.train.GradientDescentOptimizer(0.00002)
op2 = tf.train.GradientDescentOptimizer(0.002)

grads = tf.gradients(loss, variable_list1 + variable_list2)
grads_one = grads[:len(variable_list1)]
grads_two = grads[len(variable_list1):]

train_op1 = op1.apply_gradients(zip(grads_one, variable_list1))
train_op2 = op2.apply_gradients(zip(grads_two, variable_list2))
train_op = tf.group(train_op1, train_op2)