in Machine Learning by (14.7k points)

I have an assignment that involves adding regularization (to improve generalization) to a network with one hidden ReLU layer, using L2 loss. I wonder how to introduce it properly so that ALL weights are penalized, not only the weights of the output layer.

Code for the network without regularization is at the bottom of the post (the code that actually runs the training is outside the scope of the question).

The obvious way to introduce the L2 penalty is to replace the loss calculation with something like this (with beta = 0.01):

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_layer, labels=tf_train_labels) + 0.01*tf.nn.l2_loss(out_weights))

But in that case, it only takes into account the values of the output layer's weights. I am not sure how to properly penalize the weights that come INTO the hidden ReLU layer. Is that needed at all, or will penalizing the output layer somehow keep the hidden weights in check as well?

1 Answer

by (33.2k points)

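The trainable parameters of this network are its weights and biases: hidden_weights, hidden_biases, out_weights, and out_biases. Penalizing only out_weights does not directly constrain the hidden layer, because the gradient of that penalty term with respect to hidden_weights is zero. To regularize the whole network, add an L2 term to the loss for each of these parameters.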

For example:

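# Mean cross-entropy over the batch, plus an L2 penalty (beta = 0.01) on every parameter
loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=out_layer, labels=tf_train_labels)) +
    0.01*tf.nn.l2_loss(hidden_weights) +
    0.01*tf.nn.l2_loss(hidden_biases) +
    0.01*tf.nn.l2_loss(out_weights) +
    0.01*tf.nn.l2_loss(out_biases))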
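
To see what these penalty terms contribute numerically, here is a small plain-Python sketch (no TensorFlow needed). It assumes the documented behaviour of tf.nn.l2_loss, which is sum(t ** 2) / 2. The helper names l2_loss and regularized_loss, the tiny weight matrices, and the cross-entropy value are all made up for illustration and are not part of the original network.

```python
# Plain-Python sketch of how the regularized loss is assembled.
# All values below are hypothetical and chosen only for illustration.

def l2_loss(matrix):
    """Mirror tf.nn.l2_loss for a nested list: sum(t ** 2) / 2."""
    return sum(value ** 2 for row in matrix for value in row) / 2.0


def regularized_loss(data_loss, params, beta=0.01):
    """Add beta * l2_loss(p) to the data loss for every parameter in params."""
    return data_loss + beta * sum(l2_loss(p) for p in params)


# Hypothetical tiny network: 2 inputs -> 2 hidden units -> 1 output.
hidden_weights = [[1.0, -2.0], [0.5, 0.0]]
out_weights = [[3.0], [-1.0]]
cross_entropy = 0.7  # stand-in for the mean softmax cross-entropy

# Penalizing only the output layer ignores hidden_weights entirely.
only_output = regularized_loss(cross_entropy, [out_weights])

# Penalizing both layers adds a term that grows with the hidden weights too.
all_weights = regularized_loss(cross_entropy, [hidden_weights, out_weights])

print(only_output, all_weights)
```

Note that the output-only variant never reads hidden_weights, so its penalty cannot push those weights toward zero. Only the second variant does that.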


Hope this answer helps.
