The first and the second loss functions calculate the same thing, but in a slightly different way. The third function calculates something completely different. You can see this by executing this code:

import tensorflow as tf
from functools import reduce  # built-in on Python 2; this import is required on Python 3

shape_obj = (100, 6, 12)

Y1 = tf.random_normal(shape=shape_obj)

Y2 = tf.random_normal(shape=shape_obj)

loss1 = tf.reduce_sum(tf.pow(Y1 - Y2, 2)) / (reduce(lambda x, y: x*y, shape_obj))  # sum of squares / number of elements

loss2 = tf.reduce_mean(tf.squared_difference(Y1, Y2))

loss3 = tf.nn.l2_loss(Y1 - Y2)

with tf.Session() as sess:
    print(sess.run([loss1, loss2, loss3]))

# after I run it, I got: [2.0291963, 2.0291963, 7305.1069]

Now you can verify that the 1st and the 2nd calculate the same thing (in theory) by noticing that tf.pow(a - b, 2) is the same as tf.squared_difference(a, b), and that reduce_mean is the same as reduce_sum divided by the number of elements. (The third value is also consistent: 7305.1069 ≈ 2.0291963 × 7200 / 2, where 7200 = 100 × 6 × 12 is the number of elements.) The issue is that computers cannot calculate everything precisely. To see what numerical instabilities can do to your calculations, take a look at this:
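The algebraic equivalence of the first two losses is easy to check without a TF session; here is a quick sketch with NumPy standing in for the TensorFlow ops (the math is identical, and in float64 the two agree up to rounding):

```python
import numpy as np

rng = np.random.default_rng(0)
y1 = rng.standard_normal((100, 6, 12))
y2 = rng.standard_normal((100, 6, 12))

n = y1.size  # 7200, same as reduce(lambda x, y: x * y, shape_obj)

loss1 = np.sum((y1 - y2) ** 2) / n  # reduce_sum(pow(Y1 - Y2, 2)) / N
loss2 = np.mean((y1 - y2) ** 2)     # reduce_mean(squared_difference(Y1, Y2))

print(loss1, loss2)  # the two values match up to rounding
```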

import tensorflow as tf
from functools import reduce  # built-in on Python 2; this import is required on Python 3

shape_obj = (5000, 5000, 10)

Y1 = tf.zeros(shape=shape_obj)

Y2 = tf.ones(shape=shape_obj)

loss1 = tf.reduce_sum(tf.pow(Y1 - Y2, 2)) / (reduce(lambda x, y: x*y, shape_obj))

loss2 = tf.reduce_mean(tf.squared_difference(Y1, Y2))

with tf.Session() as sess:
    print(sess.run([loss1, loss2]))

It is easy to see that the answer should be 1; however, you will get something like this: [1.0, 0.26843545]. The second value is off because the sum here is accumulated in float32, and a float32 running total of 250 million ones loses precision long before it finishes: at large magnitudes, adding 1.0 no longer changes the accumulator.
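You can see this float32 saturation effect in isolation with NumPy (np.cumsum accumulates sequentially, like a naive running sum, unlike np.sum which uses pairwise summation):

```python
import numpy as np

# float32 has a 24-bit significand: above 2**24, consecutive integers are no
# longer representable, so adding 1.0 to a big enough accumulator rounds away.
acc = np.float32(2 ** 24)
print(acc + np.float32(1.0) == acc)  # True: the +1 is lost entirely

# A sequential float32 running sum of ones therefore saturates at 2**24:
ones = np.ones(2 ** 24 + 1000, dtype=np.float32)
running = np.cumsum(ones)  # element-by-element accumulation in float32
print(running[-1])  # 16777216.0, short of the true 16778216
```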

Regarding your last function, the documentation says:

Computes half the L2 norm of a tensor without the sqrt: output = sum(t ** 2) / 2

So if you want it to calculate the same thing (in theory) as the first one, you need to scale it appropriately:

loss3 = tf.nn.l2_loss(Y1 - Y2) * 2 / (reduce(lambda x, y: x*y, shape_obj))
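To check the rescaling, here is a small NumPy sketch (NumPy again standing in for the TensorFlow ops; l2 mirrors the documented sum(t ** 2) / 2 of tf.nn.l2_loss):

```python
import numpy as np

rng = np.random.default_rng(0)
y1 = rng.standard_normal((100, 6, 12))
y2 = rng.standard_normal((100, 6, 12))
n = y1.size

loss1 = np.sum((y1 - y2) ** 2) / n  # the first loss: sum of squares / N
l2 = np.sum((y1 - y2) ** 2) / 2.0   # what tf.nn.l2_loss computes
loss3_scaled = l2 * 2.0 / n         # rescaled: multiply by 2, divide by N

print(abs(loss1 - loss3_scaled))  # ~0.0, the two now agree
```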