I was reading the TensorFlow docs for tf.nn.conv2d here, but I can't understand what it does or what it is trying to achieve. The docs say:
# 1: Flattens the filter to a 2-D matrix with the shape
[filter_height * filter_width * in_channels, output_channels].
Now, what does that do? Is that element-wise multiplication or plain matrix multiplication? I also could not understand the other two points mentioned in the docs, which I have copied below:
# 2: Extracts image patches from the input tensor to form a virtual tensor of shape
[batch, out_height, out_width, filter_height * filter_width * in_channels].
# 3: For each patch, right-multiplies the filter matrix and the image patch vector.
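Taken literally, the three steps quoted above can be sketched in plain NumPy. This is only a rough illustration, not TensorFlow's actual implementation. It assumes stride 1, no padding (the VALID case), an input laid out as [batch, height, width, in_channels] and a filter laid out as [filter_height, filter_width, in_channels, output_channels]. The name conv2d_as_matmul is hypothetical and not part of any library.

```python
import numpy as np

def conv2d_as_matmul(x, w):
    """Hypothetical helper: stride 1, VALID padding, NHWC input, HWIO filter."""
    batch, in_h, in_w, in_c = x.shape
    f_h, f_w, _, out_c = w.shape
    out_h, out_w = in_h - f_h + 1, in_w - f_w + 1

    # Step 1: flatten the filter to [f_h * f_w * in_c, out_c].
    w_mat = w.reshape(f_h * f_w * in_c, out_c)

    # Step 2: extract one flattened patch per output position, giving
    # a tensor of shape [batch, out_h, out_w, f_h * f_w * in_c].
    patches = np.empty((batch, out_h, out_w, f_h * f_w * in_c))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i:i + f_h, j:j + f_w, :]
            patches[:, i, j, :] = patch.reshape(batch, -1)

    # Step 3: matrix-multiply each patch vector by the filter matrix.
    return patches @ w_mat  # shape [batch, out_h, out_w, out_c]

x = np.random.rand(1, 4, 4, 3)   # one 4x4 image with 3 channels
w = np.random.rand(2, 2, 3, 5)   # 2x2 filter, 3 in channels, 5 out channels
print(conv2d_as_matmul(x, w).shape)
```

Under these assumptions, step 3 is an ordinary matrix multiplication: each output value is the dot product of one flattened patch with one column of the flattened filter, which is the same number as multiplying the patch and the filter element-wise and summing.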
It would be really helpful if anyone could give an example, ideally with a piece of code, and explain what is going on there and why the operation works this way.
I've tried coding a small example and printing out the shape of the result, but I still can't understand it.
I tried something like this:
Code:
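import tensorflow as tf  # TensorFlow 1.x API (tf.random_normal, tf.Session)

# Input is [batch, height, width, in_channels]; filter is
# [filter_height, filter_width, in_channels, out_channels].
op = tf.shape(tf.nn.conv2d(tf.random_normal([1, 10, 10, 10]),
                           tf.random_normal([2, 10, 10, 10]),
                           strides=[1, 2, 2, 1], padding='SAME'))
with tf.Session() as sess:
    result = sess.run(op)
    print(result)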
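For reference, the shape printed by the snippet above can be worked out by hand without running TensorFlow. The sketch below assumes the documented rule for 'SAME' padding, where each spatial output size is ceil(input size / stride), and the default layouts for the input and filter. The helper name same_padding_output_shape is hypothetical.

```python
import math

def same_padding_output_shape(input_shape, filter_shape, strides):
    """Hypothetical helper: output shape of a 2-D convolution with 'SAME' padding."""
    batch, in_h, in_w, _ = input_shape   # [batch, height, width, in_channels]
    _, _, _, out_c = filter_shape        # [f_height, f_width, in_channels, out_channels]
    _, s_h, s_w, _ = strides             # [1, stride_h, stride_w, 1]
    # With 'SAME' padding, the spatial size depends only on input size and stride.
    return [batch, math.ceil(in_h / s_h), math.ceil(in_w / s_w), out_c]

print(same_padding_output_shape([1, 10, 10, 10], [2, 10, 10, 10], [1, 2, 2, 1]))
# prints [1, 5, 5, 10]
```

Note that the filter here is read as height 2, width 10, with 10 input and 10 output channels, so the last output dimension comes from the filter's final axis.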
I understand bits and pieces of convolutional neural networks (I studied them here), but TensorFlow's implementation is not what I expected, hence the question.