I was looking at the TensorFlow docs for **tf.nn.conv2d** here, but I can't understand what it does or what it is trying to achieve. The docs say:

# 1: Flattens the filter to a 2-D matrix with the shape [filter_height * filter_width * in_channels, output_channels].

Now, what does that do? Is that element-wise multiplication or just plain matrix multiplication? I also could not understand the other two points mentioned in the docs, which I have copied below:

# 2: Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].

# 3: For each patch, right-multiplies the filter matrix and the image patch vector.
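
For reference, the three quoted steps can be sketched in plain NumPy. This is only an illustration of the description above, not TensorFlow's actual implementation: the helper name `conv2d_via_patches` is hypothetical, and the sketch assumes VALID padding (no zero padding) and the same stride in both spatial directions. It suggests that step 3 is an ordinary matrix multiplication, which gives the same numbers as multiplying each patch element-wise by the filter and summing.

```python
import numpy as np


def conv2d_via_patches(images, filters, stride=1):
    """Hypothetical helper: VALID-padding 2-D convolution via the three quoted steps.

    images:  [batch, in_height, in_width, in_channels]
    filters: [filter_height, filter_width, in_channels, out_channels]
    """
    batch, in_h, in_w, in_c = images.shape
    f_h, f_w, f_c, out_c = filters.shape
    assert f_c == in_c, "filter in_channels must match the input's channels"

    out_h = (in_h - f_h) // stride + 1
    out_w = (in_w - f_w) // stride + 1

    # Step 1: flatten the filter to [f_h * f_w * in_c, out_c].
    filter_matrix = filters.reshape(f_h * f_w * in_c, out_c)

    # Step 2: gather patches into [batch, out_h, out_w, f_h * f_w * in_c].
    patches = np.empty((batch, out_h, out_w, f_h * f_w * in_c), dtype=images.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            patch = images[:, r:r + f_h, c:c + f_w, :]
            patches[:, i, j, :] = patch.reshape(batch, -1)

    # Step 3: multiply every patch vector by the filter matrix (plain matmul).
    return patches @ filter_matrix


# A 4x4 single-channel image of 0..15 and a 2x2 filter of ones.
images = np.arange(16, dtype=np.float64).reshape(1, 4, 4, 1)
filters = np.ones((2, 2, 1, 1))
result = conv2d_via_patches(images, filters)
print(result.shape)        # → (1, 3, 3, 1)
print(result[0, 0, 0, 0])  # → 10.0  (0 + 1 + 4 + 5)
```

Each output value is the dot product of one flattened patch with one column of the flattened filter, so there is one column per output channel.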

It would be really helpful if anyone could give an example, a piece of code (extremely helpful) maybe and explain what is going on there and why the operation is like this.

I've tried coding a small portion and printing out the shape of the operation. Still, I can't understand.

I tried something like this:

Code:

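    import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.random_normal)

    # Input:  [batch, in_height, in_width, in_channels]
    # Filter: [filter_height, filter_width, in_channels, out_channels]
    op = tf.shape(tf.nn.conv2d(tf.random_normal([1, 10, 10, 10]),
                               tf.random_normal([2, 10, 10, 10]),
                               strides=[1, 2, 2, 1], padding='SAME'))

    with tf.Session() as sess:
        result = sess.run(op)
        print(result)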
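
For what it's worth, the shape printed by this snippet can be predicted without running TensorFlow. The sketch below assumes the rule for SAME padding, where each spatial output size is ceil(input size / stride) and the last dimension is the filter's out_channels. The helper name `same_padding_output_shape` is made up for illustration.

```python
import math


def same_padding_output_shape(input_shape, filter_shape, strides):
    """Hypothetical helper: output shape of a 2-D convolution with SAME padding."""
    batch, in_h, in_w, in_c = input_shape
    f_h, f_w, f_in_c, out_c = filter_shape
    assert f_in_c == in_c, "filter in_channels must match the input's channels"
    _, stride_h, stride_w, _ = strides

    # With SAME padding the filter size does not affect the output size.
    out_h = math.ceil(in_h / stride_h)
    out_w = math.ceil(in_w / stride_w)
    return [batch, out_h, out_w, out_c]


shape = same_padding_output_shape([1, 10, 10, 10], [2, 10, 10, 10], [1, 2, 2, 1])
print(shape)  # → [1, 5, 5, 10]
```

Note that the second tensor is read as a filter of height 2, width 10, with 10 input channels and 10 output channels, not as a batch of 2 filters.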

I understand bits and pieces of convolutional neural networks. I studied them here. But TensorFlow's implementation is not what I expected, which is what prompted this question.