
I am running a CNN for a classification problem. It has 3 conv layers and 3 pooling layers. P3 is the output of the last pooling layer, with dimensions [Batch_size, 4, 12, 48], and I want to flatten it into a matrix of size [Batch_size, 2304], where 2304 = 4*12*48. I had been working with "Option A" (see below) for a while, but one day I tried "Option B", which in theory should give the same result. It did not. I had already checked the following thread

https://intellipaat.com/community/733/is-tf-contrib-layers-flatten-x-the-same-as-tf-reshape-x-n-1

but that only added to the confusion, since trying "Option C" (taken from that thread) gave yet another result.


P3 = tf.nn.max_pool(A3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
P3_shape = P3.get_shape().as_list()

P = tf.contrib.layers.flatten(P3)                              # Option A
P = tf.reshape(P3, [-1, P3_shape[1]*P3_shape[2]*P3_shape[3]])  # Option B
P = tf.reshape(P3, [tf.shape(P3)[0], -1])                      # Option C


I am more inclined to go with "Option B", since that is the one I have seen in a video by Dandelion Mane, but I would like to understand why these 3 options are giving different results.
 

1 Answer


Flattening does not rescale anything. It only rearranges the elements of a tensor into a new shape: the batch dimension is kept, and all remaining dimensions are collapsed into one.

All three options do exactly this, as the following test on a small input shows:

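import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.contrib, tf.Session)

# Small stand-in for P3 with an unknown batch size
p3 = tf.placeholder(tf.float32, [None, 1, 2, 4])
p3_shape = p3.get_shape().as_list()

p_a = tf.contrib.layers.flatten(p3)                                  # Option A
p_b = tf.reshape(p3, [-1, p3_shape[1] * p3_shape[2] * p3_shape[3]])  # Option B
p_c = tf.reshape(p3, [tf.shape(p3)[0], -1])                          # Option C

# Static shapes, as known when the graph is built
print(p_a.get_shape())
print(p_b.get_shape())
print(p_c.get_shape())

with tf.Session() as sess:
    # Two examples holding the values 0..15
    i_p3 = np.arange(16, dtype=np.float32).reshape([2, 1, 2, 4])

    # Evaluate each option on the same input and compare the printed values
    print("a", sess.run(p_a, feed_dict={p3: i_p3}))
    print("b", sess.run(p_b, feed_dict={p3: i_p3}))
    print("c", sess.run(p_c, feed_dict={p3: i_p3}))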

As you can see, the code above yields the same result all three times. The different results you are getting must therefore be caused by something else, not by the reshaping.
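If you do not have a TensorFlow 1.x session at hand, the same equivalence can be checked with plain NumPy, whose reshape uses the same row-major element order as tf.reshape. The following is only a sketch: the batch size of 2 and the names flat_a, flat_b and flat_c are made up for illustration, and the per-example ravel is a NumPy stand-in for tf.contrib.layers.flatten, not that function itself.

```python
import numpy as np

# Hypothetical stand-in for P3 from the question: [batch, 4, 12, 48].
# The batch size of 2 is an arbitrary choice for this sketch.
batch = 2
p3 = np.arange(batch * 4 * 12 * 48, dtype=np.float32).reshape(batch, 4, 12, 48)

# Option A analogue: flatten each example on its own, keep the batch axis.
flat_a = np.stack([example.ravel() for example in p3])

# Option B analogue: -1 for the batch, explicit product for the features.
flat_b = p3.reshape(-1, p3.shape[1] * p3.shape[2] * p3.shape[3])

# Option C analogue: explicit batch size, -1 for the features.
flat_c = p3.reshape(p3.shape[0], -1)

print(flat_b.shape)                    # (2, 2304)
print(np.array_equal(flat_a, flat_b))  # True
print(np.array_equal(flat_b, flat_c))  # True
```

Since all three arrays match element for element, a difference between training runs would have to come from somewhere other than the flattening step.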

Hope this answer helps you!
