in AI and Deep Learning by (50.2k points)

I have a 416x416 input image. How can I create an output of shape 4 x 10, where 4 is the number of columns and 10 is the number of rows?

My label data is a 2D array with 4 columns and 10 rows.

I know about the reshape() method, but it requires that the resulting shape has the same number of elements as the input.

With a 416 x 416 input and max pooling layers, the output I can get is at most 13 x 13.

Is there a way to achieve 4x10 output without loss of data?

My input label data looks like this, for example:

[[  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [116  16 128  51]
 [132  16 149  52]
 [ 68  31  77  88]
 [ 79  34  96  92]
 [126  37 147 112]
 [100  41 126 116]]

This indicates that there are 6 objects in my image that I want to detect. In each row, the first value is xmin, the second ymin, the third xmax, and the fourth ymax.

The output shape of the last layer of my network looks like
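The zero rows in the example appear to be padding that keeps the label array the same size for every image. Below is a minimal sketch of that idea in plain Python. The helper name pad_boxes and the max_boxes parameter are hypothetical, and placing the zero rows first is only an assumption based on the example above.

```python
def pad_boxes(boxes, max_boxes=10):
    """Return a max_boxes x 4 list: zero rows first, then the real boxes.

    pad_boxes and max_boxes are illustrative names, not part of any library.
    """
    if len(boxes) > max_boxes:
        raise ValueError("more boxes than the label array can hold")
    # One all-zero row for every unused slot.
    padding = [[0, 0, 0, 0] for _ in range(max_boxes - len(boxes))]
    return padding + [list(box) for box in boxes]


# The six boxes from the example, each as [xmin, ymin, xmax, ymax].
boxes = [
    [116, 16, 128, 51],
    [132, 16, 149, 52],
    [68, 31, 77, 88],
    [79, 34, 96, 92],
    [126, 37, 147, 112],
    [100, 41, 126, 116],
]

labels = pad_boxes(boxes)
print(len(labels), len(labels[0]))
```

With six boxes and max_boxes=10, the result has four zero rows followed by the six boxes, which gives the 10 rows by 4 columns described in the question.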

(None, 13, 13, 1024)


1 Answer


First, you have to flatten the 3D feature map into a 1D vector. The steps are as follows.

Flatten the (None, 13, 13, 1024) output of your last layer:

model.add(Flatten())

This gives a vector of 13*13*1024 = 173056 values per sample.

Next, add a dense layer on top of this 1-dimensional tensor:

model.add(Dense(4*10))

This layer outputs 40 values per sample.
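The sizes involved in the flatten and dense steps can be checked with plain arithmetic. The short sketch below does that and also counts the weights of the dense layer, assuming a standard fully connected layer with one bias per output unit. The variable names are illustrative only.

```python
# Shape of the last feature map, without the batch dimension.
height, width, channels = 13, 13, 1024

# Flatten keeps every value, so the vector length is the product of the dims.
flat_size = height * width * channels

# The dense layer maps the flat vector to 4 * 10 = 40 outputs.
dense_units = 4 * 10

# One weight per (input, output) pair plus one bias per output unit.
dense_params = flat_size * dense_units + dense_units

print(flat_size)     # 173056
print(dense_units)   # 40
print(dense_params)  # 6922280
```

The dense layer is where the element count changes from 173056 to 40, which is why a plain reshape of the 13 x 13 x 1024 tensor cannot produce a 4 x 10 output on its own.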

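Now you can reshape the 40 values to match your labels. Reshape takes the target shape as a single tuple, and (10, 4) gives 10 rows of 4 columns:

model.add(Reshape((10, 4)))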
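Put together, the three layers can be sketched as a small standalone model. This is only an illustration under stated assumptions: it uses tf.keras, it feeds the (13, 13, 1024) feature map in directly as the input instead of the asker's real backbone, and it passes the target shape to Reshape as the tuple (10, 4), meaning 10 rows of 4 columns as in the label array.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the end of the network: the input here plays the role of the
# (None, 13, 13, 1024) output of the asker's last layer.
model = keras.Sequential([
    keras.Input(shape=(13, 13, 1024)),
    layers.Flatten(),         # -> (None, 173056)
    layers.Dense(10 * 4),     # -> (None, 40)
    layers.Reshape((10, 4)),  # -> (None, 10, 4): 10 rows, 4 columns
])

output_shape = model.output_shape
print(output_shape)  # (None, 10, 4)
```

In a real model these three layers would be appended after the existing convolutional layers. The dense layer learns the mapping to 40 values, so the 10 x 4 output is a prediction and not a rearrangement of the 13 x 13 feature map.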

You can also refer to the following link for more information regarding the reshaping of the layers: https://www.programcreek.com/python/example/89685/keras.layers.Reshape
