I have an input image 416x416. How can I create an output of 4 x 10, where 4 is the number of columns and 10 the number of rows?

My label data is a 2D array with 4 columns and 10 rows.

I know about the reshape() method, but it requires the resulting shape to have the same number of elements as the input.

With a 416 x 416 input and max-pooling layers, the best I can get is a 13 x 13 output.

Is there a way to achieve 4x10 output without loss of data?

For example, my input label data looks like this:

[[  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [  0   0   0   0]
 [116  16 128  51]
 [132  16 149  52]
 [ 68  31  77  88]
 [ 79  34  96  92]
 [126  37 147 112]
 [100  41 126 116]]

This indicates that there are 6 objects in my image that I want to detect. In each row, the first value is xmin, the second ymin, the third xmax, and the fourth ymax.

The output shape of the last layer of my network is:

(None, 13, 13, 1024)


1 Answer

First, you have to flatten the 3D feature map into a 1D tensor. The steps are as follows:

Flatten the output of your last layer, (None, 13, 13, 1024), with:

model.add(Flatten())

This gives you 13*13*1024 = 173056 values per sample.

Next, add a dense layer on top of this 1-dimensional tensor:

model.add(Dense(4*10))

Its output has 40 values per sample.
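The sizes involved can be checked with plain arithmetic. The sketch below is only an illustration; the variable names are made up for this example, and the parameter count assumes a standard dense layer with one weight per input-output pair plus one bias per output unit.

```python
from math import prod

# Shape of the last feature map, without the batch dimension.
feature_map_shape = (13, 13, 1024)

# Flatten() collapses all non-batch dimensions into one.
flattened_size = prod(feature_map_shape)

# Dense(4 * 10) maps the flattened vector to 40 values.
dense_units = 4 * 10

# Assumed standard dense layer: weights (inputs x units) plus one bias per unit.
dense_params = flattened_size * dense_units + dense_units

print(flattened_size)  # 173056
print(dense_params)    # 6922280
```

So this single dense layer carries roughly 6.9 million trainable parameters, which is worth keeping in mind when choosing this approach.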

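Now you can reshape it to the shape you need. Reshape takes the target shape as a single tuple, and 10 rows by 4 columns corresponds to (10, 4):

model.add(Reshape((10, 4)))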
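Put together, the three steps might look like the following minimal sketch. It assumes TensorFlow's bundled Keras is installed and uses a bare Input of shape (13, 13, 1024) as a stand-in for the real convolutional backbone, which is not shown in the question. Note that Keras's Reshape expects the target shape as one tuple, and that (10, 4) means 10 rows of 4 values (xmin, ymin, xmax, ymax).

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the real network: only the head is modelled here.
# The Input shape mimics the (None, 13, 13, 1024) output of the last layer.
model = keras.Sequential([
    keras.Input(shape=(13, 13, 1024)),
    layers.Flatten(),         # (None, 173056)
    layers.Dense(4 * 10),     # (None, 40)
    layers.Reshape((10, 4)),  # (None, 10, 4): 10 rows, 4 columns
])

# Dummy batch of two feature maps, just to confirm the shapes line up.
dummy = np.zeros((2, 13, 13, 1024), dtype="float32")
predictions = model(dummy)

print(model.output_shape)  # (None, 10, 4)
print(tuple(predictions.shape))
```

With this head, the model output has the same layout as a label array of 10 rows and 4 columns, so the two can be compared directly by a loss function.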

You can also refer to the following link for more information regarding the reshaping of the layers: https://www.programcreek.com/python/example/89685/keras.layers.Reshape
