I have a 416x416 input image. How can I create a 4 x 10 output, where 4 is the number of columns and 10 is the number of rows?
My label data is a 2D array with 4 columns and 10 rows.
I know about the reshape() method, but it requires that the resulting shape has the same number of elements as the input.
With a 416 x 416 input and max-pooling layers, the largest output I can get is 13 x 13.
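For reference, the 13 x 13 figure can be reproduced with a quick calculation. This is only a sketch: it assumes five 2x2 max-pooling layers with stride 2 and no padding, which is not stated above, and `pooled_size` is a hypothetical helper name.

```python
def pooled_size(size: int, num_pools: int, pool: int = 2, stride: int = 2) -> int:
    """Spatial size after `num_pools` max-pooling layers without padding.

    Assumption: every pooling layer uses the same window and stride.
    """
    for _ in range(num_pools):
        size = (size - pool) // stride + 1
    return size


# Size after 0, 1, ..., 5 pooling layers, starting from a 416-pixel side.
sizes = [pooled_size(416, n) for n in range(6)]
print(sizes)  # [416, 208, 104, 52, 26, 13]
```

Each pooling layer halves the side length, so five of them divide 416 by 32.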
Is there a way to achieve a 4x10 output without loss of data?
My label data looks, for example, like this:
[[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[116 16 128 51]
[132 16 149 52]
[ 68 31 77 88]
[ 79 34 96 92]
[126 37 147 112]
[100 41 126 116]]
This indicates that there are 6 objects in my image that I want to detect. In each row, the first value is xmin, the second ymin, the third xmax, and the fourth ymax.
The output shape of the last layer of my network is
(None, 13, 13, 1024)
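To make the reshape problem concrete, here is a small check of the element counts. It is a sketch that assumes the batch dimension (`None`) is dropped and that the target is the 10-row, 4-column label shape described above. The variable names are only illustrative.

```python
import math

feature_shape = (13, 13, 1024)  # last layer output, batch dimension dropped
label_shape = (10, 4)           # 10 rows of (xmin, ymin, xmax, ymax)

feature_elems = math.prod(feature_shape)
label_elems = math.prod(label_shape)

print(feature_elems)  # 173056
print(label_elems)    # 40

# A reshape only rearranges elements, so both counts would have to match.
can_reshape = feature_elems == label_elems
print(can_reshape)    # False
```

Since 173056 is not 40, a reshape alone cannot produce the label shape. Something would first have to reduce the number of elements to 40.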