
I have a Keras implementation of a CNN, specifically a UNet, which I train by providing as input a 256x256x3 (RGB) retina image and an accompanying image mask of the same size.


The mask is my ground truth. Each pixel in the mask is one of 10 unique colours (white, black, blue, etc.), each of which maps to the location of one of 10 biological layers in the original retina image.

The UNet output is a 256x256x3 image where each pixel should take the same colour value as the corresponding pixel in the image mask. What I want the output to be, however, is a 256x256x10 array where each position holds the probability (0.0 to 1.0) that each of the 10 colours occupies that pixel.
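For context, here is a minimal sketch of how such a mask can be converted to per-pixel class indices (the PALETTE colours below are placeholders, not my actual values):

import numpy as np

# Hypothetical palette: 10 RGB colours, one per biological layer.
# (Placeholder values; substitute the actual mask colours.)
PALETTE = np.array([
    [255, 255, 255],   # 0: white
    [  0,   0,   0],   # 1: black
    [  0,   0, 255],   # 2: blue
    [255,   0,   0],   # 3: red
    [  0, 255,   0],   # 4: green
    [255, 255,   0],   # 5: yellow
    [  0, 255, 255],   # 6: cyan
    [255,   0, 255],   # 7: magenta
    [128, 128, 128],   # 8: grey
    [255, 128,   0],   # 9: orange
], dtype=np.uint8)

def mask_to_class_indices(mask_rgb):
    """Convert a (256, 256, 3) RGB mask into a (256, 256) array of class indices 0-9."""
    # Compare every pixel against every palette colour, then take the matching index.
    matches = (mask_rgb[:, :, None, :] == PALETTE[None, None, :, :]).all(axis=-1)
    return matches.argmax(axis=-1)   # (256, 256) integer labels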

Here is the code for my UNet:

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Conv2D, Conv2DTranspose, MaxPooling2D,
                                     Dropout, BatchNormalization, Activation,
                                     concatenate)

# --------------------------------------------------------------------------------------
# CONV 2D BLOCK
# --------------------------------------------------------------------------------------

def conv2d_block(input_tensor, n_filters, kernel_size=3, batchnorm=True):
    """Add two convolutional layers with the parameters passed in."""
    # first layer
    x = Conv2D(filters=n_filters, kernel_size=kernel_size, data_format="channels_last",
               kernel_initializer='he_normal', padding='same')(input_tensor)
    if batchnorm:
        x = BatchNormalization()(x)
    x = Activation('relu')(x)
    # second layer (applied to x, not input_tensor, so the first layer is not bypassed)
    x = Conv2D(filters=n_filters, kernel_size=kernel_size, data_format="channels_last",
               kernel_initializer='he_normal', padding='same')(x)
    if batchnorm:
        x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return x

# --------------------------------------------------------------------------------------
# GET THE U-NET ARCHITECTURE
# --------------------------------------------------------------------------------------

def get_unet(input_img, n_filters=16, dropout=0.1, batchnorm=True):
    # Contracting path (256 x 256 x 3)
    c1 = conv2d_block(input_img, n_filters * 1, kernel_size=3, batchnorm=batchnorm)
    p1 = MaxPooling2D((2, 2))(c1)
    p1 = Dropout(dropout)(p1)

    c2 = conv2d_block(p1, n_filters * 2, kernel_size=3, batchnorm=batchnorm)
    p2 = MaxPooling2D((2, 2))(c2)
    p2 = Dropout(dropout)(p2)

    c3 = conv2d_block(p2, n_filters * 4, kernel_size=3, batchnorm=batchnorm)
    p3 = MaxPooling2D((2, 2))(c3)
    p3 = Dropout(dropout)(p3)

    c4 = conv2d_block(p3, n_filters * 8, kernel_size=3, batchnorm=batchnorm)
    p4 = MaxPooling2D((2, 2))(c4)
    p4 = Dropout(dropout)(p4)

    c5 = conv2d_block(p4, n_filters * 16, kernel_size=3, batchnorm=batchnorm)

    # Expansive path
    u6 = Conv2DTranspose(n_filters * 8, 3, strides=(2, 2), padding='same')(c5)
    u6 = concatenate([u6, c4])
    u6 = Dropout(dropout)(u6)
    c6 = conv2d_block(u6, n_filters * 8, kernel_size=3, batchnorm=batchnorm)

    u7 = Conv2DTranspose(n_filters * 4, 3, strides=(2, 2), padding='same')(c6)
    u7 = concatenate([u7, c3])
    u7 = Dropout(dropout)(u7)
    c7 = conv2d_block(u7, n_filters * 4, kernel_size=3, batchnorm=batchnorm)

    u8 = Conv2DTranspose(n_filters * 2, 3, strides=(2, 2), padding='same')(c7)
    u8 = concatenate([u8, c2])
    u8 = Dropout(dropout)(u8)
    c8 = conv2d_block(u8, n_filters * 2, kernel_size=3, batchnorm=batchnorm)

    u9 = Conv2DTranspose(n_filters * 1, 3, strides=(2, 2), padding='same')(c8)
    u9 = concatenate([u9, c1])
    u9 = Dropout(dropout)(u9)
    c9 = conv2d_block(u9, n_filters * 1, kernel_size=3, batchnorm=batchnorm)

    outputs = Conv2D(3, 1, activation='sigmoid')(c9)

    model = Model(inputs=[input_img], outputs=[outputs])
    return model

My question is: how can I alter the design of the network so that it takes the same inputs but produces a 256x256x10 per-pixel prediction for each corresponding input image and mask?

1 Answer


You can use this:

outputs = Conv2D(10, 1, activation='softmax')(c9)

Here, outputs is a tensor with shape [?, 256, 256, 10], and the softmax activation is taken along the last axis (axis=-1), so the 10 channel values at each pixel form a probability distribution over the 10 classes.
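To train against that output, the ground-truth masks must match it: each RGB mask becomes a (256, 256, 10) one-hot array, and the model is compiled with a categorical loss. A minimal sketch, picking up c9 and input_img from the network above (x_train and y_train_class_indices are hypothetical placeholders for your prepared data):

from tensorflow.keras.layers import Conv2D
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical

# Final layer: 10 per-class probabilities per pixel instead of 3 colour channels.
outputs = Conv2D(10, 1, activation='softmax')(c9)
model = Model(inputs=[input_img], outputs=[outputs])

# Targets: (N, 256, 256) integer class maps -> (N, 256, 256, 10) one-hot arrays.
y_train_onehot = to_categorical(y_train_class_indices, num_classes=10)

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train_onehot, batch_size=8, epochs=50, validation_split=0.1)

Alternatively, you can keep the targets as (256, 256) integer class maps and use loss='sparse_categorical_crossentropy', which avoids materialising the one-hot arrays.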
