
I don't quite understand the following:

In the FCN for semantic segmentation proposed by Shelhamer et al., the authors use pixel-to-pixel prediction to construct masks/exact locations of objects in an image.

In the slightly modified version of the FCN for biomedical image segmentation, the U-net, the main difference seems to be "a concatenation with the correspondingly cropped feature map from the contracting path."

Now, why does this feature make a difference particularly for biomedical segmentation? The main differences I can point out for biomedical images vs. other data sets are that biomedical images do not have as rich a set of features defining an object as common everyday objects do, and that the size of the data set is limited. But is this extra feature inspired by these two facts or by some other reason?

1 Answer



FCN:

  • It upsamples only once, i.e. it has only one layer in the decoder.

  • The FCN variants (FCN-16s and FCN-8s) add skip connections from lower layers to make the output robust to scale changes.
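To make the FCN-style skip connection concrete, here is a minimal sketch in plain Python, with nested lists standing in for score maps. It assumes nearest-neighbour upsampling as a stand-in for the interpolation step and a 2x factor; the helper names (`upsample_nearest`, `add_maps`) are hypothetical and not taken from any library or from the paper's code. The point it illustrates is that the coarse prediction is upsampled and then summed element-wise with the prediction from a lower layer, so the number of channels does not grow.

```python
def upsample_nearest(score_map, factor):
    """Repeat every value `factor` times along both axes (nearest neighbour)."""
    out = []
    for row in score_map:
        wide = [v for v in row for _ in range(factor)]
        # Emit `factor` independent copies of the widened row.
        out.extend([list(wide) for _ in range(factor)])
    return out


def add_maps(a, b):
    """Element-wise sum of two score maps of identical size."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Coarse 2x2 scores from the deepest layer (illustrative values).
coarse = [[1.0, 2.0],
          [3.0, 4.0]]
# Finer 4x4 scores from a lower layer (illustrative values).
finer = [[0.5] * 4 for _ in range(4)]

# FCN-16s/FCN-8s style fusion: upsample, then add.
fused = add_maps(upsample_nearest(coarse, 2), finer)
print(fused[0])  # [1.5, 1.5, 2.5, 2.5]
```

Because the two maps are added, they must have the same number of channels (here a single score channel), which is one practical difference from concatenation.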


U-Net:

  • It has multiple upsampling layers.

  • It uses skip connections and concatenates the feature maps instead of adding them up.

  • It uses learnable filters (up-convolutions) for upsampling instead of a fixed interpolation technique.
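The "concatenation with the correspondingly cropped feature map from the contracting path" quoted in the question can be sketched as follows. This is a plain-Python illustration under stated assumptions: feature maps are nested lists indexed as [channel][row][column], the sizes and values are made up, and the helpers (`shape`, `center_crop`, `concat_channels`, `make_fmap`) are hypothetical names, not part of any framework. It shows that the encoder map is center-cropped to the decoder's spatial size and then stacked along the channel axis, so the channel counts add up rather than the values.

```python
def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


def center_crop(fmap, target_h, target_w):
    """Crop every channel to target_h x target_w around its center."""
    _, h, w = shape(fmap)
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return [
        [row[left:left + target_w] for row in channel[top:top + target_h]]
        for channel in fmap
    ]


def concat_channels(a, b):
    """Stack two feature maps along the channel axis (spatial sizes must match)."""
    assert shape(a)[1:] == shape(b)[1:], "spatial sizes differ"
    return a + b


def make_fmap(channels, h, w, value):
    """Build a constant feature map for illustration."""
    return [[[value] * w for _ in range(h)] for _ in range(channels)]


encoder = make_fmap(2, 6, 6, 1.0)  # contracting-path features, larger spatially
decoder = make_fmap(3, 4, 4, 2.0)  # upsampled expanding-path features

merged = concat_channels(center_crop(encoder, 4, 4), decoder)
print(shape(merged))  # (5, 4, 4)
```

With concatenation the following convolution sees both sets of features side by side and can learn how to weight them, whereas element-wise addition mixes them before any further layer is applied.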

You can refer to the following link, Understanding Semantic Segmentation with UNET:

