Intuition behind U-net vs FCN for semantic segmentation

Question

asked Aug 6, 2019 in AI and Deep Learning by ashely (50.2k points)

I don't quite understand the following:

In the FCN for semantic segmentation proposed by Shelhamer et al., the network makes pixel-to-pixel predictions to construct masks, i.e. the exact locations of objects in an image.

In the slightly modified version of the FCN for biomedical image segmentation, the U-net, the main difference seems to be "a concatenation with the correspondingly cropped feature map from the contracting path."

Now, why does this feature make a difference, particularly for biomedical segmentation? The main differences I can point out between biomedical images and other data sets are that biomedical images do not have as rich a set of features defining an object as common everyday objects do, and that the size of the data set is limited. But is this extra feature inspired by these two facts, or by some other reason?

1 Answer

vinita · Answer 1 · 2019-08-06T12:48:36+0000

FCN

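The basic FCN (FCN-32s) upsamples only once, i.e. its decoder has a single layer that takes the coarse score map straight back to the input resolution.
The FCN-16s and FCN-8s variants add skip connections from lower layers, fused by element-wise addition, to recover finer detail and make the output more robust to scale changes.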
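
The additive fusion described above can be sketched in plain Python. This is only an illustration, not the FCN authors' code: the function names are hypothetical, the score maps are tiny nested lists, and nearest-neighbour upsampling stands in for the upsampling layer a real FCN would use.

```python
def upsample2x(score_map):
    """Nearest-neighbour 2x upsampling of a 2D nested list.

    Assumption: this is a simple stand-in for the upsampling layer
    used in an actual FCN.
    """
    out = []
    for row in score_map:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse_by_addition(coarse, fine):
    """FCN-style skip: upsample the coarse map 2x, then add the finer map."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of coarse map")
    return [[u + f for u, f in zip(u_row, f_row)]
            for u_row, f_row in zip(up, fine)]


# Coarse 2x2 scores from a deep layer, finer 4x4 scores from a lower layer.
coarse = [[1.0, 2.0],
          [3.0, 4.0]]
fine = [[0.5] * 4 for _ in range(4)]

fused = fuse_by_addition(coarse, fine)  # 4x4 map, same number of channels
```

Because the two maps are summed, the fused result has the same shape as the finer map; the lower-layer information adjusts the coarse prediction rather than being kept as separate channels.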

U-Net

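It has multiple upsampling layers, giving a decoder roughly symmetric to the encoder.
It uses skip connections at every resolution level and concatenates the encoder feature maps with the upsampled ones instead of adding them, so the high-resolution features from the contracting path stay available for precise localization.
Its upsampling uses learnable filters (up-convolutions) rather than a fixed interpolation technique.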
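
A minimal sketch of the crop-and-concatenate skip, again in plain Python with hypothetical function names and nested lists in place of tensors. The size trace assumes the configuration of the original U-Net paper (unpadded 3x3 convolutions, 2x2 max pooling, 2x2 up-convolutions, a 572x572 input), which is not stated in this thread; other implementations pad their convolutions and need no cropping.

```python
def center_crop(fmap, target):
    """Centre-crop a [C][H][W] nested list to target x target."""
    h, w = len(fmap[0]), len(fmap[0][0])
    top, left = (h - target) // 2, (w - target) // 2
    return [[row[left:left + target] for row in ch[top:top + target]]
            for ch in fmap]


def concat_skip(encoder_fmap, decoder_fmap):
    """U-Net-style skip: crop the encoder map, then stack along channels."""
    target = len(decoder_fmap[0])  # assumes square feature maps
    return center_crop(encoder_fmap, target) + decoder_fmap


def unet_sizes(input_size=572, depth=4):
    """Trace spatial sizes through a U-Net with unpadded convolutions."""
    size, skips = input_size, []
    for _ in range(depth):
        size -= 4                    # two unpadded 3x3 convs lose 2 px each
        skips.append(size)           # map saved for the skip connection
        if size % 2:
            raise ValueError("size must be even before 2x2 pooling")
        size //= 2                   # 2x2 max pooling
    size -= 4                        # two convs at the bottom of the U
    crops = []
    for skip in reversed(skips):
        size *= 2                    # 2x2 up-convolution
        crops.append((skip, size))   # encoder map is cropped skip -> size
        size -= 4                    # two unpadded 3x3 convs after concat
    return size, crops


# 2-channel 6x6 encoder map, 3-channel 4x4 decoder map.
encoder = [[[c * 100 + r * 6 + col for col in range(6)] for r in range(6)]
           for c in range(2)]
decoder = [[[0] * 4 for _ in range(4)] for _ in range(3)]

merged = concat_skip(encoder, decoder)   # 5 channels, each 4x4
out_size, crops = unet_sizes(572)        # out_size is 388
```

Unlike addition, concatenation grows the channel count (2 + 3 = 5 here), so the convolutions that follow can learn how to weight the encoder features against the upsampled ones.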

For a fuller walkthrough, see Understanding Semantic Segmentation with UNET: https://towardsdatascience.com/understanding-semantic-segmentation-with-unet-6be4f42d4b47

