# Why is the Cross Entropy method preferred over Mean Squared Error? In what cases does this not hold up?


Although both of the above methods give a better score the closer a prediction is to the target, cross-entropy is still preferred. Is that true in every case, or are there particular scenarios where we prefer cross-entropy over MSE?


Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1. It is preferred for classification, while mean squared error (MSE) is one of the best choices for regression. This follows directly from the statement of the problems themselves: in classification you work with a small, discrete set of possible output values, so MSE is poorly suited.
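As a minimal sketch of this difference (the function names here are just illustrative), compare how the two losses penalize a confidently wrong prediction on a binary label:

```python
import math

def binary_cross_entropy(y, p):
    # Negative log-likelihood of a Bernoulli outcome y in {0, 1}
    # given a predicted probability p in (0, 1)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def mse(y, p):
    # Squared error between the label and the predicted probability
    return (y - p) ** 2

# A confidently wrong prediction: true label is 1, predicted probability is 0.01
print(binary_cross_entropy(1, 0.01))  # about 4.61 -- grows without bound as p -> 0
print(mse(1, 0.01))                   # about 0.98 -- can never exceed 1 here
```

Because cross-entropy diverges as the predicted probability of the true class goes to zero, it pushes the model away from confident mistakes much harder than MSE does, which is one practical reason it is preferred for classification.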

To better understand the phenomenon, it helps to follow the relations between

1. Cross-entropy

2. Logistic regression (binary cross-entropy)

3. Linear regression (MSE)

You will notice that both logistic regression and linear regression can be seen as maximum likelihood estimators (MLEs), simply with different assumptions about the dependent variable.

When you derive the cost function from a probabilistic perspective, you can observe that MSE arises when you assume the errors follow a Normal distribution, and cross-entropy arises when you assume a Bernoulli (binomial) distribution. It means that implicitly, when you use MSE you are doing regression (estimation), and when you use cross-entropy you are doing classification.
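A sketch of that derivation, assuming i.i.d. observations: maximizing the likelihood is the same as minimizing the negative log-likelihood, and the distributional assumption decides which loss drops out.

```latex
% Gaussian errors: y_i = f(x_i) + \epsilon_i, \; \epsilon_i \sim \mathcal{N}(0, \sigma^2)
-\log L = \frac{1}{2\sigma^2} \sum_i \bigl(y_i - f(x_i)\bigr)^2 + \text{const}
\quad \Longrightarrow \quad \text{minimizing MSE}

% Bernoulli labels: y_i \in \{0, 1\}, \; p_i = f(x_i)
-\log L = -\sum_i \Bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \Bigr]
\quad \Longrightarrow \quad \text{minimizing cross-entropy}
```

So the two losses are not arbitrary choices: each is the negative log-likelihood under its respective noise model.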

I hope it helps a little bit.