Cross-Entropy (log loss) is used in classification tasks, mainly when the model outputs probabilities, for instance via a sigmoid or SoftMax layer. It compares the predicted probabilities with the true class labels and corresponds to assuming a binomial (binary) or multinomial (multi-class) distribution over the outputs. It is preferred over Mean Squared Error in classification because minimizing it maximizes the predicted probability of the correct class.
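As a minimal illustrative sketch (the function name and example values below are my own, not from the original), binary cross-entropy can be computed like this:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Clip probabilities to avoid log(0).
    y_pred = np.clip(y_pred, eps, 1 - eps)
    # Average negative log-probability assigned to the correct class.
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.2, 0.7, 0.6])      # e.g. sigmoid outputs
print(binary_cross_entropy(y_true, y_pred))  # ~0.299
```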
Mean Squared Error (MSE) is used when working on regression, where we predict continuous values: it averages the squares of the differences between the actual and predicted values. MSE assumes that errors follow a normal distribution. It is apt for regression but not for classification, because classification outputs categorical data.
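A corresponding sketch for MSE, again with illustrative values of my own choosing:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    # Average of the squared differences between actual and predicted values.
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(mean_squared_error(y_true, y_pred))  # 0.375
```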
Both correspond to Maximum Likelihood Estimation (MLE), but under different distributional assumptions (see the sketch after this list):
Cross-Entropy is based on the binomial (or multinomial) distribution (classification).
MSE is based on the normal distribution (regression).
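The link to MLE can be checked numerically. This is an illustrative sketch under my own assumptions (unit variance for the Gaussian case, made-up sample values): the Bernoulli negative log-likelihood equals binary cross-entropy, and the Gaussian negative log-likelihood differs from MSE only by a constant offset and a factor of 2.

```python
import numpy as np

# Classification: Bernoulli negative log-likelihood == binary cross-entropy.
y_true = np.array([1.0, 0.0, 1.0])
p_pred = np.array([0.8, 0.3, 0.9])
bernoulli_nll = -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))
print(bernoulli_nll)  # same value as the binary cross-entropy loss

# Regression: Gaussian negative log-likelihood (sigma = 1) vs. MSE.
y_reg = np.array([2.0, 0.5, -1.0])
y_hat = np.array([1.8, 0.7, -0.5])
gaussian_nll = np.mean(0.5 * np.log(2 * np.pi) + (y_reg - y_hat) ** 2 / 2)
mse = np.mean((y_reg - y_hat) ** 2)
print(gaussian_nll - 0.5 * np.log(2 * np.pi))  # equals mse / 2
print(mse / 2)
```

So minimizing MSE maximizes a Gaussian likelihood, and minimizing cross-entropy maximizes a binomial/multinomial likelihood.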
In summary, Cross-Entropy is for classification (probabilistic outputs) and MSE is for regression (continuous outputs).