I really can't understand the following equation, especially 1/(2m).

What's the purpose of this equation? And where does 1/(2m) come from?

J(theta_0, theta_1) = 1/(2m) * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2

by (33.1k points)

The general formula to calculate the cost function is:

J(theta_0, theta_1) = 1/(2m) * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2

The h_theta(x^i) term denotes what the model outputs for x^i, and y^i is the true value for that sample, so h_theta(x^i) - y^i is the model's error on sample i.

Now, we can square each of these errors using [ h_theta(x^i) - y^i ]^2 and sum the squares over all m samples. So that the total does not grow simply because there are more samples, we normalize it by dividing by m, which gives the mean squared error (MSE):

1/m * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2
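
To make the two formulas concrete, here is a minimal Python sketch. It assumes a straight-line model h_theta(x) = theta_0 + theta_1 * x, which the two parameters in J(theta_0, theta_1) suggest but the formula itself does not state. The function names and the sample data are made up for illustration.

```python
def hypothesis(theta0, theta1, x):
    # Assumed model: a straight line, h_theta(x) = theta_0 + theta_1 * x
    return theta0 + theta1 * x


def mse(theta0, theta1, xs, ys):
    # 1/m * sum over all samples of (h_theta(x^i) - y^i)^2
    m = len(xs)
    squared_errors = [(hypothesis(theta0, theta1, x) - y) ** 2
                      for x, y in zip(xs, ys)]
    return sum(squared_errors) / m


def cost(theta0, theta1, xs, ys):
    # J(theta_0, theta_1) = 1/(2m) * sum(...), i.e. exactly half the MSE
    return mse(theta0, theta1, xs, ys) / 2


# Illustrative data lying on the line y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(mse(0.0, 1.0, xs, ys))   # errors are -1, -2, -3, so MSE = 14/3
print(cost(0.0, 1.0, xs, ys))  # half of that, 7/3
print(cost(0.0, 2.0, xs, ys))  # perfect fit, so the cost is 0.0
```

Comparing the two functions shows that J is the MSE divided by 2. Since both are smallest at the same theta_0 and theta_1, minimizing one also minimizes the other.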