The general formula to calculate the cost function is:

J(theta_0, theta_1) = 1/(2m) * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2

The h_theta(x^i) term denotes what the model outputs for the i-th sample x^i, so h_theta(x^i) - y^i is the model's error on that sample.

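Next, we square each error, [ h_theta(x^i) - y^i ]^2, and sum the squares over all m samples. Dividing the sum by m normalizes it, so the value does not grow with the number of samples. The cost function J also carries an extra factor of 1/2, which is conventionally included to cancel the 2 that appears when the square is differentiated; it does not change which theta_0 and theta_1 minimize the cost. Without that factor, the expression is the mean squared error (MSE):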

1/m * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2
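
To make the two formulas concrete, here is a minimal Python sketch. It assumes a linear hypothesis h_theta(x) = theta_0 + theta_1 * x, which the answer above does not state explicitly, and the function names and sample data are made up for illustration.

```python
def predict(theta0, theta1, x):
    # Assumed hypothesis: h_theta(x) = theta_0 + theta_1 * x
    return theta0 + theta1 * x


def mean_squared_error(theta0, theta1, xs, ys):
    # 1/m * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2
    m = len(xs)
    squared_errors = [(predict(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / m


def cost(theta0, theta1, xs, ys):
    # J(theta_0, theta_1) = 1/(2m) * sum of squared errors, i.e. half the MSE
    return mean_squared_error(theta0, theta1, xs, ys) / 2


# Illustrative data that lies exactly on the line y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(mean_squared_error(0.0, 1.0, xs, ys))  # errors -1, -2, -3 -> (1 + 4 + 9) / 3
print(cost(0.0, 1.0, xs, ys))                # half of the MSE above
print(cost(0.0, 2.0, xs, ys))                # perfect fit, so the cost is 0.0
```

With theta_1 = 1 the squared errors are 1, 4, and 9, so the MSE is 14/3 and J is 7/3. With theta_1 = 2 every prediction matches its target and both values drop to zero.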

Hope this answer helps.