Explore Courses Blog Tutorials Interview Questions
+1 vote
in Machine Learning by (4.2k points)

I am coding gradient descent in matlab. For two features, I get for the update step:

temp0 = theta(1,1) - (alpha/m)*sum((X*theta-y).*X(:,1));
temp1 = theta(2,1) - (alpha/m)*sum((X*theta-y).*X(:,2));
theta(1,1) = temp0;
theta(2,1) = temp1;

However, I want to vectorize this code and to be able to apply it to any number of features. For the vectorization part, it was pointed out to me that what I am trying to do is a matrix multiplication

theta = theta - (alpha/m) * (X' * (X*theta-y));

This is well seen, but when I tried, I realized that it doesn't work for gradient descent because the parameters are not updated simultaneously.

Then, how can I vectorize this code and make sure the parameters and updated at the same time?

1 Answer

+2 votes
by (6.8k points)

Your vectorization is correct. I also tried both of your code, and it got me the same theta. Just remember don't use your updated theta in your second implementation.

This also works but less simplified than your 2nd implementation:

Error = X * theta - y;

for i = 1:2

    S(i) = sum(Error.*X(:,i));


theta = theta - alpha * (1/m) * S'

Studying Machine Learning Algorithms would easily solve the problem. Also, studying this will also make Machine Learning Course

Browse Categories