I'm trying to implement "Stochastic gradient descent" in MATLAB. I followed the algorithm exactly but I'm getting a VERY VERY large w (coefficients) for the prediction/fitting function. Do I have a mistake in the algorithm?

    x = 0:0.1:2*pi;                     % X-axis
    n = size(x,2);
    r = -0.2+(0.4).*rand(n,1);          % generating random noise to be added to the sin(x) function

    for i=1:n
        t(i)=sin(x(i))+r(i);            % adding the noise
        y(i)=sin(x(i));                 % the function without noise
    end

    f = randi(n,20,1);                  % generating random indexes in 1..n
    h = x(f);                           % choosing random x points
    k = t(f);                           % choosing random y points
    m=size(h,2);                        % length of the h vector

    scatter(h,k,'Red');                 % drawing the training points (with noise)
    hold on;
    plot(x,sin(x));                     % plotting the sin function

    w = [0.3 1 0.5];                    % starting point of w
    a=0.05;                             % learning rate "alpha"

    % ---------------- ALGORITHM ---------------------%

    for i=1:20
        v = [1 h(i) h(i).^2];           % X vector
        e = ((w*v') - k(i)).*v;         % (prediction - observation) times the X vector
        w = w - a*e;                    % updating w
    end

    hold on;
    l = 0:1:6;
    g = w(1)+w(2)*l+w(3)*(l.^2);
    plot(l,g,'Yellow');                 % drawing the prediction function

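Typically, very large coefficient values are a sign of overfitting. If the learning rate is too large, though, SGD is likely to diverge, and that can also blow up w. In the code above the feature h(i).^2 reaches about 39 while a = 0.05, so a single update can overshoot badly. The learning rate should also decrease toward zero as the iterations proceed.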
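To make the point about the learning rate concrete, here is a minimal sketch of the same quadratic fit with a decaying step size. It is only one possible setup, not the only correct one: the input rescaling (xs), the initial rate a0, the decay schedule and the number of passes (epochs) are all assumptions chosen for illustration and do not come from the question.

```matlab
% Sketch: SGD for a quadratic fit to noisy sin(x) with a decaying learning rate.
% xs, a0, epochs and the decay schedule are illustrative assumptions.
x = 0:0.1:2*pi;
n = numel(x);
t = sin(x) + (-0.2 + 0.4*rand(1, n));   % noisy targets, same noise range as the question

xs = x / (2*pi);                        % assumption: rescale inputs to [0, 1] so features stay small
w  = [0.3 1 0.5];                       % same starting point as the question
a0 = 0.5;                               % assumed initial learning rate
epochs = 200;                           % assumed number of passes over the data

step = 0;
for ep = 1:epochs
    for i = randperm(n)                 % visit the points in random order
        step = step + 1;
        a = a0 / (1 + 0.001*step);      % assumed schedule: rate decays toward zero
        v = [1 xs(i) xs(i)^2];          % feature vector
        w = w - a * ((w*v') - t(i)) * v; % SGD update on one sample
    end
end

g = w(1) + w(2)*xs + w(3)*xs.^2;        % fitted values on the rescaled inputs
```

With the inputs rescaled, the squared norm of v is at most 3, so a0 = 0.5 keeps each single-sample update from overshooting. In the original code the squared norm of v can be around 1600 near x = 2*pi, which is why a = 0.05 makes w grow. Note that w here refers to the rescaled inputs xs, so it is not directly comparable to coefficients for x.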

