in Machine Learning by (19k points)

I am trying to implement the SVM loss function and its gradient. I found some example projects that implement these two, but I could not figure out how they can use the loss function when computing the gradient.

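Here is the formula of the loss function for the i-th training example, where w_j is the j-th column of W and \Delta is the margin:

$$L_i = \sum_{j \neq y_i} \max\left(0,\; w_j^T x_i - w_{y_i}^T x_i + \Delta\right)$$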

What I cannot understand is how I can use the loss function's result while computing the gradient.

The example project computes the gradient as follows:

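for i in range(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]
    for j in range(num_classes):
        if j == y[i]:
            continue  # the correct class has no margin term of its own
        margin = scores[j] - correct_class_score + 1  # note delta = 1
        if margin > 0:
            loss += margin
            dW[:, j] += X[i]     # gradient w.r.t. the column of class j
            dW[:, y[i]] -= X[i]  # gradient w.r.t. the correct-class column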

dW holds the gradient result, and X is the array of training data. But I don't understand how the derivative of the loss function results in this code.

1 Answer

by (33.1k points)

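The gradient comes from ordinary calculus: differentiate the per-example loss L_i with respect to each column of W. Below, w_j is the j-th column of W, x_i is the i-th training example, y_i is its correct class, and \Delta = 1 is the margin. Differentiating with respect to the correct-class column w_{y_i} gives:

$$\nabla_{w_{y_i}} L_i = -\left(\sum_{j \neq y_i} \mathbb{1}\left(w_j^T x_i - w_{y_i}^T x_i + \Delta > 0\right)\right) x_i$$

and with respect to w_j when j != y_i:

$$\nabla_{w_j} L_i = \mathbb{1}\left(w_j^T x_i - w_{y_i}^T x_i + \Delta > 0\right) x_i$$

Here 1(.) is just an indicator function: it equals 1 when the condition inside holds and 0 otherwise, so a class contributes to the gradient only when its margin is positive. This is how the code uses the loss: it does not differentiate the loss value itself. The test "if margin > 0" plays the role of the indicator, the line dW[:,j] += X[i] is the second formula, and the line dW[:,y[i]] -= X[i], which runs once for every class that violates the margin, accumulates the sum in the first formula.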
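One way to convince yourself that the loop really computes the derivative is to compare it against a finite-difference estimate. The following is only a minimal sketch: the function names svm_loss_naive and numerical_gradient, the delta parameter, and the random test data are assumptions made for illustration and are not part of the example project. Like the loop in the question, it sums over the training examples without averaging or regularization.

```python
import numpy as np

def svm_loss_naive(W, X, y, delta=1.0):
    """Multiclass SVM (hinge) loss and gradient, summed over all examples.

    W: (D, C) weights, X: (N, D) data, y: (N,) integer class labels.
    """
    num_train, num_classes = X.shape[0], W.shape[1]
    loss = 0.0
    dW = np.zeros_like(W)
    for i in range(num_train):
        scores = X[i].dot(W)
        correct_class_score = scores[y[i]]
        for j in range(num_classes):
            if j == y[i]:
                continue
            margin = scores[j] - correct_class_score + delta
            if margin > 0:           # indicator function is 1
                loss += margin
                dW[:, j] += X[i]     # derivative w.r.t. w_j
                dW[:, y[i]] -= X[i]  # derivative w.r.t. w_{y_i}
    return loss, dW

def numerical_gradient(W, X, y, h=1e-5):
    """Centered finite differences, one entry of W at a time."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + h
        loss_plus, _ = svm_loss_naive(W, X, y)
        W[idx] = old - h
        loss_minus, _ = svm_loss_naive(W, X, y)
        W[idx] = old  # restore the original weight
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
    return grad

# Hypothetical toy data: 5 examples, 4 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
y = rng.integers(0, 3, size=5)
W = rng.standard_normal((4, 3)) * 0.01  # small weights keep margins away from 0

loss, dW = svm_loss_naive(W, X, y)
max_diff = np.max(np.abs(dW - numerical_gradient(W, X, y)))
print("largest difference between analytic and numerical gradient:", max_diff)
```

The two gradients should agree closely as long as no margin sits exactly at zero, where the hinge is not differentiable.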

Hope this helps!
