0 votes
in Machine Learning by (19k points)

I see that in scikit-learn I can build an SVM classifier with a linear kernel in at least 3 different ways:

LinearSVC

SVC with kernel='linear' parameter

Stochastic Gradient Descent with loss='hinge' parameter

Now, I see that the difference between the first two classifiers is that the former is implemented in terms of liblinear and the latter in terms of libsvm.

How do the first two classifiers differ from the third one?

1 Answer

0 votes
by (33.1k points)

The first two classifiers (LinearSVC and SVC with kernel='linear') always use the complete data set and solve a convex optimization problem over those data points.
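As a minimal sketch of the full-data case, the snippet below fits both LinearSVC (liblinear) and SVC with kernel='linear' (libsvm) on the whole training set in a single call. The synthetic data set, the value of C, and the max_iter setting are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

# Synthetic binary classification data, purely for illustration (an assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# liblinear-based solver. loss="hinge" selects the standard hinge loss
# (the default is squared hinge); hinge with an L2 penalty needs the dual form.
linear_svc = LinearSVC(loss="hinge", dual=True, C=1.0, max_iter=10000)
linear_svc.fit(X, y)  # sees the complete data set at once

# libsvm-based solver with a linear kernel.
svc = SVC(kernel="linear", C=1.0)
svc.fit(X, y)  # also sees the complete data set at once

print("LinearSVC training accuracy:", linear_svc.score(X, y))
print("SVC training accuracy:", svc.score(X, y))
```

Neither estimator offers partial_fit, so both need the full training set available at fit time.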

The third one, SGDClassifier with loss='hinge', can instead treat the data in batches. It performs gradient descent aiming to minimize the expected loss with respect to the sample distribution, assuming that the examples are i.i.d. samples of that distribution.

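This is useful when the number of samples is very large or the data keeps arriving: with SGDClassifier you can call the partial_fit function and feed it chunks of data.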
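The chunked training described here could be sketched roughly as follows. The synthetic data set and the chunk size of 100 are assumptions for illustration; in practice the chunks would typically come from disk or a stream.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic data standing in for a data set too large to fit at once (an assumption).
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)

# partial_fit must be told every class label on its first call,
# because a single chunk may not contain all of them.
classes = np.unique(y)

# loss="hinge" gives a linear SVM objective optimized by SGD.
sgd = SGDClassifier(loss="hinge", random_state=0)

chunk_size = 100  # hypothetical chunk size
for start in range(0, X.shape[0], chunk_size):
    X_chunk = X[start:start + chunk_size]
    y_chunk = y[start:start + chunk_size]
    # Each call updates the existing weights using only this chunk.
    sgd.partial_fit(X_chunk, y_chunk, classes=classes)

print("SGDClassifier training accuracy:", sgd.score(X, y))
```

Each partial_fit call makes one pass over the chunk it is given, so several passes over the data require looping over the chunks more than once.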

