0 votes
in Machine Learning by (19k points)

I'm building some predictive models in Python and have been using scikit-learn's SVM implementation. It's been really great: easy to use and relatively fast.

Unfortunately, I'm starting to be constrained by runtime. I run an RBF SVM on a full dataset of about 4,000-5,000 samples with 650 features. Each run takes about a minute. But with 5-fold cross-validation plus grid search (using a coarse-to-fine search), it's becoming infeasible for my task. So, generally, do people have any recommendations for the fastest SVM implementation that can be used from Python? Or any way to speed up my modeling?

I've heard of LIBSVM's GPU implementation, which seems like it could work. I don't know of any other GPU SVM implementations usable from Python, but I would definitely be open to others. Also, does using the GPU significantly reduce runtime?

I've also heard that there are ways of approximating the RBF SVM by using a linear SVM plus a feature map in scikit-learn. I'm not sure what people think of this approach. For anyone who has used it: does it give a significant speedup?

All ideas for increasing the speed of the program are most welcome.

1 Answer

0 votes
by (33.1k points)

The most scalable kernel SVM implementation I know of is LaSVM. It is written in C, so you can wrap it for use from Python, for example with Cython. Alternatively, you can run it from the command line.

You can also use the utilities in sklearn.datasets to convert data from NumPy arrays or CSR sparse matrices into svmlight-format files that LaSVM can use as training and test sets.
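As a rough illustration of the conversion step, here is a minimal sketch using `dump_svmlight_file` and `load_svmlight_file` from sklearn.datasets. The toy data, the file name `train.svmlight`, and the choice of 1-based feature indices are assumptions made for the example; check which index convention your LaSVM build expects before relying on it.

```python
import os
import tempfile

import numpy as np
from scipy.sparse import csr_matrix
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

# Toy data standing in for the real feature matrix (hypothetical values).
rng = np.random.RandomState(0)
X = csr_matrix(rng.rand(6, 4))          # CSR sparse matrix, 6 samples x 4 features
y = np.array([1, -1, 1, -1, 1, -1])     # binary labels

# Write to a temporary directory; the file name is arbitrary.
out_dir = tempfile.mkdtemp()
train_path = os.path.join(out_dir, "train.svmlight")

# zero_based=False writes 1-based feature indices (an assumption about
# what the downstream tool expects; adjust if it wants 0-based indices).
dump_svmlight_file(X, y, train_path, zero_based=False)

# Read the file back to confirm the round trip preserved the data.
X_back, y_back = load_svmlight_file(
    train_path, n_features=X.shape[1], zero_based=False
)
```

The resulting file can then be passed to a command-line trainer, and the same call can be repeated for a held-out test split.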

I hope this answer can clear your doubts.
