0 votes
in Machine Learning by (19k points)

I'm slightly confused about how to save a trained classifier. Retraining a classifier every time I want to use it is obviously slow and wasteful, so how do I save it and then load it again when I need it? The code is below; thanks in advance for your help. I'm using Python with the NLTK Naive Bayes classifier.

classifier = nltk.NaiveBayesClassifier.train(training_set)

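# Excerpt of the classifier's train method from the NLTK source code
# (abridged; the code that counts labels and feature values is omitted)

def train(labeled_featuresets, estimator=nltk.probability.ELEProbDist):
    # ... label_freqdist is built from labeled_featuresets here (omitted) ...

    # Create the P(label) distribution
    label_probdist = estimator(label_freqdist)

    # Create the P(fval|label, fname) distribution
    feature_probdist = {}
    # ... feature_probdist is filled in here (omitted) ...

    return NaiveBayesClassifier(label_probdist, feature_probdist)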

1 Answer

0 votes
by (33.1k points)

You can use Python's pickle module to save most machine learning models, including an NLTK classifier, and restore them later with the same module.

To save the model:

import pickle

# The with-block closes the file once the classifier has been written
with open('my_classifier.pickle', 'wb') as f:
    pickle.dump(classifier, f)


To load/restore the saved model:

import pickle

# The with-block closes the file once the classifier has been loaded
with open('my_classifier.pickle', 'rb') as f:
    classifier = pickle.load(f)
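
If you save and load models in more than one place, the two steps can be wrapped in small helpers. This is only a minimal sketch: the names save_model and load_model are made up for illustration and are not part of NLTK or pickle, and it assumes the object you pass in can be pickled.

```python
import pickle


def save_model(model, path):
    """Write `model` to `path` using pickle (hypothetical helper)."""
    # Binary mode is required; the with-block closes the file
    # even if pickling raises an exception.
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Read a pickled model back from `path` (hypothetical helper)."""
    # pickle.load can run arbitrary code, so only load files you trust.
    with open(path, "rb") as f:
        return pickle.load(f)
```

With the classifier from the question, you would call save_model(classifier, 'my_classifier.pickle') once after training, and classifier = load_model('my_classifier.pickle') in later sessions. NLTK must still be installed and imported where you load the file, because pickle stores a reference to the classifier's class rather than its code.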


I hope this answer helps.
