You can use HashingVectorizer, which will work if you feed your data iteratively in batches of, say, 10k or 100k documents that fit in memory.
You then pass each batch of transformed documents to a linear classifier that supports the partial_fit method, e.g. SGDClassifier or PassiveAggressiveClassifier, and simply iterate over new batches.
You can evaluate the model on a held-out validation set at any point, to check the accuracy of the partially trained model without having to wait until it has seen all the samples.
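The loop above can be sketched as follows; the batch contents, label set, and n_features value are illustrative placeholders, not part of your actual pipeline:

```python
# Out-of-core text classification sketch: hash each batch independently
# (HashingVectorizer is stateless, so no fit pass over the corpus is needed)
# and update the classifier incrementally with partial_fit.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**18)
clf = SGDClassifier(random_state=0)

# Toy stand-in for a stream of (texts, labels) batches read from disk.
batches = [
    (["good movie", "great film"], [1, 1]),
    (["bad movie", "terrible film"], [0, 0]),
]
all_classes = [0, 1]  # partial_fit needs the full label set up front

for texts, labels in batches:
    X = vectorizer.transform(texts)  # transform this batch only
    clf.partial_fit(X, labels, classes=all_classes)

# Evaluate on a held-out validation set at any point during training.
X_val = vectorizer.transform(["great movie", "terrible movie"])
val_accuracy = clf.score(X_val, [1, 0])
```

Note that `classes` must be passed on the first `partial_fit` call, since the model cannot know in advance which labels later batches will contain.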
If you train several such models in parallel on different partitions of the data, you can then average the resulting coef_ and intercept_ attributes to get a single final linear model for the whole dataset.
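A minimal sketch of the averaging step, assuming two SGDClassifier models were fitted on separate partitions (the random data here is only a stand-in):

```python
# Average the weights of independently trained linear models into one.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
X1, y1 = rng.randn(50, 4), rng.randint(0, 2, 50)  # partition 1
X2, y2 = rng.randn(50, 4), rng.randint(0, 2, 50)  # partition 2

clf1 = SGDClassifier(random_state=0).fit(X1, y1)
clf2 = SGDClassifier(random_state=0).fit(X2, y2)

# Build the final model by averaging coefficients and intercepts.
final = SGDClassifier()
final.coef_ = (clf1.coef_ + clf2.coef_) / 2
final.intercept_ = (clf1.intercept_ + clf2.intercept_) / 2
final.classes_ = clf1.classes_  # needed so predict() works

predictions = final.predict(X1[:3])
```

This is a simple form of model averaging; it works because the decision function of a linear model is just coef_ @ x + intercept_, so averaging the parameters averages the models.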
Hope this helps!