My question: How can I train a classifier with only positive and neutral data?
I am building a personalized article recommendation system for educational purposes. The data I use is from Instapaper.
I only have positive data:
- Articles that I have "liked", regardless of read/unread status

And neutral data (because I have expressed interest in it, but I may not like it later anyway):
- Articles that are unread
- Articles that I have marked as read but did not "like"

The data I do not have is negative data:
- Articles that I did not send to Instapaper to read later (I was not interested, although I browsed the page/article)
- Articles that I might not even have clicked into, whether or not I archived them
In such a problem, negative data is basically missing. I have thought of the following solutions but have not settled on either yet:
1) Feed a number of negative examples to the classifier
Pros: Immediate negative data to teach the classifier
Cons: As the number of articles I like increases, the effect of the negative data on the classifier diminishes
2) Turn the "neutral" data into negative data
Pros: Now I have all the positive and (new) negative data I need
Cons: Although the neutral data is only of mild interest to me, I would still like to get recommendations for such articles, perhaps as a lower-value class
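
To make the two options concrete, here is a minimal sketch in plain Python. It is only an illustration of my concerns, not a working recommender: the record fields (`liked`, `read`) and the function names are hypothetical and do not come from Instapaper's actual export format, and the counts are made-up numbers.

```python
def negative_fraction(num_negatives, num_positives):
    """Share of the training set made up of negative examples (option 1)."""
    total = num_negatives + num_positives
    return num_negatives / total if total else 0.0


def label_article(article, neutral_as_negative=True):
    """Map a hypothetical article record to a training label (option 2).

    Returns 1 for liked articles. Everything else is neutral: it becomes
    0 when neutral data is treated as negative, or None (unlabeled) otherwise.
    """
    if article.get("liked"):
        return 1
    return 0 if neutral_as_negative else None


# Option 1: a fixed batch of 100 negatives while liked articles keep growing.
for likes in (100, 400, 900):
    print(likes, negative_fraction(100, likes))
# 100 0.5
# 400 0.2
# 900 0.1

# Option 2: read-but-not-liked and unread articles both end up labeled 0.
print(label_article({"liked": True, "read": True}))    # 1
print(label_article({"liked": False, "read": True}))   # 0
print(label_article({"liked": False, "read": False}))  # 0
```

The first loop shows the con of option 1: with a fixed set of negatives, their share of the training data shrinks as I like more articles. The second part shows the con of option 2: unread articles and read-but-not-liked articles collapse into the same label as true negatives would, so the "mild interest" signal is lost.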