I would like to programmatically determine the language that the content of a website is written in.
The only approach that comes to mind is to compare the site's text against a list of common words for each candidate language and pick the language with the highest match percentage.
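A minimal sketch of that idea, to make the question concrete. The word lists are tiny, hand-picked placeholders (an assumption for illustration, not a real resource), the function name `guess_language` is made up, and the input is assumed to be plain text with the HTML already stripped:

```python
import re

# Hypothetical, hand-picked lists of common words per language.
# Real lists would have to be much larger to be useful.
COMMON_WORDS = {
    "english": {"the", "and", "of", "to", "in", "is", "that", "it"},
    "german": {"der", "die", "und", "das", "ist", "nicht", "ein", "zu"},
    "french": {"le", "la", "et", "les", "des", "est", "un", "une"},
}


def guess_language(text):
    """Return (best_language, scores), where each score is the fraction
    of words in `text` that appear in that language's common-word list.
    best_language is None if there are no words or nothing matched."""
    # Runs of Unicode letters only (no digits, underscores, punctuation).
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return None, {}
    scores = {
        lang: sum(word in vocab for word in words) / len(words)
        for lang, vocab in COMMON_WORDS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return None, scores
    return best, scores


language, scores = guess_language("The cat is in the house and it is happy")
print(language, scores)
```

This seems fragile to me: short pages, mixed-language pages, and closely related languages that share many common words would probably all confuse it.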
Are there any better and more robust ways to solve the problem?