What are the major differences and benefits of Porter and Lancaster Stemming algorithms?

1 Answer

Anurag · Answer 1 · 2019-07-02T11:57:30+0000

The main difference between the Porter and Lancaster stemming algorithms is that the Lancaster stemmer is significantly more aggressive than the Porter stemmer: it strips more from each word and often produces much shorter stems.

The three major stemming algorithms in use today are:

Porter Stemmer
Snowball Stemmer
Lancaster Stemmer

Porter is the least aggressive of the three. The full description of each algorithm is lengthy and technical, so here is a short summary of each.

Porter: The most commonly used stemmer today. It is widely implemented, including in Java, and it is the oldest of the three, dating from 1980. Of the three, it is usually described as the most computationally intensive.

Snowball: An improvement over Porter by the same author, also known as Porter2. It runs slightly faster than Porter and has a reasonably large community around it.

Lancaster: A very aggressive stemming algorithm. With Porter and Snowball, the stems are usually still recognizable to a reader. That is not the case with Lancaster, where many shorter words are reduced to stems that are hard to interpret. It is the fastest of the three and will shrink your working vocabulary considerably, but if you need finer distinctions between words, it is not the tool you want.
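To see the difference in aggressiveness for yourself, here is a minimal sketch that runs the same words through all three stemmers. It assumes the NLTK library is installed (the original answer does not name a library), and the helper names compare_stems and vocabulary_size are made up for this illustration.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

# One instance of each stemmer discussed above (NLTK implementations).
STEMMERS = {
    "porter": PorterStemmer(),
    "snowball": SnowballStemmer("english"),
    "lancaster": LancasterStemmer(),
}


def compare_stems(words):
    """Return {word: {stemmer_name: stem}} for every input word."""
    return {
        word: {name: stemmer.stem(word) for name, stemmer in STEMMERS.items()}
        for word in words
    }


def vocabulary_size(words):
    """Return {stemmer_name: number of distinct stems} for the input words."""
    return {
        name: len({stemmer.stem(word) for word in words})
        for name, stemmer in STEMMERS.items()
    }


if __name__ == "__main__":
    sample = ["running", "generously", "maximum", "fairly", "organization"]
    for word, stems in compare_stems(sample).items():
        print(f"{word:14}", stems)
    print("distinct stems per stemmer:", vocabulary_size(sample))
```

On a real corpus, a lower count from vocabulary_size means the stemmer merges more words together, which is the trade-off described above: a smaller vocabulary in exchange for less distinction between words.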

Overall, I’d suggest Snowball over both Porter and Lancaster for most uses.

Hope this answer helps.
