Author: Butgereit, LL; Botha, RA
Date: Sep 2011

N-grams are used to quantify the similarity between two documents or the similarity between two collections of words. This paper shows how N-grams of length 3 and 4, both coupled with text processing (including stop word removal and stemming ...
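
As a rough illustration of the idea of N-gram similarity described in the abstract, the following Python sketch compares two short texts using N-grams after stop word removal. It is not the authors' implementation: the use of character-level N-grams, the tiny stop word list, the Dice coefficient as the similarity measure, and the function names (`preprocess`, `char_ngrams`, `ngram_similarity`) are all assumptions made for the example, and stemming is left out for brevity.

```python
import re

# Illustrative stop word list (an assumption; a real list would be much longer).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "to", "in"}


def preprocess(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def char_ngrams(text, n):
    """Return the set of character N-grams of the preprocessed text."""
    joined = " ".join(preprocess(text))
    return {joined[i:i + n] for i in range(len(joined) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Dice coefficient over the two N-gram sets (assumed measure), in [0, 1]."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    shared = len(grams_a & grams_b)
    return 2 * shared / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    for n in (3, 4):
        related = ngram_similarity("solve the equation", "solving equations", n)
        unrelated = ngram_similarity("solve the equation", "draw a triangle", n)
        print(n, round(related, 3), round(unrelated, 3))
```

With N-grams of length 3 or 4, texts that share word stems score higher than texts on unrelated topics, which is the property such a comparison relies on.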