Author: Botha, G; Zimu, V; Barnard, E
Date: Nov 2006
The authors investigate the performance of text-based language identification systems on the 11 official languages of South Africa, when n-gram statistics are used as features for classification. In particular, the authors compare support ...

Author: Botha, G; Barnard, E
Date: Nov 2005
Many applications of pattern recognition to natural language processing require large text corpora in a specified language. For many of the languages of the world, such corpora are not readily available, but significant quantities of text are ...