Comparing TR-Classifier and KNN by using Reduced Sizes of Vocabularies

Abstract : This study addresses topic identification using two methods: the TR-Classifier, a new method that we have proposed and which is based on computing triggers, and the well-known k Nearest Neighbors (kNN). Performance is acceptable, particularly for the TR-Classifier, even though reduced vocabulary sizes were used. For the TR-Classifier, each topic is represented by a vocabulary built from the corresponding training corpus, whereas the kNN method uses a general vocabulary obtained by concatenating the vocabularies used by the TR-Classifier. For the evaluation, six topics were selected for identification: culture, religion, economy, local news, international news and sports. The experiments were carried out on an Arabic corpus.
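The two vocabulary set-ups described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frequency-based word selection, the removal of duplicate words during concatenation, and the toy English corpora are all assumptions made for illustration, and the trigger computation itself is not shown because the abstract does not specify it.

```python
from collections import Counter


def build_topic_vocabulary(documents, size):
    """Return the `size` most frequent words of one topic's training corpus.

    Assumption: words are selected by raw frequency; the abstract does not
    say how the reduced vocabularies were chosen.
    """
    counts = Counter(word for doc in documents for word in doc.split())
    return [word for word, _ in counts.most_common(size)]


def build_general_vocabulary(topic_vocabularies):
    """Concatenate the per-topic vocabularies into one general vocabulary.

    Assumption: a word shared by several topics is kept only once,
    at its first position.
    """
    general, seen = [], set()
    for vocabulary in topic_vocabularies.values():
        for word in vocabulary:
            if word not in seen:
                seen.add(word)
                general.append(word)
    return general


# Hypothetical toy corpora standing in for the Arabic training data.
training = {
    "sports": ["team wins match", "team loses match final"],
    "economy": ["market team rise", "market team reports"],
}

# One reduced vocabulary per topic (TR-Classifier side).
topic_vocabs = {
    topic: build_topic_vocabulary(docs, size=2)
    for topic, docs in training.items()
}

# A single general vocabulary (kNN side).
general_vocab = build_general_vocabulary(topic_vocabs)
```

With this layout, a topic-specific classifier only ever sees the words of its own topic, while a kNN document vector would be indexed by the single general vocabulary.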
Keywords : TR-Classifier
Document type : Conference papers

https://hal.archives-ouvertes.fr/hal-01586533
Contributor : Kamel Smaïli
Submitted on : Wednesday, September 13, 2017 - 12:50:19 AM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM
Document(s) archived on : Thursday, December 14, 2017 - 12:25:38 PM

File

Citala.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01586533, version 1

Citation

Mourad Abbas, Kamel Smaïli, D Berkani. Comparing TR-Classifier and KNN by using Reduced Sizes of Vocabularies. 3rd International Conference on Arabic Language Processing, May 2009, Rabat, Morocco. ⟨hal-01586533⟩
