Arabic Text Categorization using k-nearest neighbour, Decision Trees (C4.5) and Rocchio Classifier: A Comparative Study

Authors

  • Adel Hamdan Mohammad, Computer Science Department, The World Islamic Sciences and Education University, Amman, Jordan
  • Omar Al-Momani, Network Department, The World Islamic Sciences and Education University, Amman, Jordan
  • Tariq Alwada’n, Computer Science Department, The World Islamic Sciences and Education University, Amman, Jordan

DOI:

https://doi.org/10.14741/

Keywords:

Text Categorization, k-nearest neighbour, Decision trees, C4.5, Rocchio classifier

Abstract

Text classification is an important research area in information retrieval. Many studies address text classification for the English language, but comparatively few use Arabic data sets. This research applies three well-known classification algorithms, k-nearest neighbour (k-NN), C4.5 and Rocchio, to an in-house collected Arabic data set consisting of 1400 documents belonging to 8 categories. The results show that the Rocchio classifier and k-NN achieve better precision and recall than C4.5. The study compares the three algorithms using a fixed number of documents per category in both the training and testing phases.
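To make the comparison concrete, the following is a minimal sketch of two of the three classifiers named above, together with per-category precision and recall. It is not the authors' implementation. The Rocchio variant shown is the centroid-only form (no negative-example term), k-NN uses cosine similarity with majority voting, and C4.5 is omitted because a decision-tree learner does not fit in a short example. All function names, the whitespace tokenizer, and the toy corpus of placeholder tokens are assumptions made for illustration; the paper's data set and preprocessing are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Term-frequency bag of words, scaled to unit length (whitespace tokens)."""
    counts = Counter(text.split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def train_rocchio(docs, labels):
    """Centroid-only Rocchio: one mean vector per category."""
    sums = defaultdict(Counter)
    sizes = Counter(labels)
    for doc, label in zip(docs, labels):
        for term, weight in vectorize(doc).items():
            sums[label][term] += weight
    return {
        label: {t: w / sizes[label] for t, w in vec.items()}
        for label, vec in sums.items()
    }


def predict_rocchio(centroids, doc):
    """Assign the category whose centroid is most similar to the document."""
    v = vectorize(doc)
    return max(centroids, key=lambda label: cosine(v, centroids[label]))


def predict_knn(docs, labels, doc, k=3):
    """Majority vote among the k most similar training documents."""
    v = vectorize(doc)
    ranked = sorted(
        zip(docs, labels),
        key=lambda pair: cosine(v, vectorize(pair[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall(true_labels, predicted, category):
    """Precision and recall for a single category."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == category and p == category for t, p in pairs)
    fp = sum(t != category and p == category for t, p in pairs)
    fn = sum(t == category and p != category for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy corpus (hypothetical placeholder tokens, two categories).
train_docs = ["goal match team", "team player goal",
              "bank market price", "market bank loan"]
train_labels = ["sport", "sport", "economy", "economy"]
test_docs = ["match player team", "loan price bank"]
test_labels = ["sport", "economy"]

centroids = train_rocchio(train_docs, train_labels)
rocchio_pred = [predict_rocchio(centroids, d) for d in test_docs]
knn_pred = [predict_knn(train_docs, train_labels, d, k=3) for d in test_docs]

for name, pred in (("Rocchio", rocchio_pred), ("k-NN", knn_pred)):
    for category in sorted(set(train_labels)):
        print(name, category, precision_recall(test_labels, pred, category))
```

Keeping the same number of training documents in every category, as the study does, means no centroid is averaged over far more documents than another and no category dominates the k-NN vote simply by being larger.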

Published

2016-04-30

Section

Articles

How to Cite

Arabic Text Categorization using k-nearest neighbour, Decision Trees (C4.5) and Rocchio Classifier: A Comparative Study. (2016). International Journal of Current Engineering and Technology, 6(2), 477-482. https://doi.org/10.14741/