10.5555/1369599.1369724
Article

Text classification: a recent overview

Published: 14 July 2005

ABSTRACT

Text classification is becoming increasingly important with the rapid growth of available online information. This paper describes the text classification process. A single article cannot, of course, be a complete review of the text classification domain. Even so, we hope that the cited references cover the major theoretical issues and guide researchers toward interesting research directions.

References

  1. Bao Y. and Ishii N., "Combining Multiple kNN Classifiers for Text Categorization by Reducts", LNCS 2534, 2002, pp. 340-347.
  2. Bi Y., Bell D., Wang H., Guo G., Greer K., "Combining Multiple Classifiers Using Dempster's Rule of Combination for Text Categorization", MDAI, 2004, pp. 127-138.
  3. Brank J., Grobelnik M., Milic-Frayling N., Mladenic D., "Interaction of Feature Selection Methods and Linear Classification Models", Proc. of the 19th International Conference on Machine Learning, Australia, 2002.
  4. Chawla N. V., Bowyer K. W., Hall L. O., Kegelmeyer W. P., "SMOTE: Synthetic Minority Over-sampling Technique", Journal of AI Research, 16, 2002, pp. 321-357.
  5. Forman G., "An Experimental Study of Feature Selection Metrics for Text Categorization", Journal of Machine Learning Research, 3, 2003, pp. 1289-1305.
  6. Fragoudis D., Meretakis D., Likothanassis S., "Integrating Feature and Instance Selection for Text Classification", SIGKDD '02, July 23-26, 2002, Edmonton, Alberta, Canada.
  7. Guan J., Zhou S., "Pruning Training Corpus to Speedup Text Classification", DEXA 2002, pp. 831-840.
  8. Johnson D. E., Oles F. J., Zhang T., Goetz T., "A decision-tree-based symbolic rule induction system for text categorization", IBM Systems Journal, September 2002.
  9. Han X., Zu G., Ohyama W., Wakabayashi T., Kimura F., "Accuracy Improvement of Automatic Text Classification Based on Feature Transformation and Multiclassifier Combination", LNCS, Volume 3309, Jan 2004, pp. 463-468.
  10. Ke H., Shaoping M., "Text categorization based on Concept indexing and principal component analysis", Proc. TENCON 2002 Conference on Computers, Communications, Control and Power Engineering, 2002, pp. 51-56.
  11. Kehagias A., Petridis V., Kaburlasos V., Fragkou P., "A Comparison of Word- and Sense-Based Text Categorization Using Several Classification Algorithms", JIIS, Volume 21, Issue 3, 2003, pp. 227-247.
  12. Kim S. B., Rim H. C., Yook D. S. and Lim H. S., "Effective Methods for Improving Naive Bayes Text Classifiers", LNAI 2417, 2002, pp. 414-423.
  13. Klopotek M. and Woch M., "Very Large Bayesian Networks in Text Classification", ICCS 2003, LNCS 2657, 2003, pp. 397-406.
  14. Leopold E. and Kindermann J., "Text Categorization with Support Vector Machines. How to Represent Texts in Input Space?", Machine Learning 46, 2002, pp. 423-444.
  15. Lewis D., Yang Y., Rose T., Li F., "RCV1: A New Benchmark Collection for Text Categorization Research", Journal of Machine Learning Research 5, 2004, pp. 361-397.
  16. Heui Lim, "Improving kNN Based Text Classification with Well Estimated Parameters", LNCS, Vol. 3316, Oct 2004, pp. 516-523.
  17. Madsen R. E., Sigurdsson S., Hansen L. K. and Larsen J., "Pruning the Vocabulary for Better Context Recognition", 17th International Conference on Pattern Recognition, 2004.
  18. Montanes E., Quevedo J. R. and Diaz I., "A Wrapper Approach with Support Vector Machines for Text Categorization", LNCS 2686, 2003, pp. 230-237.
  19. Nardiello P., Sebastiani F., Sperduti A., "Discretizing Continuous Attributes in AdaBoost for Text Categorization", LNCS, Volume 2633, Jan 2003, pp. 320-334.
  20. Qiang W., XiaoLong W., Yi G., "A Study of Semi-discrete Matrix Decomposition for LSI in Automated Text Categorization", LNCS, Volume 3248, Jan 2005, pp. 606-615.
  21. Schneider K., "Techniques for Improving the Performance of Naive Bayes for Text Classification", LNCS, Vol. 3406, 2005, pp. 682-693.
  22. Sebastiani F., "Machine Learning in Automated Text Categorization", ACM Computing Surveys, vol. 34 (1), 2002, pp. 1-47.
  23. Shanahan J. and Roma N., "Improving SVM Text Classification Performance through Threshold Adjustment", LNAI 2837, 2003, pp. 361-372.
  24. Soucy P. and Mineau G., "Feature Selection Strategies for Text Categorization", AI 2003, LNAI 2671, 2003, pp. 505-509.
  25. Sousa P., Pimentao J. P., Santos B. R. and Moura-Pires F., "Feature Selection Algorithms to Improve Documents Classification Performance", LNAI 2663, 2003, pp. 288-296.
  26. Torkkola K., "Discriminative Features for Text Document Classification", Proc. International Conference on Pattern Recognition, Canada, 2002.
  27. Vinciarelli A., "Noisy Text Categorization", Proc. 17th International Conference on Pattern Recognition (ICPR'04), 2004, pp. 554-557.
  28. Yang Y., Zhang J. and Kisiel B., "A scalability analysis of classifiers in text categorization", ACM SIGIR'03, 2003, pp. 96-103.
  29. Zu G., Ohyama W., Wakabayashi T., Kimura F., "Accuracy improvement of automatic text classification based on feature transformation", Proc. of the 2003 ACM Symposium on Document Engineering, November 20-22, 2003, pp. 118-120.

Published in

ICCOMP'05: Proceedings of the 9th WSEAS International Conference on Computers
July 2005
768 pages
ISBN: 9608457297
Editor: Nikos E. Mastorakis
Publisher: World Scientific and Engineering Academy and Society (WSEAS), Stevens Point, Wisconsin, United States