Article

Text classification: a recent overview

Authors:
M. Ikonomakis, Department of Mathematics, University of Patras, Greece
S. Kotsiantis, Department of Mathematics, University of Patras, Greece
V. Tampakas, Technological Educational Institute of Patras, Greece
ICCOMP'05: Proceedings of the 9th WSEAS International Conference on Computers, July 2005, Article No. 125, Pages 1–6

Published: 14 July 2005

ABSTRACT

Text classification is becoming increasingly important with the rapid growth of available online information. This paper describes the text classification process. A single article cannot, of course, be a complete review of the text classification domain. Even so, we hope that the cited references cover the major theoretical issues and guide researchers toward interesting research directions.

References

{1} Bao Y. and Ishii N., "Combining Multiple kNN Classifiers for Text Categorization by Reducts", LNCS 2534, 2002, pp. 340-347.
{2} Bi Y., Bell D., Wang H., Guo G., Greer K., "Combining Multiple Classifiers Using Dempster's Rule of Combination for Text Categorization", MDAI, 2004, pp. 127-138.
{3} Brank J., Grobelnik M., Milic-Frayling N., Mladenic D., "Interaction of Feature Selection Methods and Linear Classification Models", Proc. of the 19th International Conference on Machine Learning, Australia, 2002.
{4} Chawla N. V., Bowyer K. W., Hall L. O., Kegelmeyer W. P., "SMOTE: Synthetic Minority Over-sampling Technique", Journal of AI Research, 16, 2002, pp. 321-357.
{5} Forman G., "An Experimental Study of Feature Selection Metrics for Text Categorization", Journal of Machine Learning Research, 3, 2003, pp. 1289-1305.
{6} Fragoudis D., Meretakis D., Likothanassis S., "Integrating Feature and Instance Selection for Text Classification", SIGKDD '02, July 23-26, 2002, Edmonton, Alberta, Canada.
{7} Guan J., Zhou S., "Pruning Training Corpus to Speedup Text Classification", DEXA 2002, pp. 831-840.
{8} Johnson D. E., Oles F. J., Zhang T., Goetz T., "A decision-tree-based symbolic rule induction system for text categorization", IBM Systems Journal, September 2002.
{9} Han X., Zu G., Ohyama W., Wakabayashi T., Kimura F., "Accuracy Improvement of Automatic Text Classification Based on Feature Transformation and Multiclassifier Combination", LNCS, Volume 3309, Jan 2004, pp. 463-468.
{10} Ke H., Shaoping M., "Text categorization based on Concept indexing and principal component analysis", Proc. TENCON 2002 Conference on Computers, Communications, Control and Power Engineering, 2002, pp. 51-56.
{11} Kehagias A., Petridis V., Kaburlasos V., Fragkou P., "A Comparison of Word- and Sense-Based Text Categorization Using Several Classification Algorithms", JIIS, Volume 21, Issue 3, 2003, pp. 227-247.
{12} Kim S. B., Rim H. C., Yook D. S. and Lim H. S., "Effective Methods for Improving Naive Bayes Text Classifiers", LNAI 2417, 2002, pp. 414-423.
{13} Klopotek M. and Woch M., "Very Large Bayesian Networks in Text Classification", ICCS 2003, LNCS 2657, 2003, pp. 397-406.
{14} Leopold E. and Kindermann J., "Text Categorization with Support Vector Machines. How to Represent Texts in Input Space?", Machine Learning 46, 2002, pp. 423-444.
{15} Lewis D., Yang Y., Rose T., Li F., "RCV1: A New Benchmark Collection for Text Categorization Research", Journal of Machine Learning Research 5, 2004, pp. 361-397.
{16} Heui Lim, "Improving kNN Based Text Classification with Well Estimated Parameters", LNCS, Vol. 3316, Oct 2004, pp. 516-523.
{17} Madsen R. E., Sigurdsson S., Hansen L. K. and Lansen J., "Pruning the Vocabulary for Better Context Recognition", 7th International Conference on Pattern Recognition, 2004.
{18} Montanes E., Quevedo J. R. and Diaz I., "A Wrapper Approach with Support Vector Machines for Text Categorization", LNCS 2686, 2003, pp. 230-237.
{19} Nardiello P., Sebastiani F., Sperduti A., "Discretizing Continuous Attributes in AdaBoost for Text Categorization", LNCS, Volume 2633, Jan 2003, pp. 320-334.
{20} Qiang W., XiaoLong W., Yi G., "A Study of Semi-discrete Matrix Decomposition for LSI in Automated Text Categorization", LNCS, Volume 3248, Jan 2005, pp. 606-615.
{21} Schneider K., "Techniques for Improving the Performance of Naive Bayes for Text Classification", LNCS, Vol. 3406, 2005, pp. 682-693.
{22} Sebastiani F., "Machine Learning in Automated Text Categorization", ACM Computing Surveys, vol. 34 (1), 2002, pp. 1-47.
{23} Shanahan J. and Roma N., "Improving SVM Text Classification Performance through Threshold Adjustment", LNAI 2837, 2003, pp. 361-372.
{24} Soucy P. and Mineau G., "Feature Selection Strategies for Text Categorization", AI 2003, LNAI 2671, 2003, pp. 505-509.
{25} Sousa P., Pimentao J. P., Santos B. R. and Moura-Pires F., "Feature Selection Algorithms to Improve Documents Classification Performance", LNAI 2663, 2003, pp. 288-296.
{26} Torkkola K., "Discriminative Features for Text Document Classification", Proc. International Conference on Pattern Recognition, Canada, 2002.
{27} Vinciarelli A., "Noisy Text Categorization", Proc. 17th International Conference on Pattern Recognition (ICPR'04), 2004, pp. 554-557.
{28} Yang Y., Zhang J. and Kisiel B., "A scalability analysis of classifiers in text categorization", ACM SIGIR'03, 2003, pp. 96-103.
{29} Zu G., Ohyama W., Wakabayashi T., Kimura F., "Accuracy improvement of automatic text classification based on feature transformation", Proc. of the 2003 ACM Symposium on Document Engineering, November 20-22, 2003, pp. 118-120.

Published in
ICCOMP'05: Proceedings of the 9th WSEAS International Conference on Computers
July 2005
768 pages
ISBN: 9608457297
Editor: Nikos E. Mastorakis, Head of the Department of Computer Science, Military Institutes of University Education, Hellenic Naval Academy, Piraeus, Greece
Publisher: World Scientific and Engineering Academy and Society (WSEAS), Stevens Point, Wisconsin, United States
Author Tags
feature selection
learning algorithms
text mining
text representation