DOI: 10.5555/782096.782104

Email classification with co-training

Published: 05 November 2001

ABSTRACT

The main obstacles in text classification are the scarcity of labeled data and the cost of labeling unlabeled data. We address these problems by exploring co-training, an algorithm that uses unlabeled data along with a few labeled examples to boost the performance of a classifier. We experiment with co-training in the email domain. Our results show that the performance of co-training depends on the learning algorithm it uses; in particular, Support Vector Machines significantly outperform Naive Bayes on email classification.
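The co-training loop the abstract refers to (two classifiers, each trained on its own "view" of the data, labeling unlabeled examples for the other) can be sketched as follows. This is a toy illustration under strong simplifications, not the paper's implementation: a look-up-table classifier stands in for Naive Bayes or an SVM, every prediction is treated as "confident", and all function names and data are illustrative.

```python
# Toy co-training sketch: each example carries two feature views
# (e.g. sender domain and a subject word); a trivial majority-label
# table stands in for a real per-view classifier.

def train(labeled, view):
    """Fit a trivial classifier on one view: for each feature value
    seen in that view, remember the majority label."""
    counts = {}
    for views, label in labeled:
        key = views[view]
        counts.setdefault(key, {}).setdefault(label, 0)
        counts[key][label] += 1
    return {k: max(v, key=v.get) for k, v in counts.items()}

def predict(model, views, view):
    """Return the remembered label, or None for an unseen value."""
    return model.get(views[view])

def co_train(labeled, unlabeled, rounds=5):
    """Grow the labeled set by letting each view's classifier label
    examples the other view can then learn from. A real implementation
    would add only each classifier's most confident predictions per
    round; here any prediction counts as confident."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        m0 = train(labeled, 0)
        m1 = train(labeled, 1)
        newly_labeled = []
        for views in pool:
            p0 = predict(m0, views, 0)
            p1 = predict(m1, views, 1)
            p = p0 if p0 is not None else p1
            if p is not None:
                newly_labeled.append((views, p))
        for views, label in newly_labeled:
            pool.remove(views)
            labeled.append((views, label))
        if not newly_labeled:
            break  # no classifier could label anything new
    return train(labeled, 0), train(labeled, 1)

# Two labeled emails, two unlabeled ones (illustrative data).
labeled = [(("work.com", "meeting"), "work"), (("shop.com", "sale"), "spam")]
unlabeled = [("work.com", "agenda"), ("shop.com", "discount")]
m0, m1 = co_train(labeled, unlabeled)
```

After one round, the domain-view classifier has labeled both unlabeled emails, so the subject-view classifier learns "agenda" and "discount" even though neither word appeared in the original labeled set; this transfer between views is the mechanism co-training exploits.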


Published in

CASCON '01: Proceedings of the 2001 conference of the Centre for Advanced Studies on Collaborative research
November 2001, 230 pages

Publisher: IBM Press

Overall acceptance rate: 24 of 90 submissions, 27%
