DOI: 10.1145/2806416.2806554
CIKM Conference Proceedings · Research Article

A Novel Class Noise Estimation Method and Application in Classification

Published: 17 October 2015

ABSTRACT

Noise in the class labels of a training set can lead to poor classification results no matter which machine learning method is used. In this paper, we first present the problem of binary classification in the presence of random noise on the class labels, which we call class noise. To model class noise, a class noise rate is normally defined as a small probability that each class label is independently inverted, uniform over the whole training set. In this paper, we propose a method to estimate the class noise rate at the level of individual samples in real data. Based on the estimation result, we propose two approaches to handling class noise. The first modifies a given surrogate loss function; the second eliminates class noise by sampling. Furthermore, we prove that, under both approaches, the optimal hypothesis on the noisy distribution approximates the optimal hypothesis on the clean distribution. Our methods achieve over 87% accuracy on a synthetic non-separable dataset even when 40% of the labels are inverted. Comparisons with other algorithms show that our methods outperform state-of-the-art approaches on several benchmark datasets from different domains with different noise rates.
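The abstract's first technique, modifying a surrogate loss using estimated noise rates, is not spelled out on this page. The sketch below illustrates the flavor with the standard unbiased-estimator correction of Natarajan et al. (NIPS 2013) under known global flip rates; it is an assumption-laden illustration, not the paper's actual method. The names `logistic_loss`, `corrected_loss`, `rho_pos`, and `rho_neg` are all hypothetical, and the paper's per-sample contribution would replace the global rates with rates estimated for each individual example.

```python
import math

def logistic_loss(t, y):
    """Surrogate loss for a real-valued score t and a label y in {-1, +1}."""
    return math.log1p(math.exp(-y * t))

def corrected_loss(loss, t, y, rho_pos, rho_neg):
    """Noise-corrected surrogate loss (unbiased-estimator sketch).

    Under random label flips with rates rho_pos = P(flip | true label +1)
    and rho_neg = P(flip | true label -1), the expectation of this
    corrected loss over the noisy label equals the loss on the clean label.
    """
    rho_y = rho_pos if y == 1 else rho_neg    # flip rate of the observed class
    rho_ny = rho_neg if y == 1 else rho_pos   # flip rate of the opposite class
    return ((1 - rho_ny) * loss(t, y)
            - rho_y * loss(t, -y)) / (1 - rho_pos - rho_neg)
```

With `rho_pos = rho_neg = 0` the correction reduces to the original loss; in a per-sample setting such as the one the abstract describes, each training example would carry its own estimated flip rate in place of the two global constants.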


Published in

CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
October 2015 · 1998 pages
ISBN: 9781450337946
DOI: 10.1145/2806416
Copyright © 2015 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

CIKM '15 paper acceptance rate: 165 of 646 submissions (26%). Overall acceptance rate: 1,861 of 8,427 submissions (22%).
