ABSTRACT
Noise in the class labels of any training set can degrade classification performance regardless of the machine learning method used. In this paper, we first formalize the problem of binary classification in the presence of random noise on the class labels, which we call class noise. Class noise is typically modeled by a class noise rate: a small, independent probability that each class label in the training set is inverted. We propose a method to estimate the class noise rate at the level of individual samples in real data. Based on this estimate, we propose two approaches to handling class noise: the first modifies a given surrogate loss function, and the second eliminates class noise by sampling. Furthermore, we prove that, with either approach, the optimal hypothesis on the noisy distribution approximates the optimal hypothesis on the clean distribution. Our methods achieve over 87% accuracy on a synthetic non-separable dataset even when 40% of the labels are inverted, and comparisons show that they outperform state-of-the-art approaches on several benchmark datasets across different domains and noise rates.
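The abstract does not give the paper's exact loss modification, but the first approach it describes belongs to a well-known family: correcting a surrogate loss so that its expectation under randomly flipped labels matches the clean loss (cf. Natarajan et al., 2013). The sketch below is an illustrative assumption, not the paper's method: it uses a known symmetric noise rate rho rather than the per-sample estimates the paper proposes, and all function names and the toy dataset are invented for the demo.

```python
import numpy as np

def logistic_loss(m):
    # Numerically stable log(1 + exp(-m)), m = f(x) * label.
    return np.logaddexp(0.0, -m)

def sigmoid(m):
    return 1.0 / (1.0 + np.exp(-m))

def corrected_loss(m, rho):
    # Unbiased surrogate under symmetric label flips with known rate rho < 0.5:
    # the expectation over the noisy label equals the clean logistic loss.
    return ((1 - rho) * logistic_loss(m) - rho * logistic_loss(-m)) / (1 - 2 * rho)

def fit_linear(X, y, rho, lr=0.1, iters=300):
    # Gradient descent on the corrected logistic loss (still convex for rho < 0.5).
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        m = (X @ w) * y
        # d/dm of corrected_loss(m, rho).
        dldm = ((1 - rho) * (-sigmoid(-m)) - rho * sigmoid(m)) / (1 - 2 * rho)
        w -= lr * (dldm * y) @ X / len(y)
    return w

# Toy demo: linearly separable data with 40% of the labels flipped at random.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2))
y_clean = np.sign(X[:, 0] + X[:, 1])
y_noisy = np.where(rng.random(len(y_clean)) < 0.4, -y_clean, y_clean)

w = fit_linear(X, y_noisy, rho=0.4)
acc = np.mean(np.sign(X @ w) == y_clean)
```

Training on the corrected loss with the noisy labels recovers a separator that is accurate against the clean labels; the per-sample noise-rate estimation in the paper would replace the single global `rho` here.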