DOI: 10.1145/1835804.1835899

Learning to combine discriminative classifiers: confidence based

Published: 25 July 2010

ABSTRACT

Research in data mining and machine learning has led to numerous practical applications. Spam filtering, fraud detection, and user query-intent analysis rely heavily on machine-learned classifiers and have benefited from improvements in robust classification accuracy. Combining multiple classifiers (a.k.a. ensemble learning) is well studied and known to improve the effectiveness of a classifier. To address two key challenges in ensemble learning -- (1) learning the weights of individual classifiers and (2) the rule for combining their weighted responses -- this paper proposes a novel ensemble classifier, EnLR, that computes weights for the responses of discriminative classifiers and combines their weighted responses into a single response for a test instance. The combination rule aggregates weighted responses, where the weight of an individual classifier is inversely proportional to the variance around its response. Here, variance quantifies the uncertainty of the discriminative classifier's parameters, which in turn depends on the training samples. As opposed to other ensemble methods, where the weight of each individual classifier is learned as part of parameter learning and the same weight is therefore applied to all test instances, our model actively adjusts the weights as individual classifiers become more or less confident in their decisions for a given test instance. Empirical experiments on various data sets demonstrate that our combined classifier produces effective results compared with a single classifier, and statistically significantly better accuracy than the well-known ensemble methods Bagging and AdaBoost. In addition to robust accuracy, our model is extremely efficient at handling high volumes of training samples because its multiple classifiers learn independently of one another. It is simple to implement in a distributed computing environment such as Hadoop.
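The confidence-based combination rule described in the abstract can be sketched as follows. This is a minimal illustration of inverse-variance weighting, not the paper's exact EnLR formulation: the function name and the way per-instance variance estimates are supplied are assumptions for illustration only.

```python
import numpy as np

def combine_responses(responses, variances):
    """Combine per-classifier responses for a single test instance
    using inverse-variance weights: a classifier with lower variance
    (i.e., higher confidence) on this instance contributes more.

    responses : length-k sequence, each classifier's score for the instance
    variances : length-k sequence, each classifier's variance estimate
                for its own response on this instance (must be > 0)
    """
    responses = np.asarray(responses, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances      # inverse-variance weighting
    weights /= weights.sum()       # normalize so the weights sum to 1
    return float(np.dot(weights, responses))
```

Because the variances are estimated per test instance, the combination weights change from instance to instance; this is the contrast the abstract draws with Bagging and AdaBoost, where the weights are fixed once training ends.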


Supplemental Material

kdd2010_lee_lcd_01.mov (MOV, 132.4 MB)


Published in

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010, 1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804

            Copyright © 2010 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
