Popular ensemble classifier induction algorithms, such as bagging and boosting, construct the ensemble by optimizing each component classifier in isolation. The controllable degrees of freedom in an ensemble include the instance selection and feature selection for each component classifier. Because these degrees of freedom are uncoupled, the component classifiers are not built to optimize the performance of the ensemble; rather, they are constructed by minimizing individual training loss. Recent work in the ensemble literature contradicts the notion that combining the best individually performing classifiers yields the lowest ensemble error rate. Zenobi et al. demonstrated that ensemble construction should consider a classifier's contribution to ensemble accuracy and diversity, even at the expense of individual classifier performance. To trade off individual accuracy against ensemble accuracy and diversity, a component classifier inducer requires knowledge of the choices made by the other ensemble members.
We introduce an approach, called DiSCO, that exercises direct control over the tradeoff between diversity and error by sharing ensemble-wide information on instance selection during training. A classifier's contribution to ensemble accuracy and diversity can be measured as the classifier is constructed, but when it is trained in isolation, without sharing information with its peers in the ensemble, nothing can be done to control that contribution. In this work, we explore a method for training the component classifiers collectively by sharing information about training-set selection. This allows our algorithm to build ensembles whose component classifiers select complementary error distributions, maximizing diversity while minimizing ensemble error directly. Treating ensemble construction as an optimization problem, we explore approaches based on local search, global search, and stochastic methods.
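To make the optimization view concrete, the sketch below hill-climbs over per-classifier instance-selection masks, flipping one instance in or out of one member's training set and keeping the flip only if majority-vote ensemble error does not increase. This is a minimal illustration of the local-search idea under our own assumptions, not DiSCO's actual procedure: the decision-tree components, the training-set error objective (the paper's objective also incorporates diversity), and the names `local_search` and `ensemble_error` are all hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def ensemble_error(masks, X, y):
    """Train one shallow tree per instance-selection mask and return the
    majority-vote error on the full training set (y must be integer-coded)."""
    votes = np.array([
        DecisionTreeClassifier(max_depth=3).fit(X[m], y[m]).predict(X)
        for m in masks
    ])                                                 # shape (L, N)
    majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    return np.mean(majority != y)

def local_search(X, y, n_members=5, n_iters=200, seed=0):
    """Hill-climb over the ensemble's instance-selection masks, accepting
    single-bit flips that do not increase ensemble error."""
    rng = np.random.default_rng(seed)
    n = len(y)
    masks = rng.random((n_members, n)) < 0.63         # bootstrap-like initial masks
    best = ensemble_error(masks, X, y)
    for _ in range(n_iters):
        i, j = rng.integers(n_members), rng.integers(n)
        masks[i, j] = ~masks[i, j]                    # flip one instance in/out
        if masks[i].sum() == 0:                       # never leave a member with no data
            masks[i, j] = ~masks[i, j]
            continue
        err = ensemble_error(masks, X, y)
        if err <= best:
            best = err                                # keep the improving move
        else:
            masks[i, j] = ~masks[i, j]                # revert
    return masks, best
```

On a small integer-labeled dataset, `local_search(X, y)` returns the selected masks and the ensemble's training error; the global-search and stochastic variants differ only in how candidate masks are proposed and accepted.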
Using this approach, we can improve ensemble classification accuracy over bagging and boosting on a variety of datasets, particularly those in which the classes overlap moderately. How to use diversity to build effective classifier teams remains an open question in ensemble classification research; we also provide a method that uses entropy as a measure of diversity when training an ensemble classifier.
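The abstract does not specify which entropy measure is used; one common choice in the diversity literature, and a plausible reading, is the entropy measure of Kuncheva and Whitaker, sketched below over a matrix of per-instance correctness. The function name `entropy_diversity` is ours.

```python
import numpy as np

def entropy_diversity(correct):
    """Entropy measure of ensemble diversity (Kuncheva & Whitaker, 2003).

    correct : (N, L) boolean array; correct[j, i] is True when classifier i
    labels instance j correctly.  Returns a value in [0, 1]: 0 when all
    classifiers agree on every instance, 1 at maximal disagreement.
    """
    N, L = correct.shape
    hits = correct.sum(axis=1)                 # correct votes per instance
    disagreement = np.minimum(hits, L - hits)  # size of the minority per instance
    return disagreement.mean() / (L - np.ceil(L / 2.0))
```

A diversity term of this form can be combined with ensemble error into a single objective for the search procedures described above.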