ABSTRACT
Finding a minimal decision tree consistent with a set of training examples is NP-complete. Most existing decision tree induction algorithms therefore take a greedy approach based on local heuristics: they run in a small, fixed amount of time but produce trees that are not globally optimal. Recently, the LSID3 contract anytime algorithm was introduced to exploit additional resources for building better decision trees. A contract anytime algorithm, however, must receive its resource allocation a priori; when the time allocation is not known in advance, contract algorithms cannot be used. To overcome this problem, we present two interruptible anytime algorithms for inducing decision trees. Interruptible anytime algorithms do not require their resource allocation in advance and must therefore be ready to be interrupted and return a valid solution at any moment. The first algorithm applies a general technique for converting a contract algorithm into an interruptible one by sequencing. The second is an iterative improvement algorithm that repeatedly selects the subtree whose reconstruction is estimated to yield the highest marginal utility and rebuilds it with a larger resource allocation. Empirical evaluation shows good anytime behavior for both algorithms, with the iterative improvement algorithm exhibiting smoother performance profiles that allow more refined control.
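For concreteness, here is a minimal Python sketch of the two schemes. It is illustrative only: `contract_learn` stands in for a contract anytime inducer such as LSID3, interruption is modeled with KeyboardInterrupt for simplicity, and `subtrees`, `estimated_gain`, `rebuild_subtree`, `expected_error`, and `replace` are hypothetical helpers, not the paper's actual interface.

```python
def sequenced_induction(examples, contract_learn, first_budget=1.0):
    """Interruptible induction by sequencing: run the contract learner
    with exponentially increasing budgets, keeping the last completed
    tree so a valid answer exists whenever an interrupt arrives."""
    tree, budget = None, first_budget
    try:
        while True:
            tree = contract_learn(examples, budget)  # better tree with more time
            budget *= 2  # doubling schedule; standard sequencing construction
    except KeyboardInterrupt:
        pass  # interrupted: fall through and return the last completed tree
    return tree


def iterative_improvement(tree, estimated_gain, rebuild_subtree):
    """Interruptible induction by iterative improvement: repeatedly pick
    the subtree whose reconstruction is estimated to yield the highest
    marginal utility and rebuild it with a larger resource allocation.
    All tree/helper methods below are hypothetical placeholders."""
    try:
        while True:
            node = max(tree.subtrees(), key=estimated_gain)
            candidate = rebuild_subtree(node)
            if candidate.expected_error() < node.expected_error():
                tree = tree.replace(node, candidate)
    except KeyboardInterrupt:
        pass  # the tree is complete and valid after every iteration
    return tree
```

Doubling the budget between runs is the standard contract-to-interruptible construction and keeps the interruptible version within a constant factor of the contract one; the improvement loop, by contrast, always holds a complete, valid tree between iterations, which is what yields its smoother performance profiles.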