- 1. L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth, Belmont, CA, 1984.
- 2. J. Catlett. Megainduction: Machine Learning on Very Large Databases. PhD thesis, Basser Department of Computer Science, University of Sydney, Sydney, Australia, 1991.
- 3. T. G. Dietterich. Overfitting and undercomputing in machine learning. Computing Surveys, 27:326-327, 1995.
- 4. M. Ester, H.-P. Kriegel, J. Sander, M. Wimmer, and X. Xu. Incremental clustering for mining in a data warehousing environment. In Proceedings of the Twenty-Fourth International Conference on Very Large Data Bases, pages 323-333, New York, NY, 1998. Morgan Kaufmann.
- 5. J. Gehrke, V. Ganti, R. Ramakrishnan, and W.-L. Loh. BOAT: optimistic decision tree construction. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, pages 169-180, Philadelphia, PA, 1999. ACM Press.
- 6. J. Gratch. Sequential inductive learning. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 779-786, Portland, OR, 1996. AAAI Press.
- 7. W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13-30, 1963.
- 8. N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285-318, 1997.
- 9. O. Maron and A. Moore. Hoeffding races: Accelerating model selection search for classification and function approximation. In J. D. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing Systems 6. Morgan Kaufmann, San Mateo, CA, 1994.
- 10. M. Mehta, A. Agrawal, and J. Rissanen. SLIQ: A fast scalable classifier for data mining. In Proceedings of the Fifth International Conference on Extending Database Technology, pages 18-32, Avignon, France, 1996. Springer.
- 11. R. G. Miller, Jr. Simultaneous Statistical Inference. Springer, New York, NY, 2nd edition, 1981.
- 12. A. W. Moore and M. S. Lee. Efficient algorithms for minimizing cross validation error. In Proceedings of the Eleventh International Conference on Machine Learning, pages 190-198, New Brunswick, NJ, 1994. Morgan Kaufmann.
- 13. R. Musick, J. Catlett, and S. Russell. Decision theoretic subsampling for induction on large databases. In Proceedings of the Tenth International Conference on Machine Learning, pages 212-219, Amherst, MA, 1993. Morgan Kaufmann.
- 14. F. Provost, D. Jensen, and T. Oates. Efficient progressive sampling. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 23-32, San Diego, CA, 1999. ACM Press.
- 15. J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.
- 16. J. R. Quinlan and R. M. Cameron-Jones. Oversearching and layered search in empirical learning. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pages 1019-1024, Montreal, Canada, 1995. Morgan Kaufmann.
- 17. J. C. Shafer, R. Agrawal, and M. Mehta. SPRINT: A scalable parallel classifier for data mining. In Proceedings of the Twenty-Second International Conference on Very Large Databases, pages 544-555, Mumbai, India, 1996. Morgan Kaufmann.
- 18. P. Smyth and D. Wolpert. Anytime exploratory data analysis for massive data sets. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, pages 54-60, Newport Beach, CA, 1997. AAAI Press.
- 19. H. Toivonen. Sampling large databases for association rules. In Proceedings of the Twenty-Second International Conference on Very Large Data Bases, pages 134-145, Mumbai, India, 1996. Morgan Kaufmann.
- 20. P. E. Utgoff. Incremental induction of decision trees. Machine Learning, 4:161-186, 1989.
- 21. P. E. Utgoff. An improved algorithm for incremental induction of decision trees. In Proceedings of the Eleventh International Conference on Machine Learning, pages 318-325, New Brunswick, NJ, 1994. Morgan Kaufmann.
- 22. G. I. Webb. OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research, 3:431-465, 1995.
- 23. A. Wolman, G. Voelker, N. Sharma, N. Cardwell, M. Brown, T. Landray, D. Pinnel, A. Karlin, and H. Levy. Organization-based analysis of Web-object sharing and caching. In Proceedings of the Second USENIX Conference on Internet Technologies and Systems, pages 25-36, Boulder, CO, 1999.