ABSTRACT
Data streams generated in real time can exhibit strong temporal dependence. In such cases, standard techniques that assume class labels are uncorrelated may perform sub-optimally because that assumption does not hold. To address this problem, we present a new algorithm, based on deferral learning, for classifying temporally correlated data. The approach is suitable for learning over time-varying streams. We show how simple classifiers such as Naive Bayes can improve their performance using this new meta-learning methodology, and we validate the algorithm empirically on several real and artificial datasets.
Index Terms
- Deferral classification of evolving temporal dependent data streams