skip to main content
article
Free Access

Learning in Environments with Unknown Dynamics: Towards more Robust Concept Learners

Published:01 December 2007Publication History
Skip Abstract Section

Abstract

In the process of concept learning, target concepts may have portions with short-term changes, other portions may support long-term changes, and yet others may not change at all. For this reason several local windows need to be handled. We suggest facing this problem, which naturally exists in the field of concept learning, by allocating windows which can adapt their size to portions of the target concept. We propose an incremental decision tree that is updated with incoming examples. Each leaf of the decision tree holds a time window and a local performance measure as the main parameter to be controlled. When the performance of a leaf decreases, the size of its local window is reduced. This learning algorithm, called OnlineTree2, automatically adjusts its internal parameters in order to face the current dynamics of the data stream. Results show that it is comparable to other batch algorithms when facing problems with no concept change, and it is better than evaluated methods in its ability to deal with concept drift when dealing with problems in which: concept change occurs at different speeds, noise may be present and, examples may arrive from different areas of the problem domain (virtual drift).

References

  1. David Aha, Dennis Kibler, and Marc Albert. Instance-based learning algorithms. Machine Learning , 6:37-66, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Arthur Asuncion and David J. Newman. UCI machine learning repository. http://www.ics.uci.edu/ mlearn/MLRepository.html, Irvine, CA: University of California, Department of Information and Computer Science, 2007.Google ScholarGoogle Scholar
  3. Janez Dem¿ar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7:1-30, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Thomas G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1923, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Pedro Domingos and Geoff Hulten. Mining high-speed data streams. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 71- 80, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. James Dougherty, Ron Kohavi, and Mehran Sahami. Supervised and unsupervised discretization of continuous features. In Proceedings of the 20th International Conference on Machine Learning, pages 194-202, 1995.Google ScholarGoogle ScholarCross RefCross Ref
  7. Richard Duda and Peter Hart. Pattern Classification and Scene Analysis. Wiley-Interscience, 1973.Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Floriana Esposito, Donato Malerba, and Giovanni Semeraro. A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19 (5):476-491, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Wei Fan. Systematic data selection to mine concept-drifting data streams. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 128-137, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. JoÃo Gama and Pedro Medas. Learning decision trees from dynamic data streams. Journal of Universal Computer Science, 11(8):1353-1366, 2005.Google ScholarGoogle Scholar
  11. JoÃo Gama, Pedro Medas, Gladys Castillo, and Pedro Rodrigues. Learning with drift detection. In Proceedings of the 17th Brazilian Symposium on Artificial Intelligence, pages 286-295, 2004a.Google ScholarGoogle ScholarCross RefCross Ref
  12. JoÃo Gama, Pedro Medas, and Pedro Rodrigues. Learning in dynamic environments: Decision trees for data streams. Pattern Recognition in Information Systems, pages 149-158, 2004b.Google ScholarGoogle Scholar
  13. JoÃo Gama, Ricardo Rocha, and Pedro Medas. Accurate decision trees for mining high-speed data streams. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 523-528, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Michael Harries, Claude Sammut, and Kim Horn. Extracting hidden context. Machine Learning, 32(2):101-126, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13-30, 1963.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Geoff Hulten and Pedro Domingos. VFML - a toolkit for mining high-speed time-changing data streams. http://www.cs.washington.edu/dm/vfml/, 2003.Google ScholarGoogle Scholar
  17. Geoff Hulten, Laurie Spencer, and Pedro Domingos. Mining time-changing data streams. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 97-106, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Ralf Klinkenberg. Learning drifting concepts: Example selection vs. example weighting. Intelligent Data Analysis, 8(3):281-300, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Ralf Klinkenberg and Thorsten Joachims. Detecting concept drift with support vector machines. In Proceedings of the 17th International Conference on Machine Learning, pages 487-494, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Ralf Klinkenberg and Ingrid Renz. Adaptive information filtering: Learning in the presence of concept drifts. In Workshop Notes of the ICML-98 on Workshop on Learning for Text Categorization, pages 33-40, 1998.Google ScholarGoogle Scholar
  21. Ivana Krizakova and Miroslav Kubat. FAVORIT: Concept formation with ageing of knowledge. Pattern Recognition Letters, 13(1):19-25, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Miroslav Kubat and Gerhard Widmer. Adapting to drift in continuous domains (extended abstract). In Proceedings of the 8th European Conference on Machine Learning, pages 307-310, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. James B. MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281-297, 1967.Google ScholarGoogle Scholar
  24. Marcus A. Maloof and Ryszard S. Michalski. Selecting examples for partial memory learning. Machine Learning, 41(1):27-52, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Marcus A. Maloof and Ryszard S. Michalski. Incremental learning with partial instance memory. Artificial Intelligence, 154:95-126, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. John Nagle. On packets switches with infinite storage. IEEE Transactions On Communications, 35 (4):435-438, 1987.Google ScholarGoogle ScholarCross RefCross Ref
  27. Marlon Núñez, Raúl Fidalgo, and Rafael Morales. On-line learning of decision trees in problems with unknown dynamics. In Proceedings of the 4th Mexican International Conference on Artificial Intelligence, pages 443-453, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Vern Paxson and Mark Allman. RFC-2988: Computing TCPs transmission timer. Network Working Group Requests for Comment, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. John Platt. Fast training of support vector machines using sequential minimal optimization. Advances in Kernel Methods - Support Vector Learning, pages 185-208, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Jon Postel. RFC-793: TCP specification. ARPANET Working Group Requests for Comment, DDN Network Information Center, SRI International, 1981.Google ScholarGoogle Scholar
  31. Duncan Potts and Claude Sammut. Incremental learning of linear model trees. Machine Learning, 61:5-48, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. Ross Quinlan. Induction of decision trees. Machine Learning, 1:81-106, 1986. Google ScholarGoogle ScholarCross RefCross Ref
  33. J. Ross Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. W. Nick Street and YongSeog Kim. A streaming ensemble algorithm (SEA) for large-scale classification. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 377-382, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Charles Taylor, Gholamreza Nakhaeizadeh, and Carsten Lanquillon. Structural change and classification. In Workshop Notes of the ICML-97 Workshop on Dynamically Changing Domains: Theory Revision and Context Dependence Issues, pages 67-78, 1997.Google ScholarGoogle Scholar
  36. Paul E. Utgoff, Neil C. Berkman, and Jeffery A. Clouse. Decision tree induction based on efficient tree restructuring. Machine Learning, 29(1):5-44, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Gerhard Widmer and Miroslav Kubat. Learning in the presence of concept drift and hidden contexts. Machine Learning, 23:69-101, 1996. Google ScholarGoogle ScholarCross RefCross Ref
  38. Dwi H. Widyantoro, Thomas R. Ioerger, and John Yen. An adaptive algorithm for learning changes in user interests. In Proceedings of the 8th International Conference on Information and Knowledge Management, pages 405-412, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Ian H. Witten and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2005. (2nd Edition). Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Learning in Environments with Unknown Dynamics: Towards more Robust Concept Learners
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image The Journal of Machine Learning Research
      The Journal of Machine Learning Research  Volume 8, Issue
      12/1/2007
      2736 pages
      ISSN:1532-4435
      EISSN:1533-7928
      Issue’s Table of Contents

      Publisher

      JMLR.org

      Publication History

      • Published: 1 December 2007
      Published in jmlr Volume 8, Issue

      Qualifiers

      • article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader