Each passing year bears witness to the development of ever more powerful computers, increasingly fast and cheap storage media, and ever higher bandwidth data connections. This makes it easy to believe that we can now, at least in principle, solve any problem we are faced with so long as we only have enough data. Yet this is not the case. Although large databases allow us to retrieve many different single pieces of information and to compute simple aggregations, general patterns and regularities often go undetected. Furthermore, it is exactly these patterns, regularities and trends that are often most valuable. To avoid the danger of "drowning in information, but starving for knowledge", the branch of research known as data analysis has emerged, and a considerable number of methods and software tools have been developed. However, it is not these tools alone but the intelligent application of human intuition in combination with computational power, of sound background knowledge with computer-aided modeling, and of critical reflection with convenient automatic model construction, that results in successful intelligent data analysis projects. Guide to Intelligent Data Analysis provides a hands-on instructional approach to many basic data analysis techniques, and explains how these are used to solve data analysis problems.
Topics and features: guides the reader through the process of data analysis, following the interdependent steps of project understanding, data understanding, data preparation, modeling, and deployment and monitoring; equips the reader with the necessary information to obtain hands-on experience of the topics under discussion; provides a review of the basics of classical statistics that support and justify many data analysis methods, and a glossary of statistical terms; includes numerous examples using R and KNIME, together with appendices introducing the open source software; integrates illustrations and case-study-style examples to support pedagogical exposition.
This practical and systematic textbook/reference for graduate and advanced undergraduate students is also essential reading for all professionals who face data analysis problems. Moreover, it is a book to be used following one's exploration of it.
Dr. Michael R. Berthold is Nycomed-Professor of Bioinformatics and Information Mining at the University of Konstanz, Germany. Dr. Christian Borgelt is Principal Researcher at the Intelligent Data Analysis and Graphical Models Research Unit of the European Centre for Soft Computing, Spain. Dr. Frank Höppner is Professor of Information Systems at Ostfalia University of Applied Sciences, Germany. Dr. Frank Klawonn is a Professor in the Department of Computer Science and Head of the Data Analysis and Pattern Recognition Laboratory at Ostfalia University of Applied Sciences, Germany. He is also Head of the Bioinformatics and Statistics group at the Helmholtz Centre for Infection Research, Braunschweig, Germany.