Dataset Shift in Machine Learning

February 2009

Publisher: The MIT Press

ISBN: 978-0-262-17005-5

Published: 27 February 2009

Pages: 248

Abstract

Dataset shift is a common problem in predictive modeling that occurs when the joint distribution of inputs and outputs differs between training and test stages. Covariate shift, a particular case of dataset shift, occurs when only the input distribution changes. Dataset shift is present in most practical applications, for reasons ranging from the bias introduced by experimental design to the irreproducibility of the testing conditions at training time. (An example is e-mail spam filtering, which may fail to recognize spam that differs in form from the spam the automatic filter has been built on.) Despite this, and despite the attention given to the apparently similar problems of semi-supervised learning and active learning, dataset shift has received relatively little attention in the machine learning community until recently. This volume offers an overview of current efforts to deal with dataset and covariate shift. The chapters offer a mathematical and philosophical introduction to the problem, place dataset shift in relationship to transfer learning, transduction, local learning, active learning, and semi-supervised learning, provide theoretical views of dataset and covariate shift (including decision theoretic and Bayesian perspectives), and present algorithms for covariate shift.

Contributors: Shai Ben-David, Steffen Bickel, Karsten Borgwardt, Michael Brückner, David Corfield, Amir Globerson, Arthur Gretton, Lars Kai Hansen, Matthias Hein, Jiayuan Huang, Takafumi Kanamori, Klaus-Robert Müller, Sam Roweis, Neil Rubens, Tobias Scheffer, Marcel Schmittfull, Bernhard Schölkopf, Hidetoshi Shimodaira, Alex Smola, Amos Storkey, Masashi Sugiyama, Choon Hui Teo.

Neural Information Processing series
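
To make the abstract's definition of covariate shift concrete, the following is a minimal sketch of importance weighting, one standard correction for it. It is not taken from the book or from any particular chapter. Everything specific here is an assumption made for illustration: the Gaussian input distributions, the sine labeling rule, the fixed predictor `predict`, and above all the premise that both input densities are known exactly (in practice the density ratio has to be estimated).

```python
import math
import random


def gaussian_pdf(x, mean, std=1.0):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sample(mean, n, rng):
    """Draw n (x, y) pairs. Only the input mean differs between domains;
    the labeling rule y = sin(x) + noise is shared (covariate shift)."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(mean, 1.0)
        y = math.sin(x) + 0.1 * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


def predict(x):
    """Hypothetical fixed model: accurate near x = 0, poor farther away."""
    return x


def squared_loss(x, y):
    return (y - predict(x)) ** 2


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Assumed setup: training inputs ~ N(0, 1), test inputs ~ N(1, 1).
TRAIN_MEAN, TEST_MEAN = 0.0, 1.0
rng = random.Random(0)
n = 50_000
train = sample(TRAIN_MEAN, n, rng)
test = sample(TEST_MEAN, n, rng)

# Plain training-set risk: ignores the change in the input distribution.
unweighted_risk = average(squared_loss(x, y) for x, y in train)

# Importance-weighted training risk: each training loss is scaled by
# w(x) = p_test(x) / p_train(x), using the (assumed known) densities.
weighted_risk = average(
    gaussian_pdf(x, TEST_MEAN) / gaussian_pdf(x, TRAIN_MEAN) * squared_loss(x, y)
    for x, y in train
)

# Risk measured directly on test samples, for comparison only.
test_risk = average(squared_loss(x, y) for x, y in test)

print(f"unweighted training risk: {unweighted_risk:.3f}")
print(f"weighted training risk:   {weighted_risk:.3f}")
print(f"risk on test samples:     {test_risk:.3f}")
```

Under these assumptions the unweighted training risk understates the error on the shifted test inputs, while the reweighted training risk should land much closer to the risk measured on test samples, without using any test labels.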

Contributors

Joaquin Quiñonero-Candela
- Publication Years: 2009 - 2009
- Publication counts: 1
- Citation count: 189
- Available for Download: 0
- Downloads (cumulative): 0
- Downloads (12 months): 0
- Downloads (6 weeks): 0
- Average Downloads per Article: 0
- Average Citation per Article: 189
Masashi Sugiyama
Riken
- Publication Years: 1999 - 2024
- Publication counts: 242
- Citation count: 2,136
- Available for Download: 63
- Downloads (cumulative): 11,145
- Downloads (12 months): 1,698
- Downloads (6 weeks): 230
- Average Downloads per Article: 177
- Average Citation per Article: 9
Anton Schwaighofer
Microsoft Corporation
- Publication Years: 2001 - 2013
- Publication counts: 15
- Citation count: 550
- Available for Download: 5
- Downloads (cumulative): 2,880
- Downloads (12 months): 77
- Downloads (6 weeks): 11
- Average Downloads per Article: 576
- Average Citation per Article: 37
Neil D Lawrence
University of Cambridge
- Publication Years: 1997 - 2023
- Publication counts: 73
- Citation count: 1,545
- Available for Download: 21
- Downloads (cumulative): 37,876
- Downloads (12 months): 20,906
- Downloads (6 weeks): 2,670
- Average Downloads per Article: 1,804
- Average Citation per Article: 21

Index Terms

Dataset Shift in Machine Learning
1. Computing methodologies
  1. Machine learning
2. General and reference
