
The MovieLens Datasets: History and Context

Authors:
F. Maxwell Harper, University of Minnesota, Minneapolis, MN
Joseph A. Konstan, University of Minnesota, Minneapolis, MN

ACM Transactions on Interactive Intelligent Systems, Volume 5, Issue 4, Article No. 19, pp. 1–19. https://doi.org/10.1145/2827872

Published: 22 December 2015


Abstract

The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a discussion of lessons learned from running a long-standing, live research platform from the perspective of a research organization. We document best practices and limitations of using the MovieLens datasets in new research.



Published in
ACM Transactions on Interactive Intelligent Systems Volume 5, Issue 4
Regular Articles and Special issue on New Directions in Eye Gaze for Interactive Intelligent Systems (Part 1 of 2)
January 2016
118 pages
ISSN:2160-6455
EISSN:2160-6463
DOI:10.1145/2866565
Editors:
Anthony Jameson, German Research Center for Artificial Intelligence (DFKI), Germany
Krzysztof Gajos, Harvard University, U.S.A.
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 December 2015
- Revised: 1 October 2015
- Accepted: 1 October 2015
- Received: 1 July 2015
Author Tags
Datasets
MovieLens
ratings
recommendations
Qualifiers
- research-article
- Research
- Refereed