The MovieLens Datasets: History and Context

Published: 22 December 2015
Abstract

The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a discussion of lessons learned from running a long-standing, live research platform from the perspective of a research organization. We document best practices and limitations of using the MovieLens datasets in new research.

Published in

ACM Transactions on Interactive Intelligent Systems, Volume 5, Issue 4
Regular Articles and Special issue on New Directions in Eye Gaze for Interactive Intelligent Systems (Part 1 of 2)
January 2016
118 pages
ISSN: 2160-6455
EISSN: 2160-6463
DOI: 10.1145/2866565

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 22 December 2015
• Revised: 1 October 2015
• Accepted: 1 October 2015
• Received: 1 July 2015

Qualifiers

• research-article
• Research
• Refereed
