Abstract
The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a discussion of lessons learned from running a long-standing, live research platform from the perspective of a research organization. We document best practices and limitations of using the MovieLens datasets in new research.