research-article
Open Access

YFCC100M: the new data in multimedia research

Published: 25 January 2016

Abstract

This publicly available curated dataset of almost 100 million photos and videos is free and legal for all.

      • Published in

        Communications of the ACM, Volume 59, Issue 2
        February 2016
        110 pages
        ISSN: 0001-0782
        EISSN: 1557-7317
        DOI: 10.1145/2886013
        • Editor: Moshe Y. Vardi

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Popular
        • Refereed
