Abstract
This publicly available, curated dataset of almost 100 million photos and videos is free and legal for all to use.
Index Terms
- YFCC100M: the new data in multimedia research