research-article

Fashion-focused creative commons social dataset

Authors:
Babak Loni (Delft University of Technology, The Netherlands)
Maria Menendez (University of Trento, Italy)
Mihai Georgescu (L3S Research Center, Germany)
Luca Galli (Polytechnic of Milan, Italy)
Claudio Massari (Innovation Engineering, Italy)
Ismail Sengor Altingovde (L3S Research Center, Germany)
Davide Martinenghi (Polytechnic of Milan, Italy)
Mark Melenhorst (Novay, The Netherlands)
Raynor Vliegendhart (Delft University of Technology, The Netherlands)
Martha Larson (Delft University of Technology, The Netherlands)

MMSys '13: Proceedings of the 4th ACM Multimedia Systems Conference, February 2013, Pages 72–77
https://doi.org/10.1145/2483977.2483984

Published: 28 February 2013

ABSTRACT

In this work, we present a fashion-focused Creative Commons dataset designed to contain a mix of general images and a large component of fashion-focused images (i.e., images relevant to particular clothing items or fashion accessories). The dataset contains 4810 images and related metadata. Furthermore, a ground truth for the images' tags is presented. Ground truth generation for large-scale datasets is a necessary but expensive task, and traditional expert-based approaches have become an expensive, non-scalable solution. For this reason, we turn to crowdsourcing techniques to collect ground truth labels; in particular, we make use of the commercial crowdsourcing platform Amazon Mechanical Turk (AMT). Two different groups of annotators (trusted annotators known to the authors and crowdsourcing workers on AMT) participated in the ground truth creation. Annotation agreement between the two groups is analyzed, and applications of the dataset in different contexts are discussed. This dataset contributes to research areas such as crowdsourcing for multimedia, multimedia content analysis, and the design of systems that can elicit fashion preferences from users.
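The abstract states that annotation agreement between the two annotator groups is analyzed, but it does not say which statistic is used. As an illustration only, the following minimal sketch computes a free-marginal multirater kappa (Randolph's formulation) over categorical labels. The function name, the label values, and the toy data are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def free_marginal_kappa(ratings, num_categories):
    """Free-marginal multirater kappa for categorical labels.

    ratings: list with one entry per item (e.g. per image); each entry is
             the list of labels assigned to that item by its annotators.
    num_categories: number of possible label values (k).
    """
    if num_categories < 2:
        raise ValueError("need at least two categories")
    if not ratings:
        raise ValueError("need at least one rated item")

    per_item_agreement = []
    for labels in ratings:
        n = len(labels)
        if n < 2:
            raise ValueError("each item needs at least two labels")
        counts = Counter(labels)
        # Ordered pairs of annotators that chose the same label.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        per_item_agreement.append(agreeing_pairs / (n * (n - 1)))

    observed = sum(per_item_agreement) / len(per_item_agreement)
    chance = 1.0 / num_categories  # free-marginal chance agreement
    return (observed - chance) / (1.0 - chance)


if __name__ == "__main__":
    # Hypothetical toy data: three images, three annotators each,
    # answering a yes/no question such as "is this image fashion-related?"
    toy_ratings = [
        ["yes", "yes", "yes"],
        ["no", "no", "no"],
        ["yes", "yes", "no"],
    ]
    print(free_marginal_kappa(toy_ratings, num_categories=2))
```

To compare two groups under this sketch, one option would be to compute the statistic separately within each group and then on the pooled labels; whether that matches the analysis in the paper is an assumption.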

Index Terms

Fashion-focused creative commons social dataset
1. Applied computing
  1. Computers in other domains
    1. Digital libraries and archives
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Document collection models
  2. Information systems applications
    1. Digital libraries and archives

Published in
MMSys '13: Proceedings of the 4th ACM Multimedia Systems Conference
February 2013
304 pages
ISBN:9781450318945
DOI:10.1145/2483977
General Chair:
Carsten Griwodz
Simula Research Laboratory & University of Oslo, Norway
Copyright © 2013 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
crowdsourcing
dataset
fashion
multimedia content analysis
Acceptance Rates
MMSys '13 Paper Acceptance Rate: 15 of 63 submissions, 24%
Overall Acceptance Rate: 176 of 530 submissions, 33%