research-article

Semantic Indexing of Wearable Camera Images: Kids'Cam Concepts

Authors:
Alan F. Smeaton, Dublin City University, Dublin, Ireland
Kevin McGuinness, Dublin City University, Dublin, Ireland
Cathal Gurrin, Dublin City University, Dublin, Ireland
Jiang Zhou, Dublin City University, Dublin, Ireland
Noel E. O'Connor, Dublin City University, Dublin, Ireland
Peng Wang, Tsinghua University, Beijing, China
Brian Davis, National University of Ireland, Galway, Galway, Ireland
Lucas Azevedo, National University of Ireland, Galway, Galway, Ireland
Andre Freitas, University of Passau, Passau, Germany
Louise Signal, University of Otago, Otago, New Zealand
Moira Smith, University of Otago, Otago, New Zealand
James Stanley, University of Otago, Otago, New Zealand
Michelle Barr, University of Otago, Otago, New Zealand
Tim Chambers, University of Otago, Otago, New Zealand
Cliona Ní Mhurchu, University of Auckland, Auckland, New Zealand

iV&L-MM '16: Proceedings of the 2016 ACM workshop on Vision and Language Integration Meets Multimedia Fusion, October 2016, Pages 27–34
https://doi.org/10.1145/2983563.2983566

Published: 16 October 2016

ABSTRACT

Content-based search over visual media, including images and video, typically relies on manually or automatically assigned concepts or tags, or sometimes on image-to-image similarity, depending on the use case. While great progress has been made in very recent years in automatic concept detection using machine learning, we are still left with a mismatch between the semantics of the concepts we can automatically detect and the semantics of the words used, for example, in a user's query. In this paper we report on a large collection of images from wearable cameras, gathered as part of the Kids'Cam project, which have been both manually annotated from a vocabulary of 83 concepts and automatically annotated from a vocabulary of 1,000 concepts. This collection allows us to explore how language, in the form of two distinct concept vocabularies or spaces, one of which is manually assigned and thus forms a ground truth, is used to represent images, in our case images taken using wearable cameras. It also allows us to discuss, in general terms, the mismatch of concepts in visual media that derives from language mismatches. We report the data processing we have completed on this collection and some of our initial experimentation in mapping across the two vocabularies.
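As an illustration only, one simple way to relate two concept vocabularies assigned to the same images is to count co-occurrences and estimate, for each automatically detected concept, which manual concept it most often appears alongside. The sketch below is not the method used in the paper; the function name `cooccurrence_mapping`, the data layout, and the toy concept labels are all assumptions made for the example and are not drawn from the Kids'Cam vocabularies.

```python
from collections import Counter, defaultdict


def cooccurrence_mapping(manual_tags, auto_tags, top_k=1):
    """Map each automatic concept to the manual concepts it co-occurs with most.

    manual_tags, auto_tags: dict of image id -> set of concept labels.
    Returns: dict of automatic concept -> list of
             (manual concept, estimated P(manual | automatic)) pairs.
    """
    pair_counts = defaultdict(Counter)  # automatic concept -> manual concept counts
    auto_counts = Counter()             # number of images carrying each automatic concept

    for image_id, autos in auto_tags.items():
        manuals = manual_tags.get(image_id, set())
        for a in autos:
            auto_counts[a] += 1
            for m in manuals:
                pair_counts[a][m] += 1

    mapping = {}
    for a, total in auto_counts.items():
        # Rank by descending count, breaking ties alphabetically for determinism.
        ranked = sorted(pair_counts[a].items(), key=lambda kv: (-kv[1], kv[0]))
        mapping[a] = [(m, count / total) for m, count in ranked[:top_k]]
    return mapping


# Toy data with hypothetical labels (not from either real vocabulary).
manual = {"img1": {"food", "screen"}, "img2": {"food"}, "img3": {"outdoor"}}
auto = {"img1": {"plate", "monitor"}, "img2": {"plate"}, "img3": {"park bench"}}

mapping = cooccurrence_mapping(manual, auto)
for concept, targets in sorted(mapping.items()):
    print(concept, "->", targets)
```

A purely count-based mapping like this ignores the meaning of the labels themselves; it only shows how a manually assigned ground truth can serve as the reference side when comparing two vocabularies over the same image collection.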

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning algorithms
2. Information systems
  1. Information storage systems
  2. Information systems applications
    1. Multimedia information systems

Published in
iV&L-MM '16: Proceedings of the 2016 ACM workshop on Vision and Language Integration Meets Multimedia Fusion
October 2016
70 pages
ISBN: 9781450345194
DOI: 10.1145/2983563
General Chairs:
Marie-Francine Moens, KU Leuven, Belgium
Katerina Pastra, Cognitive Systems Research Institute, Greece
Kate Saenko, Boston University, USA
Tinne Tuytelaars, KU Leuven, Belgium
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
concept vocabularies
image tagging
lifelogging
wearable cameras
Acceptance Rates
iV&L-MM '16 Paper Acceptance Rate: 7 of 15 submissions, 47%
Overall Acceptance Rate: 7 of 15 submissions, 47%