Automatic tag generation and ranking for sensor-rich outdoor videos

Authors:
Zhijie Shen, National University of Singapore, Singapore, Singapore
Sakire Arslan Ay, National University of Singapore, Singapore, Singapore
Seon Ho Kim, University of Southern California, Los Angeles, CA, USA
Roger Zimmermann, National University of Singapore, Singapore, Singapore

MM '11: Proceedings of the 19th ACM international conference on Multimedia, November 2011, Pages 93–102
https://doi.org/10.1145/2072298.2072312

Published: 28 November 2011

ABSTRACT

Video tag annotations have become a useful and powerful feature to facilitate video search in many social media and web applications. The majority of tags assigned to videos are supplied by users, a task that is time-consuming and may result in annotations that are subjective and lack precision. A number of studies have utilized content-based extraction techniques to automate tag generation. However, these methods are compute-intensive and challenging to apply across domains. Here, we describe a complementary approach for generating tags based on the geographic properties of videos. With today's sensor-equipped smartphones, the location and orientation of a camera can be continuously acquired in conjunction with the captured video stream. Our novel technique utilizes these sensor metadata to automatically tag outdoor videos in a two-step process. First, we model the viewable scenes of the video as geometric shapes by means of its accompanying sensor data and determine the geographic objects that are visible in the video by querying geo-information databases through the viewable scene descriptions. Subsequently, we extract textual information about the visible objects to serve as tags. Second, we define six criteria to score the tag relevance and rank the obtained tags based on these scores. Then we associate the tags with the video and the accurately delimited segments of the video. To evaluate the proposed technique, we implemented a prototype tag generator and conducted a user study. The results demonstrate significant benefits of our method in terms of automation and tag utility.
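
The first step described in the abstract (modeling a viewable scene and finding the geographic objects inside it) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract only says "geometric shapes", so the 2D circular-sector shape, the `ViewableScene` fields, the `is_visible` and `candidate_tags` helpers, and the in-memory object list standing in for a geo-information database are all assumptions made for illustration. The sketch also treats objects as points and ignores occlusion, and it does not attempt the six ranking criteria, which the abstract does not name.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, metres


@dataclass(frozen=True)
class ViewableScene:
    """Hypothetical 2D sector model of what the camera sees in one frame."""
    lat: float          # camera latitude in degrees (from GPS)
    lon: float          # camera longitude in degrees (from GPS)
    heading_deg: float  # compass direction of the camera, clockwise from north
    angle_deg: float    # assumed horizontal viewable angle of the lens
    range_m: float      # assumed maximum visible distance


def offset_m(scene, lat, lon):
    """East/north offset of a point from the camera, in metres.

    Uses an equirectangular approximation, which is adequate only for
    short distances such as a few hundred metres.
    """
    east = EARTH_RADIUS_M * math.radians(lon - scene.lon) * math.cos(math.radians(scene.lat))
    north = EARTH_RADIUS_M * math.radians(lat - scene.lat)
    return east, north


def is_visible(scene, lat, lon):
    """True if the point lies inside the sector (occlusion is ignored)."""
    east, north = offset_m(scene, lat, lon)
    dist = math.hypot(east, north)
    if dist > scene.range_m:
        return False
    if dist == 0.0:
        return True
    bearing = math.degrees(math.atan2(east, north)) % 360.0
    # Smallest absolute angle between the bearing and the camera heading.
    diff = abs((bearing - scene.heading_deg + 180.0) % 360.0 - 180.0)
    return diff <= scene.angle_deg / 2.0


def candidate_tags(scene, objects):
    """Names of all (name, lat, lon) objects that fall inside the scene."""
    return [name for name, lat, lon in objects if is_visible(scene, lat, lon)]


# Hypothetical data: a camera facing north and four made-up point objects.
scene = ViewableScene(lat=1.3000, lon=103.7700, heading_deg=0.0,
                      angle_deg=60.0, range_m=500.0)
objects = [
    ("Object A", 1.3020, 103.7700),  # about 220 m due north
    ("Object B", 1.2980, 103.7700),  # about 220 m due south
    ("Object C", 1.3100, 103.7700),  # about 1.1 km due north
    ("Object D", 1.3000, 103.7720),  # about 220 m due east
]
tags = candidate_tags(scene, objects)
print(tags)
```

With this made-up data only "Object A" is both within range and inside the viewing angle. A real system would replace the list scan with a spatial query against a geo-information database and would then score and rank the resulting tags.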

Index Terms

1. Information systems
  1. Information retrieval

Published in
MM '11: Proceedings of the 19th ACM international conference on Multimedia
November 2011
944 pages
ISBN: 9781450306164
DOI: 10.1145/2072298
General Chairs: K. Selçuk Candan (Arizona State University, USA), Sethuraman Panchanathan (Arizona State University, USA), Balakrishnan Prabhakaran (University of Texas at Dallas, USA)
Program Chairs: Hari Sundaram (Arizona State University, USA), Wu-Chi Feng (Portland State University, USA), Nicu Sebe (University of Trento, Italy)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
geospatial
location sensors
mobile video
video tags
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%