DOI: 10.1145/2072298.2072312
research-article

Automatic tag generation and ranking for sensor-rich outdoor videos

Published: 28 November 2011

ABSTRACT

Video tag annotations have become a useful and powerful feature for video search in many social media and web applications. Most tags assigned to videos are supplied by users, a task that is time-consuming and may result in annotations that are subjective and lack precision. A number of studies have utilized content-based extraction techniques to automate tag generation. However, these methods are compute-intensive and challenging to apply across domains. Here, we describe a complementary approach for generating tags based on the geographic properties of videos. With today's sensor-equipped smartphones, the location and orientation of a camera can be continuously acquired in conjunction with the captured video stream. Our novel technique utilizes this sensor metadata to automatically tag outdoor videos in a two-step process. First, we model the viewable scenes of the video as geometric shapes by means of its accompanying sensor data and determine the geographic objects that are visible in the video by querying geo-information databases with the viewable scene descriptions. Subsequently, we extract textual information about the visible objects to serve as tags. Second, we define six criteria to score tag relevance and rank the obtained tags based on these scores. We then associate the tags with the video and with its accurately delimited segments. To evaluate the proposed technique, we implemented a prototype tag generator and conducted a user study. The results demonstrate significant benefits of our method in terms of automation and tag utility.
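The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a viewable scene can be approximated by a 2D circular sector in a local planar coordinate frame, it replaces the geo-information database with an in-memory list, and it uses two illustrative scoring terms (closeness and centrality) rather than the six relevance criteria, which the abstract does not enumerate. All names (`FieldOfView`, `GeoObject`, `is_visible`, `score`, `rank_tags`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class FieldOfView:
    """One sensor sample: an assumed sector-shaped viewable scene."""
    x: float            # camera position, metres east (local planar frame)
    y: float            # camera position, metres north
    heading_deg: float  # compass direction of the camera, 0 = north, clockwise
    angle_deg: float    # total horizontal viewable angle, assumed > 0
    max_dist: float     # assumed maximum visible distance in metres, > 0


@dataclass
class GeoObject:
    """A named geographic object, reduced to a single point for simplicity."""
    name: str
    x: float
    y: float


def distance(fov: FieldOfView, obj: GeoObject) -> float:
    return math.hypot(obj.x - fov.x, obj.y - fov.y)


def angular_offset(fov: FieldOfView, obj: GeoObject) -> float:
    """Absolute angle in degrees between the camera heading and the object."""
    # atan2(east, north) yields a compass bearing measured clockwise from north.
    bearing = math.degrees(math.atan2(obj.x - fov.x, obj.y - fov.y)) % 360.0
    diff = (bearing - fov.heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def is_visible(fov: FieldOfView, obj: GeoObject) -> bool:
    """Step 1: the object lies inside the sector (occlusion is ignored here)."""
    return (distance(fov, obj) <= fov.max_dist
            and angular_offset(fov, obj) <= fov.angle_deg / 2.0)


def score(fov: FieldOfView, obj: GeoObject) -> float:
    """Step 2: illustrative relevance score, not the paper's six criteria."""
    closeness = 1.0 - distance(fov, obj) / fov.max_dist
    centrality = 1.0 - angular_offset(fov, obj) / (fov.angle_deg / 2.0)
    return 0.5 * closeness + 0.5 * centrality


def rank_tags(fovs: list[FieldOfView], objects: list[GeoObject]) -> list[str]:
    """Accumulate scores over all sensor samples and sort tags by total score."""
    totals: dict[str, float] = {}
    for fov in fovs:
        for obj in objects:
            if is_visible(fov, obj):
                totals[obj.name] = totals.get(obj.name, 0.0) + score(fov, obj)
    return sorted(totals, key=lambda name: totals[name], reverse=True)
```

Summing scores over the sensor samples means that objects that stay in view longer tend to rank higher. Recording which samples contribute to each tag would be one way to tie a tag to a video segment, although the abstract does not say how the authors delimit segments.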

    • Published in

      MM '11: Proceedings of the 19th ACM international conference on Multimedia
      November 2011
      944 pages
      ISBN: 9781450306164
      DOI: 10.1145/2072298

      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 995 of 4,171 submissions, 24%
