DOI: 10.1145/3173386.3176994

Reasonable Perception: Connecting Vision and Language Systems for Validating Scene Descriptions

Published: 01 March 2018

ABSTRACT

Understanding explanations of machine perception is an important step towards developing accountable, trustworthy machines. Speech and vision are the primary modalities by which humans collect information about the world, yet linking the visual and natural language domains is a relatively new pursuit in computer vision, and it is difficult to test performance in a safe environment. To couple human visual understanding and machine perception, we present an explanatory system for creating a library of possible context-specific actions associated with 3D objects in immersive virtual worlds. We also contribute a novel scene description dataset, generated natively in virtual reality, containing speech, image, gaze, and acceleration data. We discuss the development of a hybrid machine learning algorithm linking vision data with environmental affordances in natural language. Our findings demonstrate that it is possible to develop a model that can generate interpretable verbal descriptions of possible actions associated with recognized 3D objects within immersive VR environments.


      • Published in

        HRI '18: Companion of the 2018 ACM/IEEE International Conference on Human-Robot Interaction
        March 2018
        431 pages
        ISBN: 9781450356152
        DOI: 10.1145/3173386

        Copyright © 2018 Owner/Author

        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • abstract

        Acceptance Rates

        HRI '18 Paper Acceptance Rate: 49 of 206 submissions, 24%
        Overall Acceptance Rate: 192 of 519 submissions, 37%