ABSTRACT
We present a speech-driven digital personal assistant that is robust despite little or no training data and that improves autonomously as it interacts with users. The system establishes and builds common ground with users by signaling understanding and by learning, through interaction, a mapping between the words that users actually speak and the system's actions. We evaluated our system with real users and found an overall positive response. We further show through objective measures that autonomous learning improves performance in a simple itinerary-filling task.
Index Terms
- A Graphical Digital Personal Assistant that Grounds and Learns Autonomously