ABSTRACT
We explore methods for managing conversational engagement in open-world, physically situated dialog systems. We investigate a self-supervised methodology for constructing forecasting models that anticipate when participants are about to terminate their interactions with a situated system. We study how these models can guide a disengagement policy that uses linguistic hesitation actions, such as filled and unfilled pauses, when uncertainty arises about the continuation of engagement. The hesitations allow additional time for sensing and inference, and convey the system's uncertainty. We report results from a study of the proposed approach with a directions-giving robot deployed in the wild.
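The hesitation-based disengagement policy described above can be sketched as a simple thresholding rule over the forecast model's output. The class name, method names, and threshold values below are illustrative assumptions, not the paper's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class HesitationPolicy:
    """Hypothetical sketch of a threshold policy over a disengagement forecast.

    Threshold values are assumed for illustration only.
    """
    disengage_threshold: float = 0.8  # assumed: forecast confidence to close out
    hesitate_threshold: float = 0.5   # assumed: lower bound of the uncertain band

    def act(self, p_disengage: float) -> str:
        """Map the forecast probability of imminent disengagement to an action."""
        # High confidence the participant is leaving: close the interaction.
        if p_disengage >= self.disengage_threshold:
            return "disengage"
        # Uncertain band: produce a hesitation (e.g., "um...") to buy
        # additional time for sensing and inference, while signaling
        # the system's own uncertainty to the participant.
        if p_disengage >= self.hesitate_threshold:
            return "hesitate"
        # Otherwise, continue the interaction normally.
        return "continue"
```

On each perception update, the system would re-evaluate the forecast and call `act`; repeated "hesitate" outcomes give the sensors more evidence before committing to a disengagement.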
Managing Human-Robot Engagement with Forecasts and... um... Hesitations