ABSTRACT
This paper examines user behavior during multimodal human-computer interaction (HCI). It discusses how pointing, natural language, and graphical layout should be integrated to enhance the usability of multimodal systems. Two experiments studied simulated systems capable of understanding written natural language and mouse-supported pointing gestures. The results allowed us to: (a) develop a taxonomy of communication acts aimed at identifying targets; (b) determine the conditions under which specific referent-identification strategies are likely to be produced; (c) suggest guidelines for designing effective multimodal interfaces; and (d) show that performance is strongly influenced by the graphical layout of the interface and by user expertise. Our study confirms the value of simulation as a tool for building HCI models and supports the basic idea that linguistic, visual, and motor cues can be integrated to foster effective multimodal communication.
Visual display, pointing, and natural language: the power of multimodal interaction