ABSTRACT
This paper examines user behavior during multimodal human-computer interaction (HCI). It discusses how pointing, natural language, and graphical layout should be integrated to enhance the usability of multimodal systems. Two experiments studied simulated systems capable of understanding written natural language and mouse-supported pointing gestures. The results allowed us to: (a) develop a taxonomy of communication acts aimed at identifying targets; (b) determine the conditions under which specific referent-identification strategies are likely to be produced; (c) suggest guidelines for designing effective multimodal interfaces; and (d) show that performance is strongly influenced by the graphical layout of the interface and by user expertise. Our study confirms the value of simulation as a tool for building HCI models and supports the basic idea that linguistic, visual, and motor cues can be integrated to foster effective multimodal communication.
Visual display, pointing, and natural language: the power of multimodal interaction