ABSTRACT
Mobile usage patterns often entail high and fluctuating levels of difficulty as well as dual tasking. One major theme explored in this research is whether a flexible multimodal interface supports users in managing cognitive load. Findings from this study reveal that multimodal interface users spontaneously respond to dynamic changes in their own cognitive load by shifting to multimodal communication as load increases with task difficulty and communicative complexity. Given a flexible multimodal interface, users' ratio of multimodal (versus unimodal) interaction increased substantially, from 18.6% when referring to established dialogue context to 77.1% when required to establish a new context, a +315% relative increase. Likewise, the ratio of users' multimodal interaction increased significantly as tasks became more difficult, from 59.2% during low-difficulty tasks to 65.5% at moderate, 68.2% at high, and 75.0% at very high difficulty, an overall relative increase of +27%. Users' task-critical errors and response latencies likewise increased systematically and significantly across task difficulty levels, corroborating the intended manipulation of cognitive processing load. The adaptations seen in this study reflect users' efforts to self-manage limitations in working memory as task complexity increases. They accomplish this by distributing communicative information across multiple modalities, a strategy compatible with a cognitive load theory of multimodal interaction. The long-term goal of this research is to develop an empirical foundation for proactively guiding flexible and adaptive multimodal system design.
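For reference, the relative increases quoted above follow from the reported percentages using (new − old) / old:

\[
\frac{77.1\% - 18.6\%}{18.6\%} \approx 3.15 \approx +315\%, \qquad
\frac{75.0\% - 59.2\%}{59.2\%} \approx 0.27 \approx +27\%.
\]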