Voice User Interface Design
February 2004
Publisher:
  • Addison Wesley Longman Publishing Co., Inc.
  • 350 Bridge Pkwy suite 208 Redwood City, CA
  • United States
ISBN: 978-0-321-18576-1
Published: 01 February 2004
Abstract

This book is a comprehensive and authoritative guide to voice user interface (VUI) design. The VUI is perhaps the most critical factor in the success of any automated speech recognition (ASR) system, determining whether the user experience will be satisfying or frustrating, or even whether the customer will remain one. This book describes a practical methodology for creating an effective VUI design. The methodology is scientifically based on principles in linguistics, psychology, and language technology, and is illustrated here by examples drawn from the authors' work at Nuance Communications, the market leader in ASR development and deployment.

The book begins with an overview of VUI design issues and a description of the technology. The authors then introduce the major phases of their methodology. They first show how to specify requirements and make high-level design decisions during the definition phase. They next cover, in great detail, the design phase, with clear explanations and demonstrations of each design principle and its real-world applications. Finally, they examine problems unique to VUI design in system development, testing, and tuning. Key principles are illustrated with a running sample application.

A companion Web site provides audio clips for each example: www.VUIDesign.org

The cover photograph depicts the first ASR system, Radio Rex: a toy dog who sits in his house until the sound of his name calls him out. Produced in 1911, Rex was among the few commercial successes in earlier days of speech recognition. Voice User Interface Design reveals the design principles and practices that produce commercial success in an era when effective ASRs are not toys but competitive necessities.

Cited By

  1. McTear M, Varghese Marokkie S and Bi Y A Comparative Study of Chatbot Response Generation: Traditional Approaches Versus Large Language Models Knowledge Science, Engineering and Management, (70-79)
  2. Klein A, Kölln K, Deutschländer J and Rauschenberger M Design and Evaluation of Voice User Interfaces: What Should One Consider? Design, Operation and Evaluation of Mobile Communications, (167-190)
  3. Ward N and Avila J (2023). A dimensional model of interaction style variation in spoken dialog, Speech Communication, 149:C, (47-62), Online publication date: 1-Apr-2023.
  4. Hsu W and Lee M (2023). Semantic Technology and Anthropomorphism, Journal of Global Information Management, 31:1, (1-21), Online publication date: 3-Feb-2023.
  5. Guglielmi E, Rosa G, Scalabrino S, Bavota G and Oliveto R Sorry, I don’t Understand: Improving Voice User Interface Testing Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, (1-12)
  6. Ma Q, Zhou R, Zhang C and Chen Z (2022). Rationally or emotionally: how should voice user interfaces reply to users of different genders considering user experience?, Cognition, Technology and Work, 24:2, (233-246), Online publication date: 1-May-2022.
  7. Wei J, Tag B, Trippas J, Dingler T and Kostakos V What Could Possibly Go Wrong When Interacting with Proactive Smart Speakers? A Case Study Using an ESM Application Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, (1-15)
  8. Smith J, Spaulding A, Bratt H, Vergyri D, Acharya G, Precoda K, Kathol A and Richey C Towards Conversationally Intelligent Dialog Systems CHI Conference on Human Factors in Computing Systems Extended Abstracts, (1-7)
  9. Marge M, Espy-Wilson C, Ward N, Alwan A, Artzi Y, Bansal M, Blankenship G, Chai J, Daumé H, Dey D, Harper M, Howard T, Kennington C, Kruijff-Korbayová I, Manocha D, Matuszek C, Mead R, Mooney R, Moore R, Ostendorf M, Pon-Barry H, Rudnicky A, Scheutz M, Amant R, Sun T, Tellex S, Traum D and Yu Z (2022). Spoken language interaction with robots, Computer Speech and Language, 71:C, Online publication date: 1-Jan-2022.
  10. Tsang T and Morris A A Hybrid Quality-of-Experience Taxonomy for Mixed Reality IoT (XRI) Systems 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (1809-1816)
  11. Xiao Z, Mennicken S, Huber B, Shonkoff A and Thom J (2021). Let Me Ask You This: How Can a Voice Assistant Elicit Explicit User Feedback?, Proceedings of the ACM on Human-Computer Interaction, 5:CSCW2, (1-24), Online publication date: 13-Oct-2021.
  12. Capdevila M, Saltiveri T, Garrido J, Müller O and Ruas L Do current user testing practices meet the needs of the new interactive paradigms? Proceedings of the XXI International Conference on Human Computer Interaction, (1-9)
  13. Pieraccini R Natural Language Understanding in Socially Interactive Agents The Handbook on Socially Interactive Agents, (147-172)
  14. Chen C, Mrini K, Charles K, Lifset E, Hogarth M, Moore A, Weibel N and Farcas E Toward a Unified Metadata Schema for Ecological Momentary Assessment with Voice-First Virtual Assistants Proceedings of the 3rd Conference on Conversational User Interfaces, (1-6)
  15. Sciarretta E and Alimenti L Smart Speakers for Inclusion: How Can Intelligent Virtual Assistants Really Assist Everybody? Human-Computer Interaction. Theory, Methods and Tools, (77-93)
  16. Cebrián J, Martínez R, Rodríguez N and D'Haro L (2021). Considerations on creating conversational agents for multiple environments and users, AI Magazine, 42:2, (71-86), Online publication date: 1-Jun-2021.
  17. Kim Y, Reza M, McGrenere J and Yoon D Designers Characterize Naturalness in Voice User Interfaces: Their Goals, Practices, and Challenges Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, (1-13)
  18. Klein A Toward a user experience tool selector for voice user interfaces Proceedings of the 18th International Web for All Conference, (1-2)
  19. Onishi R, Morisaki T, Suzuki S, Mizutani S, Kamigaki T, Fujiwara M, Makino Y and Shinoda H DualBreath: Input Method Using Nasal and Mouth Breathing Proceedings of the Augmented Humans International Conference 2021, (283-285)
  20. Guerino G and Valentim N Usability and User eXperience Evaluation of Conversational Systems Proceedings of the XXXIV Brazilian Symposium on Software Engineering, (427-436)
  21. Klein A, Hinderks A, Schrepp M and Thomaschewski J Construction of UEQ+ scales for voice quality Proceedings of Mensch und Computer 2020, (1-5)
  22. Lee S, Cho M and Lee S (2020). What If Conversational Agents Became Invisible?, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4:3, (1-24), Online publication date: 4-Sep-2020.
  23. Liesenfeld A and Huang C NameSpec asks Proceedings of the 2nd Conference on Conversational User Interfaces, (1-3)
  24. Kirschthaler P, Porcheron M and Fischer J What Can I Say? Proceedings of the 2nd Conference on Conversational User Interfaces, (1-9)
  25. Zhang C, Zhou R, Zhang Y, Sun Y, Zou L and Zhao M How to Design the Expression Ways of Conversational Agents Based on Affective Experience Human-Computer Interaction. Multimodal and Natural Interaction, (302-320)
  26. Stigberg S Human Computer Interfaces Reconsidered: A Conceptual Model for Understanding User Interfaces Human-Computer Interaction. Design and User Experience, (160-171)
  27. Khan T, Yoon D and McGrenere J Designing an Eyes-Reduced Document Skimming App for Situational Impairments Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, (1-14)
  28. Choi D, Kwak D, Cho M and Lee S "Nobody Speaks that Fast!" An Empirical Study of Speech Rate in Conversational Agents for People with Vision Impairments Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, (1-13)
  29. Furini M, Mirri S, Montangero M and Prandi C Do Conversational Interfaces Kill Web Accessibility? 2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC), (1-6)
  30. Stigall B, Waycott J, Baker S and Caine K Older Adults' Perception and Use of Voice User Interfaces Proceedings of the 31st Australian Conference on Human-Computer-Interaction, (423-427)
  31. Yang Z, Yu C, Zheng F and Shi Y (2019). ProxiTalk, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3:3, (1-25), Online publication date: 9-Sep-2019.
  32. Augstein M, Neumayr T and Pimminger S WeldVUI: Establishing Speech-Based Interfaces in Industrial Applications Human-Computer Interaction – INTERACT 2019, (679-698)
  33. Heo J and Lee J CiSA: An Inclusive Chatbot Service for International Students and Academics HCI International 2019 – Late Breaking Papers, (153-167)
  34. Maguire M Development of a Heuristic Evaluation Tool for Voice User Interfaces Design, User Experience, and Usability. Practice and Case Studies, (212-225)
  35. Kontogiorgos D, Pereira A, Andersson O, Koivisto M, Gonzalez Rabal E, Vartiainen V and Gustafson J The Effects of Anthropomorphism and Non-verbal Social Behaviour in Virtual Assistants Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, (133-140)
  36. Kim H, Koh D, Lee G, Park J and Lim Y Designing Personalities of Conversational Agents Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, (1-6)
  37. Sutton S, Foulkes P, Kirk D and Lawson S Voice as a Design Material Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, (1-14)
  38. Fukumoto M SilentVoice Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology, (237-246)
  39. Garcia M, Lopez S and Donis H Voice activated virtual assistants personality perceptions and desires Proceedings of the 32nd International BCS Human Computer Interaction Conference, (1-10)
  40. Luria M, Hoffman G and Zuckerman O Comparing Social Robot, Screen and Voice Interfaces for Smart-Home Control Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, (580-628)
  41. Ward N and DeVault D (2016). Challenges in Building Highly Interactive Dialogue Systems, AI Magazine, 37:4, (7-18), Online publication date: 1-Dec-2016.
  42. Kim K (2016). Interacting Socially with the Internet of Things IoT, Journal of Computer-Mediated Communication, 21:6, (420-435), Online publication date: 1-Nov-2016.
  43. Almeida P, Lima A and Souza D A voice command interface for visually impaired on urban mobility Proceedings of the 14th Brazilian Symposium on Human Factors in Computing Systems, (1-4)
  44. Jeong D, Yi W and Cho J Applications of auditory cues for spatial cognitive behaviors based on embodied music cognition Proceedings of HCI Korea, (58-62)
  45. Shrivastava A and Joshi A Effects of visuals, menu depths, and menu positions on IVR usage by non-tech savvy users Proceedings of the 6th Indian Conference on Human-Computer Interaction, (35-44)
  46. Derboven J, Huyghe J and De Grooff D Designing voice interaction for people with physical and speech impairments Proceedings of the 8th Nordic Conference on Human-Computer Interaction: Fun, Fast, Foundational, (217-226)
  47. Komatsu T, Kobayashi K, Yamada S, Funakoshi K and Nakano M Augmenting expressivity of artificial subtle expressions (ASEs) Proceedings of the 5th Augmented Human International Conference, (1-10)
  48. Carneiro M, de Queiroz J and Fechine J Including uncertainty treatment on the accessibility assessment of DOSVOX system Proceedings of the 7th international conference on Universal Access in Human-Computer Interaction: design methods, tools, and interaction techniques for eInclusion - Volume Part I, (464-473)
  49. Maciel A, Rodrigues R, Barros P and Carvalho E Desenvolvendo soluções com interface baseada em voz [Developing solutions with voice-based interfaces] Companion Proceedings of the 11th Brazilian Symposium on Human Factors in Computing Systems, (59-62)
  50. Tchankue P, Wesson J and Vogts D Are mobile in-car communication systems feasible? Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference, (262-269)
  51. Knutsen D, Bretier P, Ros C, Poletti M, Gimenes M, Rigalleau F and Le Bigot L Reducing user linguistic variability in speech interaction through lexical and syntactic priming Proceedings of the 30th European Conference on Cognitive Ergonomics, (160-167)
  52. Griol D, Molina J and Callejas Z (2012). Bringing together commercial and academic perspectives for the development of intelligent AmI interfaces, Journal of Ambient Intelligence and Smart Environments, 4:3, (183-207), Online publication date: 1-Aug-2012.
  53. Komatsu T, Kobayashi K, Yamada S, Funakoshi K and Nakano M Can users live with overconfident or unconfident systems? CHI '12 Extended Abstracts on Human Factors in Computing Systems, (1595-1600)
  54. Prylipko D, Schnelle-Walka D, Lord S and Wendemuth A Zanzibar OpenIVR Proceedings of the 14th international conference on Text, speech and dialogue, (372-379)
  55. Schnelle-Walka D I tell you something Proceedings of the 16th European Conference on Pattern Languages of Programs, (1-26)
  56. Kim H, Liu D and Kim H Inherent usability problems in interactive voice response systems Proceedings of the 14th international conference on Human-computer interaction: users and applications - Volume Part IV, (476-483)
  57. Stedmon A, Bayon V and Griffiths G (2011). Expanding interaction potentials within virtual environments, Advances in Human-Computer Interaction, 2011, (12-12), Online publication date: 1-Jan-2011.
  58. Lerer A, Ward M and Amarasinghe S Evaluation of IVR data collection UIs for untrained rural users Proceedings of the First ACM Symposium on Computing for Development, (1-8)
  59. Muñoz P, Giner P and Pelechano V Refining interaction designs through simplicity Proceedings of the First international joint conference on Ambient intelligence, (31-40)
  60. Berg M, Thalheim B and Düsterhöft A Integration of dialogue patterns into the conceptual model of storyboard design Proceedings of the 2010 international conference on Advances in conceptual modeling: applications and challenges, (160-169)
  61. Le Bigot L, Caroux L, Ros C, Lacroix A and Botherel V Combien d'options dans les messages vocaux? Proceedings of the 22nd Conference on l'Interaction Homme-Machine, (17-24)
  62. Zhu S, Feng J and Sears A (2010). Investigating Grid-Based Navigation, ACM Transactions on Accessible Computing, 3:1, (1-30), Online publication date: 1-Sep-2010.
  63. Pucher M, Neubarth F and Schabus D Design and development of spoken dialog systems incorporating speech synthesis of Viennese varieties Proceedings of the 12th international conference on Computers helping people with special needs, (361-366)
  64. Schnelle-Walka D A pattern language for error management in voice user interfaces Proceedings of the 15th European Conference on Pattern Languages of Programs, (1-23)
  65. Berg M, Düsterhöft A and Thalheim B Integration of natural language dialogues into the conceptual model of storyboard design Proceedings of the Natural language processing and information systems, and 15th international conference on Applications of natural language to information systems, (196-203)
  66. Jameson A (2009). Understanding and Dealing With Usability Side Effects of Intelligent Processing, AI Magazine, 30:4, (23-40), Online publication date: 1-Dec-2009.
  67. Sporka A, Franc J and Riccardi G Can machines call people? CHI '09 Extended Abstracts on Human Factors in Computing Systems, (3625-3630)
  68. Neto A, Bittar T, Fortes R and Felizardo K Developing and evaluating web multimodal interfaces - a case study with usability principles Proceedings of the 2009 ACM symposium on Applied Computing, (116-120)
  69. Minker W, López-Cózar R and McTear M (2009). The role of spoken language dialogue interaction in intelligent environments, Journal of Ambient Intelligence and Smart Environments, 1:1, (31-36), Online publication date: 1-Jan-2009.
  71. Munzlinger E and Forster C Desenvolvimento e avaliação de um sistema multimodal e multiusuário de navegação web Companion Proceedings of the XIV Brazilian Symposium on Multimedia and the Web, (29-32)
  72. Neto A, Bittar T, Pontin R, Fortes M and Felizardo K Abordagem para o desenvolvimento e avaliação de interfaces multimodais web pautada em princípios de usabilidade Proceedings of the VIII Brazilian Symposium on Human Factors in Computing Systems, (21-30)
  73. Porumb C, Porumb S, Vlaicu A and Orza B Mobile multimedia for improving the administrative and security services Proceedings of the 11th WSEAS International Conference on Computers, (552-556)
  74. Paek T Toward evaluation that leads to best practices Proceedings of the Workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technologies, (40-47)
  75. Sporka A, Kurniawan S, Mahmud M and Slavík P Non-speech input and speech recognition for real-time control of computer games Proceedings of the 8th international ACM SIGACCESS conference on Computers and accessibility, (213-220)
  76. Kehoe A and Pitt I Designing help topics for use with text-to-speech Proceedings of the 24th annual ACM international conference on Design of communication, (157-163)
  77. Chatzichrisafis N, Bouillon P, Rayner M, Santaholma M, Starlander M and Hockey B Evaluating task performance for a unidirectional controlled language medical speech translation system Proceedings of the Workshop on Medical Speech Translation, (5-12)
  78. Fröhlich P Dealing with system response times in interactive speech applications CHI '05 Extended Abstracts on Human Factors in Computing Systems, (1379-1382)
Contributors
  • Nuance Communications, Inc.
