Spoken dialogue technology: enabling the conversational user interface

Author:
Michael F. McTear

University of Ulster


ACM Computing Surveys, Volume 34, Issue 1, pp. 90–169. https://doi.org/10.1145/505282.505285

Published: 01 March 2002

Abstract

Spoken dialogue systems allow users to interact with computer-based applications such as databases and expert systems by using natural spoken language. The origins of spoken dialogue systems can be traced back to Artificial Intelligence research in the 1950s concerned with developing conversational interfaces. However, it is only within the last decade or so, with major advances in speech technology, that large-scale working systems have been developed and, in some cases, introduced into commercial environments. As a result, many major telecommunications and software companies have become aware of the potential for spoken dialogue technology to provide solutions in newly developing areas such as computer-telephony integration. Voice portals, which provide a speech-based interface between a telephone user and Web-based services, are the most recent application of spoken dialogue technology. This article describes the main components of the technology (speech recognition, language understanding, dialogue management, communication with an external source such as a database, language generation, and speech synthesis) and shows how these component technologies can be integrated into a spoken dialogue system. The article describes in detail the methods that have been adopted in some well-known dialogue systems, explores different system architectures, considers issues of specification, design, and evaluation, reviews some currently available dialogue development toolkits, and outlines prospects for future development.
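As a rough illustration of how the six components named in the abstract might be chained together in a single dialogue turn, the following Python sketch wires stub versions of each stage into a pipeline. It is not taken from the article: the timetable data, the city list, the slot names, and every function and class name are hypothetical, the "recognizer" simply accepts text in place of audio, and the slot-filling dialogue manager is only one of several possible control strategies.

```python
from dataclasses import dataclass, field

# Hypothetical toy data standing in for the external source (a database).
TIMETABLE = {("belfast", "dublin"): "09:15"}
CITIES = {"belfast", "dublin"}


def recognize(audio: str) -> str:
    """Speech recognition stub: the 'audio' is already text in this sketch."""
    return audio.lower()


def understand(words: str) -> dict:
    """Language understanding stub: spot 'from <city>' and 'to <city>'."""
    tokens = words.replace(",", " ").split()
    slots = {}
    for prev, tok in zip(tokens, tokens[1:]):
        if prev == "from" and tok in CITIES:
            slots["origin"] = tok
        elif prev == "to" and tok in CITIES:
            slots["destination"] = tok
    return slots


@dataclass
class DialogueManager:
    """Dialogue management stub: fill two slots, then query the database."""
    slots: dict = field(default_factory=dict)

    def next_action(self, new_slots: dict) -> dict:
        self.slots.update(new_slots)
        for name in ("origin", "destination"):
            if name not in self.slots:
                return {"act": "ask", "slot": name}
        key = (self.slots["origin"], self.slots["destination"])
        # Communication with the external source: a plain dictionary lookup.
        return {"act": "inform", "time": TIMETABLE.get(key)}


def generate(action: dict) -> str:
    """Language generation stub: map a dialogue act to a template sentence."""
    if action["act"] == "ask":
        return f"What is your {action['slot']}?"
    if action["time"] is None:
        return "Sorry, I found no train for that route."
    return f"The next train leaves at {action['time']}."


def synthesize(text: str) -> str:
    """Speech synthesis stub: return the text that would be spoken."""
    return text


def turn(dm: DialogueManager, audio: str) -> str:
    """Run one user turn through all stages in order."""
    return synthesize(generate(dm.next_action(understand(recognize(audio)))))


if __name__ == "__main__":
    dm = DialogueManager()
    print(turn(dm, "I want to go to Dublin"))
    print(turn(dm, "from Belfast"))
```

In this sketch the dialogue manager is the only stateful stage, so the other stubs could be swapped for real recognition, parsing, or synthesis components without changing the control flow in `turn`.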



Reviews

Reviewer: D.C. Charles Hair

A comprehensive survey of current research and development in the area of spoken dialogue technology is presented in this paper. Included as components of that technology are speech recognition; language understanding; dialogue management; communication with external sources such as databases; language generation; and speech synthesis. The paper emphasizes recent advances both within spoken dialogue technology and in underlying computer technology. There is a focus on surveying underlying technologies rather than existing spoken dialogue systems.

Section 1 of the paper gives an overview and an introduction to the subject. Section 2 is used to define the subject and to set forth a classification system. Three examples are given in section 3 to illustrate control strategies used in spoken dialogue systems. In section 4, there is an overview of the components used in spoken dialogue systems. It is noted that spoken dialogue systems are complicated systems that need to integrate these components, where each component itself is a major area of research interest. Section 5 describes methods of dialogue control. Section 6 reviews current trends in the specification, design, and evaluation of spoken dialogue systems. Section 7 describes toolkits that can be used in developing these complex systems. Finally, section 8 discusses future directions. The paper also includes a comprehensive list of related Web sites, and an extensive reference list.

The technology surveyed is important and interesting. The paper itself is well organized and clearly written. This work should be invaluable to anyone with an interest in spoken dialogue systems.

Online Computing Reviews Service


Published in

ACM Computing Surveys Volume 34, Issue 1
March 2002
169 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/505282

Copyright © 2002 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Dialogue management
human computer interaction
language generation
language understanding
speech recognition
speech synthesis