A Stochastic Model of Human-Machine Interaction for Learning Dialog Strategies
IEEE Transactions on Speech and Audio Processing, 2000
Cited by 122 (3 self)
Abstract—In this paper, we propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) is available within the MDP framework, based on reinforcement learning. For an effective use of the available training data we propose a combination of supervised and reinforcement learning: the supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user’s behavior. Then a reinforcement learning algorithm is used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups. Index Terms—Dialog systems, Markov decision process, reinforcement learning, sequential decision process, speech, spoken
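The idea of learning a dialog strategy by letting a reinforcement learner interact with a simulated user can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-slot toy task, the cost values, the hand-set answer probability `P_ANSWER` (the paper estimates the user model from data by supervised learning), and the choice of tabular Q-learning as the reinforcement learning algorithm.

```python
import random

# Hypothetical toy task: the system must fill two slots before closing.
N_SLOTS = 2
ACTIONS = ("ask", "close")
P_ANSWER = 0.8            # assumed chance that the simulated user fills a slot
TURN_COST = 1.0           # assumed cost of each question
MISSING_SLOT_COST = 10.0  # assumed cost per unfilled slot when closing


def simulated_user_step(filled, action, rng):
    """Return (next_state, cost, done) for one dialog turn."""
    if action == "close":
        return filled, MISSING_SLOT_COST * (N_SLOTS - filled), True
    if filled < N_SLOTS and rng.random() < P_ANSWER:
        filled += 1
    return filled, TURN_COST, False


def learn_strategy(episodes=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Tabular Q-learning that minimizes expected total dialog cost."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state, done, turns = 0, False, 0
        while not done and turns < 50:
            if rng.random() < epsilon:          # explore
                action = rng.choice(ACTIONS)
            else:                               # exploit: lowest estimated cost
                action = min(ACTIONS, key=lambda a: q[(state, a)])
            nxt, cost, done = simulated_user_step(state, action, rng)
            future = 0.0 if done else min(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (cost + future - q[(state, action)])
            state = nxt
            turns += 1
    # The learned strategy maps each state to its lowest-cost action.
    return {s: min(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}


strategy = learn_strategy()
```

Under these assumed costs, asking is cheaper in expectation than closing early, so the learner should end up asking until both slots are filled and only then closing the dialog.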
Conversational Interfaces: Advances and Challenges
2000
Cited by 61 (4 self)
The last decade has witnessed the emergence of a new breed of human computer interfaces that combines several human language technologies to enable information access and transactional processing using spoken dialogue. In this paper, I discuss my view on the research issues involved in the development of such interfaces, describe the recent work done in this area at the MIT Laboratory for Computer Science, and outline some of the unmet research challenges, including the need to work in real domains, spoken language generation, and portability across domains and languages.
Talking To Machines (Statistically Speaking)
Cited by 31 (10 self)
Statistical methods have long been the dominant approach in speech recognition and probabilistic modelling in ASR is now a mature technology. The use of statistical methods in other areas of spoken dialogue is however more recent and rather less mature. This paper reviews spoken dialogue systems from a statistical modelling perspective. The complete system is first presented as a partially observable Markov decision process. The various sub-components are then exposed by introducing appropriate intermediate variables. Samples of existing work are reviewed within this framework, including dialogue control and optimisation, semantic interpretation, goal detection, natural language generation and synthesis.
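To make the partially observable Markov decision process view concrete, here is a minimal sketch of the standard POMDP belief update, in which the system keeps a distribution over hidden user states and revises it after each noisy observation. The state names, the transition table and the observation probabilities are invented for illustration and do not come from the paper; the models are assumed to be those for a system action that has already been chosen.

```python
def belief_update(belief, transition, observation_prob, observation):
    """Return the posterior belief b'(s') proportional to
    P(o | s') * sum_s P(s' | s) * b(s)."""
    states = list(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = {
        s2: sum(transition[s][s2] * belief[s] for s in states) for s2 in states
    }
    # Correction step: weight by the likelihood of the observation.
    unnormalized = {
        s2: observation_prob[s2][observation] * predicted[s2] for s2 in states
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical example: the user's goal is fixed but the recognizer is noisy.
TRANSITION = {
    "flights": {"flights": 1.0, "hotels": 0.0},
    "hotels": {"flights": 0.0, "hotels": 1.0},
}
OBSERVATION_PROB = {
    "flights": {"heard_flights": 0.8, "heard_hotels": 0.2},
    "hotels": {"heard_flights": 0.3, "heard_hotels": 0.7},
}

prior = {"flights": 0.5, "hotels": 0.5}
posterior = belief_update(prior, TRANSITION, OBSERVATION_PROB, "heard_flights")
```

With these assumed numbers the belief shifts toward "flights" after the recognizer reports it, without ever committing to a single hypothesis.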
Challenges For Spoken Dialogue Systems
In Proceedings of 1999 IEEE ASRU Workshop, 1999
Cited by 24 (0 self)
The past decade has seen the development of a large number of spoken dialogue systems around the world, both as research prototypes and commercial applications. These systems allow users to interact with a machine to retrieve information, conduct transactions, or perform other problem-solving tasks. In this paper we discuss some of the design issues which confront developers of spoken dialogue systems, provide some examples of research being undertaken in this area, and describe some of the ongoing challenges facing current spoken language technology.
Probabilistic Methods in Spoken Dialogue Systems
Philosophical Transactions of the Royal Society (Series A), 1999
Cited by 24 (5 self)
This paper presents a probabilistic framework for modelling spoken dialogue systems. On the assumption that the overall system behaviour can be represented as a Markov Decision Process, the optimisation of dialogue management strategy using reinforcement learning is reviewed. Examples of learning behaviour are presented for both dynamic programming and sampling methods, but the latter is preferred. The paper concludes by noting the importance of user simulation models for the practical application of these techniques and the need for developing methods of mapping system features in order to achieve sufficiently compact state spaces.
The Thoughtful Elephant: Strategies for Spoken Dialog Systems
IEEE Transactions on Speech and Audio Processing, 2000
Cited by 19 (0 self)
In this paper we present technology used in spoken dialog systems for applications of a wide range. They include tasks from the travel domain and automatic switchboards as well as large scale directory assistance. The overall goal in developing spoken dialog systems is to allow for a natural and flexible dialog flow similar to human--human interaction. This imposes the challenging task to recognize and interpret user input, where he/she is allowed to choose from an unrestricted vocabulary and an infinite set of possible formulations. We therefore put emphasis on strategies that make the system more robust while still maintaining a high level of naturalness and flexibility. In view of this paradigm, we found that two fundamental principles characterize many of the proposed methods: 1) to consider available sources of information as early as possible, and 2) to keep alternative hypotheses and delay the decision for a single option as long as possible. We describe
A voice-controlled automatic telephone switchboard and directory information system
Speech Communication, 1997
Cited by 17 (5 self)
The Philips automatic telephone switchboard and directory information system PADIS provides a natural-language user interface to a telephone directory database. Using speech recognition and language understanding technologies, the system offers phone numbers, fax numbers, email addresses, and room numbers as well as direct call completion to a desired party. In this paper, we present the underlying probabilistic framework, the system architecture, and the individual modules for speech recognition, language understanding, dialogue control, and speech output. In addition, we report results on performance and user behaviour obtained from a field test in our research lab with a 600-entry database. We derive a new maximum-a-posteriori decision rule which incorporates database knowledge and dialogue history as constraints in speech recognition and language understanding. It has improved speech understanding accuracy by 19% (in terms of concept error rate), and reduced attribute substitution errors (e.g. recognition of a wrong name) by 38%. The decision rule is implemented in a multi-stage approach as a combination of state-of-the-art speech recognition, partial parsing with an attributed stochastic context-free grammar, and an N-best algorithm which is also described in this paper. The system conducts a flexible mixed-initiative dialogue rather than using a rigid form-filling scheme, and incorporates database knowledge to optimize the dialogue flow.
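The general idea of a maximum-a-posteriori decision over an N-best list with database knowledge as a constraint can be sketched as follows. This is not the decision rule derived in the paper, whose exact form is not given in the abstract; the function name `map_rescore`, the scores and the prior values are hypothetical. The sketch simply adds a log prior derived from the database to each hypothesis's recognizer log-likelihood and discards hypotheses the database rules out.

```python
import math


def map_rescore(nbest, db_prior):
    """Pick the hypothesis maximizing log-likelihood + log prior.

    nbest:    list of (hypothesis, log_likelihood) pairs from the recognizer
    db_prior: dict mapping hypotheses to a prior probability; hypotheses
              missing from it are treated as impossible (assumed convention)
    """
    best, best_score = None, -math.inf
    for hypothesis, log_likelihood in nbest:
        prior = db_prior.get(hypothesis, 0.0)
        if prior <= 0.0:
            continue  # ruled out by the database constraint
        score = log_likelihood + math.log(prior)
        if score > best_score:
            best, best_score = hypothesis, score
    return best


# Hypothetical example: the acoustically best name is not in the directory.
nbest_list = [("Smyth", -1.0), ("Smith", -1.2), ("Schmidt", -3.0)]
directory_prior = {"Smith": 0.01, "Schmidt": 0.01}
chosen = map_rescore(nbest_list, directory_prior)
```

In this toy case the database constraint overrides the top recognizer hypothesis, which is the kind of attribute substitution error (a wrong name) that such a rule is meant to reduce.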
The Use of Belief Networks for Mixed-Initiative Dialog Modeling
Proceedings of ICSLP, 2000
Cited by 16 (2 self)
... for mixed-initiative dialog modeling. The BN-based framework was previously used for natural language understanding, where BNs infer the informational goal of the user’s query based on its parsed semantic concepts. We extended this framework with the technique of backward inference that can automatically detect missing or spurious concepts based on the inferred goal. This is, in turn, used to drive the mixed-initiative dialog model that prompts for missing concepts and clarifies for spurious concepts. Applicability is demonstrated for a simple foreign exchange domain, and our framework’s mixed-initiative interactions were shown to be superior to the system-initiative and user-initiative interactions. We also investigate the scalability and portability of the BN-based framework to the more complex air travel (ATIS) domain. Backward inference detected an increased number of missing and spurious concepts, which led to redundancies in the dialog model. We experimented with several remedial measures that showed promise in reducing the redundancies. We also present a set of principles for hand-assigning “degrees of belief” to the BN to reduce the demand for massive training data when porting to a new domain. Experimentation with the ATIS data also gave promising results. Index Terms—Belief networks, dialog modeling, mixed-initiative.
Improving Speech Understanding By Incorporating Database Constraints And Dialogue History
In Proc. ICSLP, 1996
Cited by 11 (5 self)
In the course of a "man-machine" dialogue, the system's belief concerning the user's intention is continuously being built up. Moreover, restricting the discourse to a narrow application domain further constrains the variety of possible user reactions. In this paper, we will show how these knowledge sources may be utilized in a stochastic framework to improve speech understanding. On field-test data collected with our automatic exchange board prototype PADIS, a relative reduction of attribute errors by 27% has been obtained.
Combination of CFG and N-gram Modeling in Semantic Grammar Learning
In Proceedings of the Eurospeech Conference, 2003
Cited by 9 (2 self)
SGStudio is a grammar authoring tool that eases semantic grammar development. It is capable of integrating different information sources and learning from annotated examples to induct CFG rules. In this paper, we investigate a modification to its underlying model by replacing CFG rules with n-gram statistical models. The new model is a composite of HMM and CFG. The advantages of the new model include its built-in robust feature and its scalability to an n-gram classifier when the understanding does not involve slot filling. We devised a decoder for the model. Preliminary results show that the new model achieved 32% error reduction in high resolution understanding.

