An Architecture for a Generic Dialogue Shell
2000
Cited by 72 (21 self)
(Draft of 2/00, to appear in Natural Language Engineering, 2000.) ... semantic hierarchy and to a world KB manager that handles queries about the current situation, managing the interfaces to domain-dependent reasoners and knowledge bases as needed. One of the key things to note about this architecture is the separation of the basic dialogue system components from the more domain-specific components that provide the application (shown within the dotted lines at the lower left corner of Figure 1). To illustrate this separation, consider a specific example: a travel-agent application. The back-end would provide schedule and reservation information, booking, and so on, much as current computer systems provide to human travel agents. The behavioral agent and plan manager would be driven from a specification of desired behavior of the system as a travel agent, including the actions it typically will be asked to perform (e.g., what information is relevant to ...
A unified context-free grammar and n-gram model for spoken language processing
In International Conference on Acoustics, Speech, and Signal Processing, 2000
Cited by 20 (5 self)
While context-free grammars (CFGs) remain as one of the most important formalisms for interpreting natural language, word n-gram models are surprisingly powerful for domain-independent applications. We propose to unify these two formalisms for both speech recognition and spoken language understanding (SLU). With portability as the major problem, we incorporated domain-specific CFGs into a domain-independent n-gram model that can improve generalizability of the CFG and specificity of the n-gram. In our experiments, the unified model can significantly reduce the test set perplexity from 378 to 90 in comparison with a domain-independent word trigram. The unified model converges well when the domain-specific data becomes available. The perplexity can be further reduced from 90 to 65 with a limited amount of domain-specific data. While we have demonstrated excellent portability, the full potential of our approach lies in its unified recognition and understanding that we are investigating.
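The perplexity figures quoted in this abstract use the standard metric: the exponential of the average negative log-probability that a model assigns to each test-set word. A minimal sketch of that calculation follows; the function name and the probabilities are hypothetical and are not taken from the paper.

```python
import math

def perplexity(word_probs):
    """Perplexity = exp(-(1/N) * sum(log p_i)) over the N test-set words."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))

# Hypothetical per-word probabilities, for illustration only.
# A model that gives every word probability 0.01 has perplexity close to 100.
print(perplexity([0.01] * 5))
```

Lower perplexity means the model spreads its probability over fewer plausible next words, which is why the drop from 378 to 90 is reported as an improvement.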
The Medication Advisor Project: Preliminary Report
2002
Cited by 11 (8 self)
The Medication Advisor is the latest project of the Conversational Interaction and Spoken Dialogue research group at the University of Rochester. The goal of the project is an intelligent assistant that interacts with its users via conversational natural language, and provides them with information and advice regarding their prescription medications. Managing prescription drug regimens is a major problem, particularly for older people living at home who tend to have both complex medication schedules and, often, somewhat reduced faculties for keeping track of them. Patient compliance with prescribed regimens is notoriously low, leading to incorrect and sometimes harmful usage of both prescribed and over-the-counter medications. The Medication Advisor builds on our prior experience constructing conversational assistants in other domains. In addition to providing new challenges, the project allows us to validate previous efforts in areas such as portability. This brief report details our initial efforts and outlines our future direction.
Balancing data-driven and rule-based approaches in the context of a multimodal conversational system
In Proceedings of the Human Language Technology Conference (HLT-NAACL), 2004
Cited by 10 (1 self)
Moderate-sized rule-based spoken language models for recognition and understanding are easy to develop and provide the ability to rapidly prototype conversational applications. However, scalability of such systems is a bottleneck due to the heavy cost of authoring and maintenance of rule sets and inevitable brittleness due to lack of coverage in the rule sets. In contrast, data-driven approaches are robust and the procedure for model building is usually simple. However, the lack of data in a particular application domain limits the ability to build data-driven models. In this paper, we address the issue of combining data-driven and grammar-based models for rapid prototyping of robust speech recognition and understanding models for a multimodal conversational system. We also present methods that reuse data from different domains and investigate the limits of such models in the context of a particular application domain.
Software architectures for incremental understanding of human speech
In Proceedings of Interspeech/ICSLP, 2006
Cited by 9 (1 self)
The prevalent state of the art in spoken language understanding by spoken dialog systems is both modular and pipelined. It is modular in the sense that incoming utterances are processed by independent modules that handle different aspects of the signal, such as acoustics, syntax, semantics, and intention / goal recognition. It is pipelined in the sense that each module completes its work for an entire utterance prior to handing off the utterance to the next module. However, a growing body of evidence from the human language understanding literature suggests that humans do not process language in a modular, pipelined way. Rather, they process speech by rapidly integrating constraints from multiple sources of knowledge and multiple linguistic levels incrementally, as the utterance unfolds. In this paper we describe ongoing work aimed at developing an architecture that will allow machines to understand spoken language in a similar way. This revolutionary approach is promising for two reasons: 1) It more accurately reflects contemporary models of human language understanding, and 2) it results in technical improvements including increased parsing performance.
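The contrast between pipelined and incremental processing can be illustrated with a toy sketch. It is not the architecture described in the paper: the stage functions are hypothetical, and the incremental version simply re-runs every stage on each growing prefix, so it shows only the timing difference (a partial result after every word) and none of the cross-level constraint integration the abstract refers to.

```python
def pipelined(words, stages):
    """Each stage sees the complete utterance before the next stage starts."""
    data = list(words)
    for stage in stages:
        data = stage(data)
    return data

def incremental(words, stages):
    """Run all stages on each growing prefix, yielding one partial result per word."""
    prefix = []
    for word in words:
        prefix.append(word)
        data = list(prefix)
        for stage in stages:
            data = stage(data)
        yield data

# Hypothetical stages: normalise case, then drop a filled pause.
def lowercase(ws):
    return [w.lower() for w in ws]

def drop_fillers(ws):
    return [w for w in ws if w != "uh"]

utterance = ["Move", "uh", "the", "box"]
print(pipelined(utterance, [lowercase, drop_fillers]))
for partial in incremental(utterance, [lowercase, drop_fillers]):
    print(partial)
```

In this toy version the last partial result equals the pipelined result; the difference is that earlier partial results are available while the utterance is still unfolding.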
Automatic induction of language model data for a spoken dialogue system
In Proceedings of SIGDIAL, 2005
Cited by 5 (3 self)
When building a new spoken dialogue application, large amounts of domain specific data are required. This paper addresses the issue of generating in-domain training data when little or no real user data are available. The two-stage approach taken begins with a data induction phase whereby linguistic constructs from out-of-domain sentences are harvested and integrated with artificially constructed in-domain phrases. After some syntactic and semantic filtering, a large corpus of synthetically assembled user utterances is induced. The second stage involves sampling the synthetic corpus towards the goal of obtaining data that would be representative of the statistics of application-specific real user interactions. The sampling methods proposed employ an example-based generation framework, a simulated user model and information extracted from development data. Evaluation is conducted on recognition performance in a restaurant information domain. We show that word error rate can be reduced when limited amounts of real user training data are augmented with synthetic data derived by our methods.
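The two-stage idea (induce a large synthetic corpus, then sample it toward a target distribution) can be sketched roughly as below. This is an illustration under simplifying assumptions, not the paper's method: the templates, slot values, and weighting function are hypothetical, and the syntactic and semantic filtering, example-based generation, and simulated user model mentioned in the abstract are not modelled.

```python
import itertools
import random

def induce_corpus(templates, slot_values):
    """Stage 1: fill each template's slots with every in-domain phrase combination."""
    corpus = []
    for template in templates:
        # Only the slots that actually occur in this template.
        slots = [s for s in slot_values if "{" + s + "}" in template]
        for combo in itertools.product(*(slot_values[s] for s in slots)):
            corpus.append(template.format(**dict(zip(slots, combo))))
    return corpus

def sample_corpus(corpus, weight_fn, k, seed=0):
    """Stage 2: draw k sentences, weighted toward the desired usage statistics.
    At least one sentence must receive a positive weight."""
    rng = random.Random(seed)
    weights = [weight_fn(sentence) for sentence in corpus]
    return rng.choices(corpus, weights=weights, k=k)

# Hypothetical templates and in-domain phrases for a restaurant domain.
templates = ["is there a {cuisine} restaurant in {area}", "find {cuisine} food"]
slot_values = {"cuisine": ["thai", "italian"], "area": ["downtown"]}

corpus = induce_corpus(templates, slot_values)
# Assumed target statistics: requests about Thai food are three times as common.
sampled = sample_corpus(corpus, lambda s: 3.0 if "thai" in s else 1.0, k=10)
print(len(corpus), sampled[:3])
```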
Hierarchical Statistical Language Models: Experiments On In-Domain Adaptation
In Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP'2000), 2000
Cited by 4 (1 self)
We introduce a hierarchical statistical language model, represented as a collection of local models plus a general sentence model. We provide an example that mixes a trigram general model and a PFSA local model for the class of decimal numbers, described in terms of sub-word units (graphemes). This model practically extends the vocabulary of the overall model to an infinite size, but still has better performance compared to a word-based model. Using in-domain language model adaptation experiments, we show that local models can encode enough linguistic information, if well trained, that they may be ported to new language models without re-estimation.
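One way to picture a general sentence model combined with a local model is sketched below. It is a toy under stated assumptions, not the model from the paper: a small bigram table stands in for the trigram general model, and a simple digit-by-digit model with a stop probability stands in for the PFSA over graphemes. All probabilities and names are hypothetical.

```python
import math

# Toy general model over word and class tokens (a bigram stands in for the trigram).
GENERAL = {
    ("<s>", "pay"): 0.5,
    ("pay", "<NUM>"): 0.4,
    ("<NUM>", "dollars"): 0.6,
    ("dollars", "</s>"): 0.9,
}

def local_number_logprob(digits, p_stop=0.3):
    """Local model for the <NUM> class: each grapheme is one of ten digits,
    chosen uniformly, and the string stops with probability p_stop after a digit."""
    if not digits or not digits.isdigit():
        return float("-inf")
    n = len(digits)
    # n digit emissions, n - 1 continuations, one stop.
    return n * math.log(0.1) + (n - 1) * math.log(1 - p_stop) + math.log(p_stop)

def sentence_logprob(words):
    """log P(sentence) = general-model score over tokens (numbers replaced by <NUM>)
    plus the local-model score of each number's digit string."""
    tokens = ["<s>"]
    logp = 0.0
    for w in words:
        if w.isdigit():
            logp += local_number_logprob(w)
            tokens.append("<NUM>")
        else:
            tokens.append(w)
    tokens.append("</s>")
    for prev, cur in zip(tokens, tokens[1:]):
        p = GENERAL.get((prev, cur), 0.0)
        if p == 0.0:
            return float("-inf")
        logp += math.log(p)
    return logp

print(sentence_logprob(["pay", "42", "dollars"]))
print(sentence_logprob(["pay", "1234567", "dollars"]))
```

Because the local model assigns a non-zero probability to a digit string of any length, the general model needs only the single class token `<NUM>`, which is the sense in which the vocabulary becomes effectively unbounded.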
CARDIAC: An Intelligent Conversational Assistant for Chronic Heart Failure Patient Health Monitoring
In Proc. of the AAAI Fall Symposium on Virtual Healthcare Interaction, 2009
Cited by 3 (1 self)
We describe CARDIAC, a prototype for an intelligent conversational assistant that provides health monitoring for chronic heart failure patients. CARDIAC supports user initiative through its ability to understand natural language and connect it to intention recognition. The natural language interface allows patients to interact with CARDIAC without special training. The system is designed to understand information that arises spontaneously in the course of the interview. If the patient gives more detail than necessary for answering a question, the system updates the user model accordingly. CARDIAC is a first step towards developing cost-effective, customizable, automated in-home conversational assistants that help patients manage their care and monitor their health using natural language.
Language model data filtering via user simulation and dialogue resynthesis
In Proc. of INTERSPEECH, 2005
Cited by 2 (2 self)
In this paper, we address the issue of generating language model training data during the initial stages of dialogue system development. The process begins with a large set of sentence templates, automatically adapted from other application domains. We propose two methods to filter the raw data set to achieve a desired probability distribution of the semantic content, both on the sentence level and on the class level. The first method utilizes user simulation technology, which obtains the probability model via an interplay between a probabilistic user model and the dialogue system. The second method synthesizes novel dialogue interactions by modeling after a small set of dialogues produced by the developers during the course of system refinement. We evaluated our methodology by speech recognition performance on a set of 520 unseen utterances from naive users interacting with a restaurant domain dialogue system.
Towards Speech-Driven Question Answering: Experiments Using the NTCIR-3 Question Answering Collection
Cited by 1 (0 self)
We developed a method for producing statistical language models for speech-driven question answering, which recognizes spoken questions with high accuracy. Our method uses a target collection (i.e., a document set from which answers are derived) to extract N-grams, and adapts them to the question-answering task by way of frozen patterns typically used in interrogative questions. In addition, our method magnifies N-gram statistics corresponding to frozen patterns in the original N-gram. For the purpose of experiments, we used dictated questions in the NTCIR-3 QAC test collection, and showed that our method outperformed a conventional language model adaptation method in terms of the speech recognition accuracy.

