From Data to Speech: A General Approach
- Natural Language Engineering, 2000
- Cited by 21 (9 self)
We present a data-to-speech system called D2S, which can be used for the creation of data-to-speech systems in different languages and domains. The most important characteristic of a data-to-speech system is that it combines language and speech generation: language generation is used to produce a natural language text expressing the system's input data, and speech generation is used to make this text audible. In D2S, this combination is exploited by using linguistic information available in the language generation module for the computation of prosody. This allows us to achieve a better prosodic output quality than can be achieved in a plain text-to-speech system. For language generation in D2S, the use of syntactically enriched templates is guided by knowledge of the discourse context, while for speech generation pre-recorded phrases are combined in a prosodically sophisticated manner. This combination of techniques makes it possible to create linguistically sound but efficient system...
The Thoughtful Elephant: Strategies for Spoken Dialog Systems
- IEEE Transactions on Speech and Audio Processing, 2000
- Cited by 19 (0 self)
In this paper we present technology used in spoken dialog systems for applications of a wide range. They include tasks from the travel domain and automatic switchboards as well as large scale directory assistance. The overall goal in developing spoken dialog systems is to allow for a natural and flexible dialog flow similar to human-human interaction. This imposes the challenging task to recognize and interpret user input, where he/she is allowed to choose from an unrestricted vocabulary and an infinite set of possible formulations. We therefore put emphasis on strategies that make the system more robust while still maintaining a high level of naturalness and flexibility. In view of this paradigm, we found that two fundamental principles characterize many of the proposed methods: 1) to consider available sources of information as early as possible, and 2) to keep alternative hypotheses and delay the decision for a single option as long as possible. We describe
A voice-controlled automatic telephone switchboard and directory information system
- Speech Communication, 1997
- Cited by 17 (5 self)
The Philips automatic telephone switchboard and directory information system PADIS provides a natural-language user interface to a telephone directory database. Using speech recognition and language understanding technologies, the system offers phone numbers, fax numbers, email addresses, and room numbers as well as direct call completion to a desired party. In this paper, we present the underlying probabilistic framework, the system architecture, and the individual modules for speech recognition, language understanding, dialogue control, and speech output. In addition, we report results on performance and user behaviour obtained from a field test in our research lab with a 600-entry database. We derive a new maximum-a-posteriori decision rule which incorporates database knowledge and dialogue history as constraints in speech recognition and language understanding. It has improved speech understanding accuracy by 19% (in terms of concept error rate), and reduced attribute substitution errors (e.g. recognition of a wrong name) by 38%. The decision rule is implemented in a multi-stage approach as a combination of state-of-the-art speech recognition, partial parsing with an attributed stochastic context-free grammar, and an N-best algorithm which is also described in this paper. The system conducts a flexible mixed-initiative dialogue rather than using a rigid form-filling scheme, and incorporates database knowledge to optimize the dialogue flow.
Speech Technology on Trial: Experiences from the August System
- Natural Language Engineering, 2000
- Cited by 17 (8 self)
In this paper, the August spoken dialogue system is described. This experimental Swedish dialogue system, which featured an animated talking agent, was exposed to the general public during a trial period of six months. The construction of the system was partly motivated by the need to collect genuine speech data from people with little or no previous experience of spoken dialogue systems. A corpus of more than 10,000 utterances of spontaneous computer-directed speech was collected and empirical linguistic analyses were carried out. Acoustical, lexical and syntactical aspects of this data were examined. In particular, user behavior and user adaptation during error resolution were emphasized. Repetitive sequences in the database were analyzed in detail. Results suggest that computer-directed speech during error resolution is increased in duration, hyperarticulated and contains inserted pauses. Design decisions which may have influenced how the users behaved when they interacted with August are discussed and implications for the development of future systems are outlined.
YPA - An Intelligent Directory Enquiry Assistant
- 1998
- Cited by 11 (8 self)
The Ypa project is building a system to make the information in classified directories more accessible. BT's Yellow Pages provides an example of a classified database with which this work would be useful. There are two reasons for doing this: (i) directories like Yellow Pages contain much useful but hard-to-access information, especially in the free text in semi-display advertisements; (ii) more generally, the project is a demonstrator for exploitation of semi-structured data -- data that is less systematic than database entries or logical clauses, but more systematic than free text because it has been marked up, for display or some other purpose. Accessing the directory source data file requires both natural language processing (for softening the interface to the system, and separately for analysis of natural-language-like constructs in the data) and information retrieval techniques, which are assisted by shallow knowledge. Deep world knowledge is impractical. The project...
Context-Sensitive Spoken Dialogue Processing with the DOP Model
- Natural Language Engineering, 1999
- Cited by 9 (8 self)
We show how the DOP model can be used for fast and robust context-sensitive processing of spoken input in a practical spoken dialogue system called OVIS. OVIS, Openbaar Vervoer Informatie Systeem ("Public Transport Information System"), is a Dutch spoken language information system which operates over ordinary telephone lines. The prototype system is the immediate goal of the NWO Priority Programme "Language and Speech Technology". In this paper, we extend the original DOP model to context-sensitive interpretation of spoken input. The system we describe uses the OVIS corpus (which consists of 10,000 trees enriched with compositional semantics) to compute from an input word-graph the best utterance together with its meaning. Dialogue context is taken into account by dividing up the OVIS corpus into context-dependent subcorpora. Each system question triggers a subcorpus by which the user answer is analyzed and interpreted. Our experiments indicate that the context-sensitive DOP model obtains better accuracy than the original model, allowing for fast and robust processing of spoken input.
Linguistic adaptations in spoken human-computer dialogues -- Empirical studies of user behavior
- 2003
Spoken Dialogue Interpretation with the DOP Model
- 1998
- Cited by 9 (2 self)
We show how the DOP model can be used for fast and robust processing of spoken input in a practical spoken dialogue system called OVIS. OVIS, Openbaar Vervoer Informatie Systeem ("Public Transport Information System"), is a Dutch spoken language information system which operates over ordinary telephone lines. The prototype system is the immediate goal of the NWO Priority Programme "Language and Speech Technology". In this paper, we extend the original DOP model to context-sensitive interpretation of spoken input. The system we describe uses the OVIS corpus (10,000 trees enriched with compositional semantics) to compute from an input word-graph the best utterance together with its meaning. Dialogue context is taken into account by dividing up the OVIS corpus into context-dependent subcorpora. Each system question triggers a subcorpus by which the user answer is analyzed and interpreted. Our experiments indicate that the context-sensitive DOP model obtains better accuracy than the original model, allowing for fast and robust processing of spoken input.
The YPA - An Assistant for Classified Directory Enquiries
- 2000
- Cited by 7 (5 self)
The YPA is a directory enquiry system which allows a user to access advertiser information in classified directories. It converts semi-structured data in the Yellow Pages machine readable classified directories into a set of indices appropriate to the domain and task, and converts natural language queries into filled slot and filler structures appropriate for queries in the domain. The generation of answers requires a domain dependent query construction step, connecting the indices and the slot and fillers. The YPA illustrates an unusual but useful intermediate point between information retrieval and logical knowledge representation.

