New tools for interactive speech and language training: Using animated Conversational Agents In the . . .
- UNIVERSITY COLLEGE LONDON
, 1999
Cited by 18 (5 self)
This article describes our experiences with an animated conversational agent being used in daily classroom activities with profoundly deaf children at the Tucker Maxon Oral School in Portland, Oregon. We first articulate some reasons why animated conversational agents can revolutionize learning and language training by providing a more effective mode of human-computer interaction. We then describe the capabilities of our animated agent, Baldi, and the software environment used to design and run interactive media systems. Next, we describe applications designed by teachers and students that illustrate ways in which students in three different classrooms converse and interact with Baldi. We conclude with a brief look at the next generation of animated conversational agents.
Tools for Research and Education in Speech Science
- IN PROCEEDINGS OF THE INTERNATIONAL CONFERENCE OF PHONETIC SCIENCES
, 1999
Cited by 14 (4 self)
The Center for Spoken Language Understanding (CSLU) provides free language resources to researchers and educators in all areas of speech and hearing science. These resources are of great potential value to speech scientists for analyzing speech, for diagnosing and treating speech and language problems, for researching and evaluating language technologies, and for training students in the theory and practice of speech science. This article describes language resources from CSLU, and some of the ways in which these resources can be used.
Spin: Language Understanding For Spoken Dialogue Systems Using A Production System Approach
, 2002
Cited by 8 (1 self)
This paper describes a language understanding module for spoken dialogue systems that produces frame-based semantic output. The presented approach adapts ideas from production systems to the task of language understanding. It interleaves template-driven cascaded word-to-frame transformation with syntactic analysis in a new manner. Its advantages over conventional parsers are a flexible output structure that is largely independent of the syntactic structure, the ability to use different levels of syntactic analysis at the same time, and better support for relatively free word order languages such as German. Other important properties are robustness, the capability to process complex utterances, and the easy creation of knowledge bases. A preliminary evaluation shows promising results.
Smart web handheld -- multimodal interaction with ontological knowledge bases and semantic web services
- IN: PROC. INTERNATIONAL WORKSHOP ON AI FOR HUMAN COMPUTING (IN CONJUNCTION WITH IJCAI
, 2007
Cited by 7 (2 self)
SmartWeb aims to provide intuitive multimodal access to a rich selection of Web-based information services. We report on the current prototype with a smartphone client interface to the Semantic Web. An advanced ontology-based representation of facts and media structures serves as the central description for rich media content. Underlying content is accessed through conventional web service middleware to connect the ontological knowledge base and an intelligent web service composition module for external web services, which is able to translate between ordinary XML-based data structures and explicit semantic representations for user queries and system responses. The presentation module renders the media content and the results generated from the services and provides a detailed description of the content and its layout to the fusion module. The user can then employ multiple modalities, such as speech and gestures, to interact with the presented multimedia material.
Integrating Flexibility into a Structured Dialogue Model: Some Design Considerations
- Proceedings of International Conference on Speech and Language Processing
, 2000
Cited by 5 (0 self)
Structured dialogue models are the most commonly used dialogue models in commercial systems, particularly as they are relatively easy to design and re-use. The current paper reports on a study that examined the feasibility of combining more flexible dialogue control with a structured dialogue model. Several systems were built using the RAD (Rapid Application Developer) component of the CSLU toolkit, augmented with the Phoenix natural language parsing system and a dialogue manager that used a representation of the system's information state to determine the system's next question or action. Results indicated that, with an optimized continuous speech recognizer, a dialogue permitting flexible input can be concluded efficiently and successfully, while in cases of degraded recognition the recovery strategies and more structured dialogue control enhance the likelihood of a successful transaction. The paper discusses a number of design issues that support developers in making structured dialogue models more flexible.
Performance ‘General Purpose’ Phonetic Recognition for Italian
- In Proceedings ICSLP-2000, International Conference on Spoken Language Processing
, 2000
Cited by 5 (5 self)
The development of a speaker-independent “general purpose” phonetic recognizer for Italian is described. The CSLU Toolkit was used to develop and implement the system. The recognizer, based on a frame-based hybrid HMM/ANN architecture trained on context-dependent categories to account for coarticulatory variation, recognizes 38 different phonemes (not including silence or closures), and can distinguish between stressed and unstressed vowels as well as open and closed vowels. The APASCI corpus, containing nearly 2500 sentences read by 100 speakers, where the sentences have been designed to maximize the number of phonemes occurring in different contexts, was used for training and testing. As of the time of this writing, a phoneme-level accuracy of 82.90% on the development set and of 80.53% on the test set has been obtained. This level of accuracy is much greater than on a similar English-language corpus (with state-of-the-art performance of slightly better than 70%) and it represents the best performance obtained so far on this corpus.
Improvements in Neural-Network Training and Search Techniques for Continuous Digit Recognition
- Australian Journal of Intelligent Information Processing Systems (AJIIPS
, 1998
Cited by 4 (3 self)
This paper describes a set of experiments on training and search techniques for development of a neural-network-based continuous digits recognizer. When the best techniques from these experiments were combined to train a final recognizer, there was a 56% reduction in word-level error on the continuous digits recognition task. The best system had word accuracy of 97.67% on a test set of the OGI 30K Numbers corpus; this corpus contains naturally-produced continuous digit strings recorded over telephone channels. Experiments investigated the effects of the feature set, the amount of data used for training, the type of context-dependent categories to be recognized, the values for duration limits, and the type of grammar. The experiments indicate that the grammar and duration limits had a greater effect on recognition accuracy than the output categories, cepstral features, or a doubling of the amount of training data. In addition, the forward-backward method of training neural networks was e...
Robust and efficient semantic parsing of free word order languages in spoken dialogue systems
- In Proceedings of 9th Conference on Speech Communication and technology
, 2005
Cited by 4 (2 self)
This paper presents a semantic parser for spoken dialogue systems. The parser is designed especially for the analysis of free word order languages by providing a feature called order-independent matching. We describe how this feature allows rules for free word order languages to be written in an elegant way (using German as the example language) and how it increases robustness against speech recognition errors. As order-independent matching makes efficient parsing more difficult, we present a new parsing approach which provides efficient processing for rule bases that are, according to our experience, typical for spoken dialogue systems. The key feature of the parsing approach is a fixed application order of the rules to prune irrelevant results. A preliminary evaluation of the parser shows that this approach works very well in real-world dialogue systems.
Implementation Testing of a Hybrid Symbolic/Statistical Multimodal Architecture
, 2002
Cited by 3 (2 self)
The design and implementation of hybrid symbolic/statistical architectures is a major area of interest in current multimodal system development. Such an architecture attempts to improve multimodal recognition and disambiguation rates by using corpus-based statistics to weight the contributions from various input streams. This is in contrast to current architectures that assume independence between input streams, and combine un-weighted posterior probabilities simply by taking their cross product.
Demonstrations of Dialogue Design Tools in the CSLU Toolkit
Cited by 1 (0 self)
The CSLU Toolkit and accompanying tutorials are designed to provide a platform for researching and developing language technologies and systems, and to engage naive users in using and experimenting with interactive language systems. We provide a set of demonstrations in this special session that illustrate these capabilities. They include rapid prototyping of a spoken dialogue system that integrates an animated talking face, speech recognition and text-to-speech synthesis, and a variety of applications created by practitioners using the rapid application developer.

