Results 11 - 14 of 14
Voice-IF: A Mixed-Initiative Spoken Dialogue System for AT&T Conference Services, Eurospeech ’01
Abstract
Cited by 1 (0 self)
This paper presents the Voice-IF system, a mixed-initiative spoken dialogue system for AT&T conference services. One objective in creating Voice-IF is to provide a vehicle for evaluating our technologies in speech synthesis, recognition, understanding, dialogue and user interfaces on a real application with relatively novice users. Another objective is to design, build and test a set of tools that allow us to rapidly prototype spoken dialogue applications. In this paper, we describe the performance of Voice-IF during its 6-week deployment period. In particular, we report a) results of perceptual evaluations of the synthesized speech, b) system performance and user satisfaction ratings, c) PARADISE analysis of the data, and d) comparisons with other systems, including the W99 conference registration system used at the ASRU’99 workshop and the Travel Communicator system.
WEB-BASED MONITORING, LOGGING AND REPORTING TOOLS FOR MULTI-SERVICE MULTI-MODAL SYSTEMS
Abstract
This paper describes MILER (Multi-modal data Logger for Evaluation and Report), a web-based multi-service monitoring, logging and reporting tool for advanced multi-modal dialog systems. MILER has been designed to directly arrange and synchronize logging data collected from live services and to provide real-time reports about service usage and system performance. Special attention has been given to the architecture design in order to achieve service and access-device independence and reliable synchronization of data from distributed logs. MILER allows researchers to analyze multi-modal interactions, analyze the call flow, reconstruct the system/user dialogue turns, play the recorded user utterances, and obtain a preliminary dialogue performance evaluation. It also supports labeling and annotation of the dialogue turns for further offline analysis. Once the user inputs (i.e., speech and other input modalities) are manually transcribed and labeled, along with detailed log events from each dialog, MILER derives a set of objective measures, which includes word and concept accuracy, number of attempts per concept, dialog turn counts and duration, and task completion rates. Subjective measures extracted from user surveys, including perceived task success and ease-of-use measures, can be combined with the objective measures, and the results can be used later for accuracy computation.
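As an illustration of one of the objective measures listed in this abstract, word accuracy is conventionally computed from the word-level edit distance between a manual transcription and the recognizer output. The sketch below assumes that conventional definition, 1 - (substitutions + deletions + insertions) / reference length; the abstract does not say how MILER computes it, and the function names `word_errors` and `word_accuracy` are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, deletions and insertions
    needed to turn the reference transcript into the hypothesis."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] is the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions align i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Assumed conventional definition: 1 - errors / reference word count."""
    n = len(reference.split())
    if n == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_errors(reference, hypothesis) / n
```

Because insertions count as errors, this measure can fall below zero when the hypothesis is much longer than the reference.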
Extending a Standards-based IP and Computer Telephony Platform to Support Multi-modal Services
Abstract
Despite recent advances in Computer Telephony (CT) and IP Telephony (IPT) standards in defining flexible architectures to support new technologies, the current CT paradigm does not adequately support the requirements of advanced spoken dialogue systems. This paper describes an application framework based on CT and IPT standards that defines new architectural components for information access, alerting, and multi-modal input/output integration. This framework permits separation of the application logic from low-level resource management in order to facilitate the design and development of advanced, multi-modal voice-enabled services.
Caller Identification for the SCANMail Voicemail Browser, 2001
Abstract
SCANMail is a prototype system developed at AT&T Labs to provide useful tools for managing and searching through voicemail messages. Content is extracted from voicemail messages using various speech and text processing tools. One such content category is the identity of the message caller. This paper describes CallerID, the server tool attached to SCANMail for the purpose of providing caller labels for voicemail messages. CallerID makes use of text-independent speaker recognition techniques. Two kinds of requests are handled by the CallerID server. A request triggered by the arrival of a new voicemail message results in the processing of the message to score it against the models of callers assigned to the user (recipient) in order to propose the identity of the caller. A second request is initiated by a user who provides a caller label for a message he/she has reviewed. CallerID processes the message and uses it to train or adapt a speaker model for the caller whose label is provided. The paper describes in detail the CallerID functions and provides some results of performance evaluations of the caller identification capability.
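The two request types described in this abstract can be sketched as a small per-recipient model store. This is only a toy illustration: the class name `CallerModels`, its methods, the fixed-length feature vectors, and the nearest-mean scoring are all assumptions standing in for the text-independent speaker recognition techniques the paper uses, which the abstract does not detail.

```python
import math


class CallerModels:
    """Hypothetical per-recipient store: one mean feature vector per caller label."""

    def __init__(self):
        self.means = {}   # caller label -> mean feature vector
        self.counts = {}  # caller label -> number of messages used so far

    def identify(self, features):
        """Request type 1: score a new message against every caller model
        and propose the closest label (None if no models exist yet)."""
        best_label, best_dist = None, math.inf
        for label, mean in self.means.items():
            dist = math.dist(features, mean)  # toy score: Euclidean distance
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label

    def train_or_adapt(self, label, features):
        """Request type 2: the user supplied a label, so create the caller's
        model or adapt it with a running mean over labeled messages."""
        if label not in self.means:
            self.means[label] = list(features)
            self.counts[label] = 1
            return
        n = self.counts[label] + 1
        self.means[label] = [m + (x - m) / n
                             for m, x in zip(self.means[label], features)]
        self.counts[label] = n
```

A real system would also need a rejection threshold so that messages from unknown callers are not forced onto the nearest known label; that is omitted here.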

