Integrated Dialog Act Segmentation And Classification Using Prosodic Features And Language Models
, 1997
Cited by 37 (2 self)
This paper presents an integrated approach for the segmentation and classification of dialog acts (DA) in the Verbmobil project. In Verbmobil it is often sufficient to recognize the sequence of DAs occurring during a dialog between the two partners. In our previous work [5] we segmented and classified a dialog in two steps: first we calculated hypotheses for the segment boundaries and decided for a boundary if the probabilities exceeded a predefined threshold level. Second we classified the segments into DAs using semantic classification trees or stochastic language models. In our new approach we integrate the segmentation and classification in the A*-algorithm to search for the optimal segmentation and classification of DAs on the basis of word hypotheses graphs (WHGs). The hypotheses for the segment boundaries are calculated with the help of a stochastic language model operating on the word chain and a multi-layer perceptron (MLP) classifying prosodic features. The DA classificat...
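The first step of the earlier two-step procedure mentioned in this abstract (deciding for a boundary when its probability exceeds a threshold) can be sketched as follows. This is only an illustration under stated assumptions: the function name `segment_by_threshold`, the default threshold of 0.5, and the example probabilities are hypothetical, and the boundary probabilities are taken as given rather than computed by a language model or MLP.

```python
def segment_by_threshold(words, boundary_probs, threshold=0.5):
    """Split a word chain into segments.

    boundary_probs[i] is an assumed, externally supplied probability that a
    segment boundary follows words[i]; there is one value per gap between
    consecutive words. A boundary is set when the probability exceeds the
    threshold.
    """
    if not words:
        return []
    if len(boundary_probs) != len(words) - 1:
        raise ValueError("need exactly one boundary probability per word gap")

    segments, current = [], []
    for i, word in enumerate(words):
        current.append(word)
        # The last word has no following gap, so it never triggers a split here.
        if i < len(boundary_probs) and boundary_probs[i] > threshold:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments
```

Each resulting segment would then be handed to a separate classifier in the second step. The integrated approach described in the abstract replaces this hard decision with a joint search, which this sketch does not attempt to reproduce.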
Interpolated Markov Chains for Eukaryotic Promoter Recognition
, 1999
Cited by 25 (7 self)
Motivation: We describe a new content based approach for the detection of promoter regions of eukaryotic protein encoding genes. Our system is based on three interpolated Markov chains (IMCs) of different order which are trained on coding, non-coding, and promoter sequences. It was recently shown that the interpolation of Markov chains leads to stable parameters and improves on the results in microbial gene finding (Salzberg et al., 1998). Here, we present new methods for an automated estimation of optimal interpolation parameters and show how the IMCs can be applied to detect promoters in contiguous DNA sequences. Our interpolation approach can also be employed to obtain a reliable scoring function for human coding DNA regions, and the trained models can easily be incorporated in the general framework for gene recognition systems. Results: A fivefold cross-validation evaluation of our IMC approach on a representative sequence set yielded a mean correlation coefficient of 0.84 (promot...
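To make the idea of interpolating Markov chains of different order concrete, here is a minimal sketch. It is not the system described in the abstract: the class name `InterpolatedMarkovChain`, the fixed hand-set weights, and the uniform fallback for short or unseen contexts are all assumptions made for illustration, whereas the abstract describes an automated estimation of the interpolation parameters.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class InterpolatedMarkovChain:
    """Toy interpolated Markov chain over DNA with fixed interpolation weights."""

    def __init__(self, max_order, weights):
        # weights[k] scales the order-k estimate (k = 0..max_order);
        # the final weight scales a uniform distribution so that p > 0.
        if len(weights) != max_order + 2 or abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("need max_order + 2 weights that sum to 1")
        self.max_order = max_order
        self.weights = weights
        # counts[k][context][symbol] holds order-k counts.
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(max_order + 1)]

    def train(self, sequences):
        for seq in sequences:
            for i, sym in enumerate(seq):
                for k in range(self.max_order + 1):
                    if i >= k:
                        self.counts[k][seq[i - k:i]][sym] += 1

    def prob(self, context, sym):
        uniform = 1.0 / len(ALPHABET)
        p = self.weights[-1] * uniform
        for k in range(self.max_order + 1):
            estimate = uniform  # fallback: context too short or never seen
            if len(context) >= k:
                seen = self.counts[k].get(context[len(context) - k:])
                if seen:
                    estimate = seen.get(sym, 0) / sum(seen.values())
            p += self.weights[k] * estimate
        return p

    def log_likelihood(self, seq):
        m = self.max_order
        return sum(math.log(self.prob(seq[max(0, i - m):i], sym))
                   for i, sym in enumerate(seq))
```

With one such model trained per class (for example promoter, coding, and non-coding sequences), a window of DNA could be assigned to the class whose model gives the highest log-likelihood. How the scores are actually combined in the described system is not stated in the abstract.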
The Erlangen Spoken Dialogue System EVAR: A State-of-the-Art Information Retrieval System
, 1998
Cited by 14 (8 self)
In this paper, we present an overview of the spoken dialogue system EVAR that was developed at the University of Erlangen. In January 1994, it became accessible over telephone line and could answer inquiries in the German language about German InterCity train connections. It has since been continuously improved and extended, including some unique features, such as the processing of out-of-vocabulary words and a flexible dialogue strategy that adapts to the quality of the recognition of the user input. In fact, several different versions of the system have emerged, i.e. a subway information system, train and flight information systems in different languages, and an integrated multilingual and multifunctional system which covers German and 3 additional languages in parallel. Current research focuses on the introduction of stochastic models into the semantic analysis, on the direct integration of prosodic information into the word recognition process, on the detection of user emotion, and on multilinguality and multifunctionality.
Improving And Predicting Performance Of Statistical Language Models In Sparse Domains
, 1998
Cited by 7 (1 self)
Standard statistical language models, or n-gram models, which represent the probability of word sequences, suffer from sparse-data problems in tasks where large amounts of domain-specific text are not available. This thesis focuses on improving the estimation of domain-dependent n-gram models by using out-of-domain text data. Previous approaches for estimating language models from multi-domain data have not accounted for the characteristic variations of style and content across domains. In contrast, this thesis introduces two approaches that compensate for multi-domain differences, both representing "style" by part-of-speech (POS) sequences and "content" by the particular choice of words. First, data from multiple domains is combined using similarity weighting schemes that discriminate for content and style relevance prior to pooling multi-domain text. Second, n-gram distributions from multiple domains are combined via a POS-dependent n-gram framework that separately compensates for word and POS usage differences. Two variations are explored: explicitly transforming the out-of-domain distribution before combining with an in-domain model, and separately estimating components of the POS-dependent n-gram model using multi-domain data. Finally, measures to analyze and predict recognition performance of language models are also investigated, resulting in an algorithm for predicting performance differences associated with localized changes in language models given a recognition system.
Discriminative Training Of Language Model Classifiers
, 1999
Cited by 5 (4 self)
We show how discriminative training methods, namely the Maximum Mutual Information and Maximum Discrimination approach, can be adopted for the training of N-gram language models used as classifiers working on symbol strings. By estimating the model parameters according to a discriminative objective function instead of Maximum Likelihood, the emphasis is not put on the exact modeling of each class, but on the right classification of the samples. The methods are shown to be suited for a variety of applications, such as the recognition of regulatory DNA sequences and language identification. Using phonotactic information, we achieve an error reduction of 10.7% (phoneme sequences) or 41.9% (codebook classes) with respect to the standard ML estimation on a corpus of English and German sentences.
Multigrams For Language Identification
, 1999
Cited by 2 (1 self)
In our paper we present two new approaches for language identification. Both of them are based on the use of so-called multigrams, an information-theoretic observation representation. In the first approach we use multigram models for phonotactic modeling of phoneme or codebook sequences. The multigram model can be used to segment the new observation into larger units (e.g. something like words) and to calculate a probability for the best segmentation. In the second approach we build a fenon recognizer using the segments of the best segmentation of the training material as "words" inside the recognition vocabulary. On the OGI test corpus and on the NIST'95 evaluation corpus we obtained significant improvements with this second approach in comparison to the unsupervised codebook approach when discriminating between English and German utterances.
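The segmentation step mentioned in this abstract, finding the best split of an observation into variable-length units and its probability, can be illustrated with a small dynamic-programming sketch. The function name `best_segmentation`, the unit inventory, and its probabilities are hypothetical; the abstract does not say how the multigram parameters are estimated, so they are simply supplied as a dictionary of log-probabilities here.

```python
import math


def best_segmentation(symbols, unit_logprob, max_len):
    """Return (units, log_prob) for the most probable segmentation.

    symbols      : sequence of observation symbols (e.g. phoneme labels)
    unit_logprob : dict mapping tuples of symbols to assumed log-probabilities
    max_len      : maximum number of symbols allowed in one unit
    Units are treated as independent, so the score of a segmentation is the
    sum of its unit log-probabilities. Returns (None, -inf) if the sequence
    cannot be covered by known units.
    """
    n = len(symbols)
    best = [-math.inf] * (n + 1)   # best[i]: best score for symbols[:i]
    back = [0] * (n + 1)           # back[i]: start of the last unit ending at i
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            unit = tuple(symbols[start:end])
            if unit in unit_logprob and best[start] > -math.inf:
                score = best[start] + unit_logprob[unit]
                if score > best[end]:
                    best[end] = score
                    back[end] = start

    if best[n] == -math.inf:
        return None, -math.inf

    # Follow the back-pointers from the end to recover the units.
    units, end = [], n
    while end > 0:
        start = back[end]
        units.append(tuple(symbols[start:end]))
        end = start
    units.reverse()
    return units, best[n]
```

For language identification, one such unit inventory per language could score the same observation, with the decision going to the language whose best segmentation is most probable. That decision rule is an assumption of this sketch rather than something the abstract spells out.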
A Frame and Segment Based Approach for Topic Spotting
- Proceedings European Conference on Speech Comm. and Technology (Eurospeech
, 1997
Cited by 2 (0 self)
In this paper we present a new approach for topic spotting based on subword units (phonemes and feature vectors) instead of words. Classification of topics is done by running topic dependent polygram language models over these symbol sequences and deciding for the one with the best score. We trained and tested the two methods on three different corpora. The first is a part of a media corpus which contains data from TV shows for three different topics (IDS), the second is part of the Switchboard corpus, the third is a collection of human machine dialogs about train timetable information (EVAR corpus). The results on Switchboard are compared with phoneme based approaches which were made at CRIM (Montréal) and DRA (Malvern) and are presented as ROC curves; the results on IDS and EVAR are compared with a word based approach and presented as confusion tables. We show that surprisingly little recognition accuracy is lost when going from word to subword based topic spotting.
Detection Of Eukaryotic Promoter Regions Using Stochastic Language Models
, 1998
Cited by 1 (1 self)
We present a new search-by-content method to identify transcriptional regulatory regions in eukaryotic genomic sequences. The method is based on stochastic language models which are a straightforward generalization of oligomer statistics. We describe the theoretical background and different parameter estimation techniques used to build the models. The resulting language models are applied to classify fixed length sequences into the classes of promoters and non-promoters, and to search for transcription start sites in contiguous sequences. Detailed classification results for human and Drosophila data sets are presented, and the practical applicability of the models is demonstrated on an independent test set of vertebrate genomic sequences. On this set, which has already been used to compare different computational approaches for promoter recognition, the performance of our method is comparable to the best algorithms described so far. The number of false positives can be further reduce...
Prosody and Automatic Speech Recognition - Why not yet a Success Story and where to go from here
, 2003
Cited by 1 (0 self)
We describe the different linguistic and paralinguistic functions of prosody, show how features can be computed that describe the prosodic marking of these functions, and how this knowledge can be used in an automatic speech understanding system. This is done in the context of the speech-to-speech translation system Verbmobil, where prosody is used to segment the user utterance and to find self repairs. We then go on to discuss why most speech processing systems do not use prosodic information and end by showing some new trends in prosody research, namely the classification of emotion and the classification of "offtalk" (speaking aside).
A Hybrid Approach To Spoken Dialogue Understanding: Prosody, Statistics And Partial Parsing
, 1999
Cited by 1 (1 self)
Linguistic processing in spoken dialogue systems has to be robust against a large number of phenomena such as recognizer errors, spontaneous speech phenomena and out-of-vocabulary (OOV) words. A commonly used solution to this problem is partial parsing, which aims at detecting only those parts of sentences/utterances that are vital for the respective task of the parser. In our paper we present a framework for robust linguistic processing in our spoken dialogue system EVAR for train timetable information. The linguistic processor combines partial parsing with prosody and statistical concept prediction. Parsing is restricted to the detection and analysis of those parts of an utterance that are crucial for its understanding by the system. In order to accomplish this task most efficiently, the parser operates not only on word lattices as delivered by the recognizer, but also on prosodic information and statistical concept prediction.

