Intonational Boundaries, Speech Repairs and Discourse Markers: Modeling Spoken Dialog
, 1997
Cited by 29 (5 self)
To understand a speaker's turn of a conversation, one needs to segment it into intonational phrases, clean up any speech repairs that might have occurred, and identify discourse markers. In this paper, we argue that these problems must be resolved together, and that they must be resolved early in the processing stream. We put forward a statistical language model that resolves these problems, does POS tagging, and can be used as the language model of a speech recognizer. We find that by accounting for the interactions between these tasks, the performance on each task improves, as do POS tagging and perplexity.
Speech Repairs, Intonational Boundaries and Discourse Markers: Modeling Speakers
- Department of Computer Science, University of Rochester
, 1997
Cited by 24 (8 self)
Peter Heeman was born October 22, 1963, and much to his dismay his parents had already moved away from Toronto. Instead he was born in London Ontario, where he grew up on a strawberry farm. He attended the University of Waterloo where he received a Bachelors of Mathematics with a joint degree in Pure Mathematics and Computer Science in the spring of 1987. After working two years for a software engineering company, which supposedly used artificial intelligence techniques to automate COBOL and CICS programming, Peter was ready for a change. What better way to wipe the slate clear than by going to graduate school at the University of Toronto, but not without first spending the summer in Europe. After spending two months in countries where he couldn’t speak the language, Peter became fascinated by language, and so decided to give computational linguistics a try.
The Use of Clustering Techniques for Language Modeling - Application to Asian Languages
Cited by 15 (11 self)
Cluster-based n-gram modeling is a variant of normal word-based n-gram modeling. It attempts to make use of the similarities between words. In this paper, we present an empirical study of clustering techniques for Asian language modeling. Clustering is used to improve the performance (i.e. perplexity) of language models as well as to compress language models. Experimental tests are presented for cluster-based trigram models on a Japanese newspaper corpus, and on a Chinese heterogeneous corpus.
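To make the idea of cluster-based n-gram modeling concrete, here is a minimal sketch of the standard class-based bigram decomposition, P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), where c_i is the cluster of word w_i. This is a generic illustration and not the specific techniques evaluated in the paper; the function names, the toy corpus, and the hand-assigned clusters are all hypothetical, and no smoothing is applied.

```python
import math
from collections import Counter

def train_class_bigram(tokens, word2cluster):
    """Maximum-likelihood class-based bigram model (illustrative, unsmoothed)."""
    clusters = [word2cluster[w] for w in tokens]
    word_counts = Counter(tokens)
    cluster_counts = Counter(clusters)
    # Counts of clusters that appear as a history (everything but the last token).
    history_counts = Counter(clusters[:-1])
    transition_counts = Counter(zip(clusters, clusters[1:]))

    def prob(prev_word, word):
        c_prev, c = word2cluster[prev_word], word2cluster[word]
        p_transition = transition_counts[(c_prev, c)] / history_counts[c_prev]
        p_emission = word_counts[word] / cluster_counts[c]
        return p_transition * p_emission

    return prob

def perplexity(prob, tokens):
    """Per-word perplexity of a token sequence under a bigram probability function."""
    log_sum = sum(math.log2(prob(p, w)) for p, w in zip(tokens, tokens[1:]))
    return 2 ** (-log_sum / (len(tokens) - 1))

# Hypothetical toy data: clusters are assigned by hand, not learned.
tokens = "the cat sat the dog sat the cat ran".split()
word2cluster = {"the": "DET", "cat": "N", "dog": "N", "sat": "V", "ran": "V"}

prob = train_class_bigram(tokens, word2cluster)
print(prob("dog", "ran"))        # word bigram never seen in the toy corpus
print(perplexity(prob, tokens))  # perplexity on the training tokens
```

The word pair "dog ran" never occurs in the toy corpus, yet it receives a nonzero probability because the cluster pair (N, V) was observed. Sharing statistics across similar words in this way is what lets a cluster-based model generalize, and storing cluster-level counts rather than word-level counts is one route to a smaller model.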
Identifying Discourse Markers in Spoken Dialog
, 1998
Cited by 13 (0 self)
In this paper, we present a method for identifying discourse marker usage in spontaneous speech based on machine learning. Discourse markers are denoted by special POS tags, and thus the process of POS tagging can be used to identify discourse markers. By incorporating POS tagging into language modeling, discourse markers can be identified during speech recognition, in which the timeliness of the information can be used to help predict the following words. We contrast this approach with an alternative machine learning approach proposed by Litman (1996). This paper also argues that discourse markers can be used to help the hearer predict the role that the upcoming utterance plays in the dialog. Thus discourse markers should provide valuable evidence for automatic dialog act prediction. Introduction Discourse markers are a linguistic device that speakers use to signal how the upcoming unit of speech or text relates to the current discourse state (Schiffrin 1987). Previous ...
POS Tagging versus Classes in Language Modeling
, 1998
Cited by 12 (1 self)
Language models for speech recognition concentrate solely on recognizing the words that were spoken. In this paper, we advocate redefining the speech recognition problem so that its goal is to find both the best sequence of words and their POS tags, and thus incorporate POS tagging. The use of POS tags allows more sophisticated generalizations than are afforded by using a class-based approach. Furthermore, if we want to incorporate speech repair and intonational phrase modeling into the language model, using POS tags rather than classes gives better performance in this task.
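The redefined problem described in this abstract can be written out as follows. This is a sketch in the usual noisy-channel notation, with A the acoustic signal, W = w_1 ... w_N the word sequence, and P = p_1 ... p_N the POS sequence; the symbols are assumptions introduced here and do not appear in the abstract. The second line is just the chain rule applied to the joint probability of words and tags.

```latex
\begin{align*}
\hat{W}, \hat{P}
  &= \arg\max_{W,P} \Pr(W, P \mid A)
   = \arg\max_{W,P} \Pr(A \mid W, P)\,\Pr(W, P) \\
\Pr(W, P)
  &= \prod_{i=1}^{N} \Pr(p_i \mid w_{1,i-1},\, p_{1,i-1})\;
                     \Pr(w_i \mid w_{1,i-1},\, p_{1,i})
\end{align*}
```

Under this formulation the language model must estimate two distributions at each position, one over the next POS tag and one over the next word given that tag, instead of the single word distribution of a conventional n-gram model.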
POS Tags and Decision Trees for Language Modeling
- In Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
, 1999
Cited by 11 (2 self)
Language models for speech recognition concentrate solely on recognizing the words that were spoken. In this paper, we advocate redefining the speech recognition problem so that its goal is to find both the best sequence of words and their POS tags, and thus incorporate POS tagging. To use POS tags effectively, we use clustering and decision tree algorithms, which allow generalizations between POS tags and words to be effectively used in estimating the probability distributions. We show that our POS model gives a reduction in word error rate and perplexity for the Trains corpus in comparison to word and class-based approaches. By using the Wall Street Journal corpus, we show that this approach scales up when more training data is available.
Performance Prediction for Exponential Language Models
Cited by 5 (3 self)
We investigate the task of performance prediction for language models belonging to the exponential family. First, we attempt to empirically discover a formula for predicting test set cross-entropy for n-gram language models. We build models over varying domains, data set sizes, and n-gram orders, and perform linear regression to see whether we can model test set performance as a simple function of training set performance and various model statistics. Remarkably, we find a simple relationship that predicts test set performance with a correlation of 0.9997. We analyze why this relationship holds and show that it holds for other exponential language models as well, including class-based models and minimum discrimination information models. Finally, we discuss how this relationship can be applied to improve language model performance.
Deriving Phrase-based Language Models
, 1997
Cited by 4 (1 self)
Phrase-based language models have grown in popularity since they allow the speech recognition process to make use of more context in recognizing the words. Previous approaches have used perplexity reduction to identify groups of words to be linked into phrases and have used these phrases as the basis for computing the language model probabilities. In this paper, we argue that perplexity reduction is only one of three aspects to be considered in choosing the phrases. We also argue that the chosen phrases should not be the basis for computing the language model probabilities. Rather, the probabilities should be derived from a language model built at the lexical level. 1 Introduction Most research in speech recognition is based on using lexical units as the interface between acoustic and language modeling. The acoustic model rates the likeliness of a stretch of sound given a certain word and the language model rates the likeliness of the word given the past words that were recognized. H...
The Use of Clustering Techniques for Asian Language Modeling
, 2001
Cluster-based n-gram modeling is a variant of normal word-based n-gram modeling. It attempts to make use of the similarities between words. In this paper, we present an empirical study of clustering techniques for Asian language modeling. Clustering is used to improve the performance (i.e. perplexity) of language models as well as to compress language models. Experimental tests are presented for cluster-based trigram models on a Japanese newspaper corpus, and on a Chinese heterogeneous corpus. While the majority of previous research on word clustering has focused on how to get the best clusters, we have concentrated our research on the best way to use the clusters. Experimental results show that some novel techniques we present work much better than previous methods, and achieve up to more than 40% size reduction at the same perplexity.
Curriculum Vitae
, 1997
Peter Heeman was born October 22, 1963, and much to his dismay his parents had already moved away from Toronto. Instead he was born in London Ontario, where he grew up on a strawberry farm. He attended the University of Waterloo where he received a Bachelors of Mathematics with a joint degree in Pure Mathematics and Computer Science in the spring of 1987. After working two years for a software engineering company, which supposedly used artificial intelligence techniques to automate COBOL and CICS programming, Peter was ready for a change. What better way to wipe the slate clear than by going to graduate school at the University of Toronto, but not without first spending the summer in Europe. After spending two months in countries where he couldn’t speak the language, Peter became fascinated by language, and so decided to give computational linguistics a try.

