Switchboard Discourse Language Modeling Project (Final Report)
, 1997
Cited by 30 (7 self)
We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 `Dialog Acts' (DAs) (question, answer, backchannel, agreement, disagreement, apology, etc.). We labeled 1155 conversations from the Switchboard (SWBD) database (Godfrey et al. 1992) of human-to-human telephone conversations with these 42 types and trained a Dialog Act detector based on three distinct knowledge sources: sequences of words which characterize a dialog act, prosodic features which characterize a dialog act, and a statistical Discourse Grammar. Our combined detector, although still in preliminary stages, already achieves a 65% Dialog Act detection rate based on acoustic waveforms, and 72% accuracy based on word transcripts. Using this detector to switch among the 42 dialog-act-specific trigram LMs also gave us an encouraging but not statistically significant reduction in SWBD word error. 1 Introduction The ability to model and...
Class phrase models for language modeling
- In Proceedings of ICSLP
, 1996
Cited by 19 (3 self)
Previous attempts to automatically determine multi-words as the basic unit for language modeling have been successful for extending bigram models [10, 9, 2, 8] to improve the perplexity of the language model and/or the word accuracy of the speech decoder. However, none of these techniques gave improvements over the trigram model so far, except for the rather controlled ATIS task [8]. We therefore propose an algorithm that minimizes the perplexity improvement of a bigram model directly. The new algorithm is able to reduce the trigram perplexity and also achieves word accuracy improvements in the Verbmobil task. It is the natural counterpart of successful word classification algorithms for language modeling [4, 7] that minimize the leaving-one-out bigram perplexity. We also give some details on the usage of class finding techniques and m-gram models, which can be crucial to successful applications of this technique.
HMM And Neural Network Based Speech Act Detection
, 1999
Cited by 13 (1 self)
We present an incremental lattice generation approach to speech act detection for spontaneous and overlapping speech in telephone conversations (CallHome Spanish). At each stage of the process it is therefore possible to use different models after the initial HMM models have generated a reasonable set of hypotheses. These lattices can be processed further by more complex models. This study shows how neural networks can be used very effectively in the classification of speech acts. We find that speech acts can be classified better using the neural net based approach than using the more classical n-gram backoff model approach. The best resulting neural network operates only on unigrams, and the integration of the n-gram backoff model as a prior to the model reduces the performance of the model. The neural network is therefore more likely to be robust against errors from an LVCSR system and can potentially be trained from a smaller database. To appear in: International Conference on Acoustics...
Towards The Detection And Description Of Textual Meaning Indicators In Spontaneous Conversations
- In Proceedings of Eurospeech
, 1999
Cited by 7 (3 self)
The description of textual and stylistic features has so far been largely neglected in the empirical study of conversational speech. In this paper we want to make a couple of strong initial points towards the use of textual meaning and stylistic features in language engineering. First of all, we want to show that there are features other than the traditional ones in spontaneous speech that are worth studying and that might reveal good information: these are related to the interactive nature of the language and to the distribution of the most frequent (non-topical) words. Secondly, we want to present two tasks that we have chosen as our benchmark and present detection results. Finally, we want to motivate how this can be used in information access applications. To appear in: European Conference On Speech Communication And Technology (EUROSPEECH '99), Budapest, Hungary, September 5-9, 1999 1. INTRODUCTION The study of textual meaning has often been confined to the discussion of thematic or to...

