Dependency Language Modeling
1997
Cited by 4 (3 self)
This report summarizes the work of the Dependency Language Modeling group at the 1996 Summer Speech Workshop at the Center for Language and Speech Processing at Johns Hopkins University (WS96). We motivate and describe a novel statistical language model that models the syntactic dependencies between words. The model is formulated in the maximum entropy framework, which expresses statistical constraints on the frequencies of various types of dependencies, as well as the standard N-gram statistics. We describe how this model was applied to the recognition of spontaneous English speech from the Switchboard corpus. Due to implementation constraints, only a reduced version of our model could be tested so far. The model gave a modest improvement over an N-gram baseline model. A by-product of the project is the Maximum Entropy Modeling Toolkit (MEMT), a freely available software package for domain-independent maximum entropy modeling. 1 Introduction Current state-of-the-art language models for s...
Introduction to Corpus-based Statistics-oriented (CBSO) Techniques
Pre-Conference Workshop on Corpus-based NLP, ROCLING VII, National Tsing-Hua Univ, 1994
Cited by 3 (2 self)
A Corpus-Based Statistics-Oriented (CBSO) methodology, which attempts to avoid the drawbacks of both traditional rule-based approaches and purely statistical approaches, is introduced in this paper. Rule-based approaches, with rules induced by human experts, had been the dominant paradigm in the natural language processing community. Such approaches, however, suffer from serious difficulties in knowledge acquisition in terms of cost and consistency. Therefore, it is very difficult to scale such systems up. Statistical methods, which can acquire knowledge from corpora automatically, are becoming more and more popular, in part to amend the shortcomings of rule-based approaches. However, most simple statistical models, which adopt almost nothing from existing linguistic knowledge, often result in a large parameter space and thus require an unaffordably large training corpus even for well-justified linguistic phenomena. The corpus-based statistics-oriented (CBSO) approach is a compromise between the two extremes of the spectrum for knowledge acquisition. CBSO approach
Analyzing And Improving Statistical Language Models For Speech Recognition
1994
Cited by 3 (0 self)
A speech recognizer is a device that translates speech into text. Many current speech recognizers contain two components, an acoustic model and a statistical language model. The acoustic model indicates how likely it is that a certain word corresponds to a part of the acoustic signal (e.g. the speech). The statistical language model indicates how likely it is that a certain word will be spoken next, given the words recognized so far. Even though the acoustic model might not, for example, be able to decide between the acoustically similar words "peach" and "teach", the statistical language model can indicate that the word "peach" is more likely if the previously recognized words are "He ate the". Current speech recognizers perform well on constrained tasks, but the goal of continuous, speaker-independent speech recognition in potentially noisy environments with a very large vocabulary has not been reached so far. How can statistical language models be improved so that more complex tasks c...
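As a rough illustration of the "peach"/"teach" example in the abstract above, the sketch below combines an acoustic score with a language-model probability to choose between two candidate words. The function name `pick_word`, the `lm_weight` parameter, and all of the numbers are invented for illustration and are not taken from the cited work.

```python
import math

def pick_word(acoustic, lm, lm_weight=1.0):
    """Return the candidate maximizing log P_acoustic + lm_weight * log P_lm."""
    def score(word):
        return math.log(acoustic[word]) + lm_weight * math.log(lm[word])
    return max(acoustic, key=score)

# Hypothetical scores: the acoustic model cannot really separate the two words
# and even leans slightly towards "teach".
acoustic = {"peach": 0.48, "teach": 0.52}

# Hypothetical language-model probabilities given the history "He ate the".
lm_given_history = {"peach": 0.01, "teach": 0.0001}

best = pick_word(acoustic, lm_given_history)
acoustic_only = pick_word(acoustic, lm_given_history, lm_weight=0.0)
```

With these made-up numbers the language model overrides the slight acoustic preference; setting `lm_weight` to zero falls back to the acoustic score alone.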
MLP Emulation of N-Gram Models as a First Step to Connectionist Language Modeling
In: Proc. of the ICANN, 1992
Cited by 1 (1 self)
In problems such as automatic speech recognition and machine translation, where the system response must be a sentence in a given language, language models are employed in order to improve system performance. These language models are usually N-gram models (for instance, bigram or trigram models) which are estimated from large text databases using the occurrence frequencies of these N-grams.
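The frequency-based estimation mentioned in the abstract above can be sketched for the bigram case as follows. This is a minimal relative-frequency estimate without smoothing; the function name `train_bigram`, the boundary markers `<s>` and `</s>`, and the toy corpus are assumptions made for the example, not details from the paper.

```python
from collections import Counter

def train_bigram(sentences):
    """Estimate P(word | prev) by relative frequency from tokenized sentences."""
    history_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        # Pad with hypothetical sentence-boundary markers.
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1

    def prob(word, prev):
        # Unseen histories get probability 0 in this unsmoothed sketch.
        if history_counts[prev] == 0:
            return 0.0
        return bigram_counts[(prev, word)] / history_counts[prev]

    return prob

# Toy corpus, invented for illustration.
corpus = [
    ["he", "ate", "the", "peach"],
    ["he", "ate", "the", "pear"],
    ["she", "ate", "the", "peach"],
]
p = train_bigram(corpus)
p_peach = p("peach", "the")   # count("the peach") / count("the")
p_teach = p("teach", "the")   # unseen bigram
```

The zero probability for the unseen bigram is the usual weakness of raw relative frequencies, which is why practical N-gram models add some form of smoothing.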
Log-Linear Interpolation of Language Models
2000
Cited by 1 (1 self)
Building probabilistic models of language is a central task in natural language and speech processing, allowing the syntactic and/or semantic (and recently pragmatic) constraints of the language to be integrated into the systems. Probabilistic language models are an attractive alternative to the more traditional rule-based systems, such as context-free grammars, because of the recent availability of massive amounts of text corpora which can be used to efficiently train the models, and because, instead of the binary grammaticality judgement offered by rule-based systems, the likelihood of any sequence of lexical units can be obtained, which is a crucial factor in such tasks as speech recognition. Probabilistic language models also find application in part-of-speech tagging, machine translation, semantic disambiguation and numerous other fields.
Radiological Reporting Based on Voice Recognition
Human-computer interaction: third International conference, EWHCI: selected papers. Lecture Notes in Computer Science, 1993
Cited by 1 (1 self)
Speech recognition has proved to be a natural interaction modality and an effective technology for medical reporting, in particular in the speciality of radiology. The high-volume text creation requirement and the complex structure of these texts make voice technologies useful. By employing speech, professionals in the field can generate reports and do so at a speed that approaches traditional dictation methods. However, the integration of speech recognition in a user interface creates new problems: speech recognizers may introduce errors and, moreover, they should be adaptable to spoken language variations. This paper describes a radiological reporting system and the related motivations for the use of the speech modality. A preliminary evaluation of the system has shown that, on average, although text recalling functions and keyword shortcuts are available, more than two thirds of a radiological report are generated by means of dictation. 1 Introduction Recent progress in Automatic Speec...
On the Relation Between Additive Smoothing and Universal Coding
We analyze the performance of smoothing methods for language modeling from the perspective of universal compression. We use existing asymptotic bounds on the performance of simple additive rules for compression of finite-alphabet memoryless sources to explain the empirical predictive abilities of additive smoothing techniques. We further suggest a smoothing method that overcomes some of the problems observed in previous approaches. The new method outperforms existing ones on the Wall Street Journal (WSJ) database for bigram and trigram models. We then suggest possible directions for future research.
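For readers unfamiliar with the "simple additive rules" the abstract above refers to, the sketch below shows the generic add-delta estimate for a memoryless (unigram) source, P(w) = (c(w) + delta) / (N + delta * |V|). It is only the textbook rule, not the new smoothing method proposed in the paper; the function name `additive_smoothing`, the toy vocabulary, and the choice of `delta` are assumptions for the example.

```python
from collections import Counter

def additive_smoothing(tokens, vocab, delta=1.0):
    """Add-delta estimate over a fixed vocabulary.

    Assumes every token in `tokens` is a member of `vocab`.
    """
    counts = Counter(tokens)
    total = len(tokens) + delta * len(vocab)
    return {w: (counts[w] + delta) / total for w in vocab}

# Toy data, invented for illustration: "c" never occurs in the sample.
vocab = ["a", "b", "c"]
probs = additive_smoothing(["a", "a", "b"], vocab, delta=0.5)
```

Every vocabulary item, including the unseen "c", receives nonzero probability, and the estimates still sum to one.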
A Universal Compression Perspective of Smoothing
We analyze smoothing algorithms from a universal-compression perspective. Instead of evaluating their performance on an empirical sample, we analyze their performance on the most inconvenient sample possible. Consequently the performance of the algorithm can be guaranteed even on unseen data. We show that universal compression bounds can explain the empirical performance of several smoothing methods. We also describe a new interpolated additive smoothing algorithm, and show that it has lower training complexity and better compression performance than existing smoothing techniques. Key words: Language modeling, universal compression, smoothing
Improving Language Models by Using Distant Information
This study examines an original way to take advantage of distant information in statistical language models. We show that it is possible to use n-gram models with histories different from those used during training. These models are called crossing context models. Our study deals with classical and distant n-gram models. A mixture of four models is proposed and evaluated. A bigram linear mixture achieves an improvement of 14% in terms of perplexity. Moreover, the trigram mixture outperforms the standard trigram by 5.6%. These improvements have been obtained without making standard n-gram models more complex. The resulting mixture language model has been integrated into a speech recognition system. In evaluation, it achieves a slight improvement in word error rate on the data used for the francophone evaluation campaign ESTER [1]. Finally, the impact of the proposed crossing context language models on performance is presented according to various speakers.
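The abstract above reports results for linear mixtures of n-gram models measured in perplexity. The sketch below shows, under simplifying assumptions, how a linear mixture of component probabilities and the perplexity of a word sequence are typically computed. It uses two components rather than the paper's four, and the names `mixture_prob` and `perplexity`, the weights, and all probabilities are hypothetical.

```python
import math

def mixture_prob(component_probs, weights):
    """Linear interpolation: sum_i weights[i] * P_i(word | history)."""
    return sum(w * p for w, p in zip(weights, component_probs))

def perplexity(word_probs):
    """Perplexity of a sequence given the probability assigned to each word."""
    log_sum = sum(math.log2(p) for p in word_probs)
    return 2 ** (-log_sum / len(word_probs))

# Hypothetical per-word probabilities from two component models
# (for instance a classical bigram and a distant bigram).
model_a = [0.2, 0.05, 0.1]
model_b = [0.1, 0.2, 0.1]

# Equal weights, chosen arbitrarily; in practice weights are tuned on held-out data.
weights = [0.5, 0.5]
mixed = [mixture_prob([a, b], weights) for a, b in zip(model_a, model_b)]

ppl_a = perplexity(model_a)
ppl_mixed = perplexity(mixed)
```

Lower perplexity means the model assigns higher probability to the observed words; whether a mixture actually helps depends on the component models and on how the weights are chosen.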

