An Empirical Evaluation of LFG-DOP
In Proceedings of the 19th International Conference on Computational Linguistics, 2000
Cited by 8 (2 self)
This paper presents an empirical assessment of the LFG-DOP model introduced by Bod & Kaplan (1998). The parser we describe uses fragments from LFG-annotated sentences to parse new sentences and Monte Carlo techniques to compute the most probable parse. While our main goal is to test Bod & Kaplan's model, we will also test a version of LFG-DOP which treats generalized fragments as previously unseen events. Experiments with the Verbmobil and Homecentre corpora show that our version of LFG-DOP outperforms Bod & Kaplan's model, and that LFG's functional information improves the parse accuracy of tree structures.
An Improved Parser for Data-Oriented Lexical-Functional Analysis
In Proceedings of the 38th Conference of the Association for Computational Linguistics
Cited by 8 (4 self)
We present an LFG-DOP parser which uses fragments from LFG-annotated sentences to parse new sentences. Experiments with the Verbmobil and Homecentre corpora show that (1) Viterbi n-best search performs about 100 times faster than Monte Carlo search while both achieve the same accuracy; (2) the DOP hypothesis, which states that parse accuracy increases with increasing fragment size, is confirmed for LFG-DOP; (3) LFG-DOP's relative frequency estimator performs worse than a discounted frequency estimator; and (4) LFG-DOP significantly outperforms Tree-DOP if evaluated on tree structures only.
On the Use of Grammar Based Language Models for Statistical Machine Translation
6th Int. Workshop on Parsing Technologies, 1999
Cited by 6 (5 self)
In this paper, we describe some concepts of language models beyond the commonly used standard trigram and prove the need for such language models in statistical machine translation. In statistical machine translation, the language model is the a priori knowledge source of the system about the target language. The most important demands on the language model in statistical machine translation are to ensure the correct word order, given a certain choice of words, and to score the translations selected by the translation model $\Pr(f_1^J \mid e_1^I)$ in view of the syntactic context. Besides investigating standard m-grams with long histories, we examined the use of part-of-speech based models as well as linguistically motivated grammars with stochastic parsing as a special type of language model. Translation results are given on the Verbmobil task, where translations are performed from German to English, with vocabulary sizes of 6500 and 4000 words respectively.
A Comparison Of Dialogue-State Dependent Language Models
In Proceedings of ESCA Workshop on Interactive Dialogue in Multi-Modal Systems, Irsee, 1999
Cited by 6 (1 self)
Dialogue-state dependent language models in automatic inquiry systems can be employed to improve speech recognition and understanding. In this paper, the dialogue state is defined by the set of parameters contained in the system prompt. Using this knowledge, a separate language model for each state can be constructed. In order to obtain robust language models, we study the linear interpolation of all dialogue-state dependent language models and an automatic text clustering algorithm. In particular, we extend the clustering algorithm so as to automatically determine the optimal number of clusters. These clusters are then combined with linear interpolation. We present experimental results on a Dutch corpus which has been recorded in the Netherlands with a train timetable information system in the framework of the ARISE project [1]. The perplexity, the word error rate, and the attribute error rate can be reduced significantly with all of these methods.
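The linear interpolation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the unigram models, the add-alpha smoothing, the toy utterances, the state names, and the interpolation weights are all assumptions chosen only to show how state-specific models are mixed as p(w) = sum_k lambda_k * p_k(w).

```python
from collections import Counter

def unigram_model(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def interpolate(models, weights):
    """Linear interpolation: p(w) = sum_k lambda_k * p_k(w); weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    vocab = models[0].keys()
    return {w: sum(lam * m[w] for lam, m in zip(weights, models)) for w in vocab}

# Hypothetical toy data: utterances grouped by dialogue state.
data = {
    "ask_destination": ["to amsterdam", "amsterdam please"],
    "ask_time": ["at nine", "nine please"],
}
vocab = sorted({w for sents in data.values() for s in sents for w in s.split()})
state_models = {state: unigram_model(sents, vocab) for state, sents in data.items()}

# Mix the model of the current state with the other state's model.
# The weights 0.7 / 0.3 are arbitrary; in practice they would be tuned on held-out data.
mixed = interpolate(
    [state_models["ask_destination"], state_models["ask_time"]],
    [0.7, 0.3],
)
```

The mixed distribution still sums to one, and words seen only in the other dialogue state keep more probability mass than they would under the state-specific model alone, which is the robustness effect interpolation is meant to provide.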
Smoothing Methods In Maximum Entropy Language Modeling
In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, volume I, 1999
Cited by 4 (1 self)
This paper discusses various aspects of smoothing techniques in maximum entropy language modeling, a topic not sufficiently covered by previous publications. We show (1) that straightforward maximum entropy models with nested features, e.g. tri-, bi-, and unigrams, result in unsmoothed relative frequency models; (2) that maximum entropy models with nested features and discounted feature counts approximate backing-off smoothed relative frequency models with Kneser's advanced marginal back-off distribution, which explains some of the reported success of maximum entropy models in the past; and (3) perplexity results for nested and non-nested features, e.g. trigrams and distance-trigrams, on a 4-million word subset of the Wall Street Journal Corpus, showing that the smoothing method has more effect on the perplexity than the method used to combine information.
Hidden Model Sequence Models for Automatic Speech Recognition
2001
Cited by 4 (0 self)
Most modern automatic speech recognition systems make use of acoustic models based on hidden Markov models. To obtain reasonable recognition performance within a large vocabulary framework, the acoustic models usually include a pronunciation model, together with complex parameter tying schemes. In many cases the pronunciation model operates on a phoneme level and is derived independently of the underlying models. In contrast, this work is aimed at improving pronunciation modelling on a sub-phone level in a combined framework. The modelling of pronunciation variation is assumed to be of special importance for recognition of spontaneous speech.
A Data-Oriented Parsing Model for Lexical-Functional Grammar
In Data-Oriented Parsing, ed. by Rens Bod, Remko Scha, & Khalil Sima’an, 2003
Cited by 4 (0 self)
Data-Oriented Parsing (DOP) models of natural language propose that human language processing works with representations of concrete past language experiences rather than with abstract linguistic rules. These models operate by decomposing the given representations into fragments and recomposing those pieces to analyze new utterances. A probability model is used to select from all possible analyses of an utterance the most likely one. Previous DOP models were based on simple tree representations that neglect grammatical functions and syntactic features (Tree-DOP). In this paper, we present a new DOP model based on the more articulated representations of Lexical-Functional Grammar theory (LFG-DOP). LFG-DOP triggers a new, corpus-based notion of grammaticality, and an interestingly different class of probability models. An empirical evaluation of the model shows that larger as well as richer fragments improve performance. Finally, we go into some of the conceptual implications of our approach.
Reversing and Smoothing the Multinomial Naive Bayes Text Classifier
In Proceedings of the 2nd Int. Workshop on Pattern Recognition in Information Systems (PRIS 2002), 2002
Cited by 3 (1 self)
The naive Bayes text classifier has long been a core technique in information retrieval and, more recently, it has emerged as a focus of research itself in machine learning. This paper is concerned with the naive Bayes text classifier in its multinomial model instantiation.
Tagging a Corpus of Spoken Swedish
International Journal of Corpus Linguistics, 2001
Cited by 3 (0 self)
In this article, we present and evaluate a method for training a statistical part-of-speech tagger on data from written language and then adapting it to the requirements of tagging a corpus of transcribed spoken language, in our case spoken Swedish. This is currently a significant problem for many research groups working with spoken language, since the availability of tagged training data from spoken language is still very limited for most languages. The overall accuracy of the tagger developed for spoken Swedish is quite respectable, varying from 95% to 97% depending on the tagset used. In conclusion, we argue that the method presented here gives good tagging accuracy with relatively little effort.
Effect of feature smoothing methods in text classification tasks
In Proc. 4th International Workshop on Pattern Recognition in Information Systems, PRIS, 2004
Cited by 2 (1 self)
The number of features to be considered in a text classification system is given by the size of the vocabulary, and this is normally in the range of tens or hundreds of thousands even for small tasks. This leads to parameter estimation problems for statistically based methods, and countermeasures have to be found. One of the most widely used methods consists of reducing the size of the vocabulary according to a well-defined criterion in order to be able to reliably estimate the set of parameters. In the field of language modeling this problem is also encountered, and several smoothing techniques have been developed. In this paper we show that using the full vocabulary together with a suitable choice of smoothing technique for the text classification task obtains better results than the standard feature selection techniques.
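As a rough illustration of keeping the full vocabulary and relying on smoothing instead of feature selection, the following Python sketch trains a multinomial naive Bayes classifier with add-alpha (Laplace) smoothing. This is an assumed, simple stand-in for the smoothing techniques the paper compares, not the authors' method; the function names, the toy documents, and the labels are hypothetical.

```python
import math
from collections import Counter

def train(docs, alpha=1.0):
    """docs: list of (label, text). Returns log-priors and smoothed log-likelihoods."""
    vocab = sorted({w for _, text in docs for w in text.split()})  # full vocabulary, no pruning
    labels = sorted({label for label, _ in docs})
    log_prior, log_like = {}, {}
    for c in labels:
        texts = [text for label, text in docs if label == c]
        counts = Counter(w for t in texts for w in t.split())
        total = sum(counts.values()) + alpha * len(vocab)
        log_prior[c] = math.log(len(texts) / len(docs))
        # Add-alpha smoothing gives every vocabulary word a non-zero probability in every class.
        log_like[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return log_prior, log_like

def classify(text, log_prior, log_like):
    """Return the label with the highest posterior log-score; unknown words are ignored."""
    def score(c):
        return log_prior[c] + sum(log_like[c][w] for w in text.split() if w in log_like[c])
    return max(log_prior, key=score)

# Hypothetical toy corpus.
docs = [
    ("sports", "goal match team"),
    ("sports", "team win match"),
    ("finance", "stock market price"),
    ("finance", "market price fall"),
]
log_prior, log_like = train(docs)
label = classify("team price match", log_prior, log_like)
```

Without smoothing, the word "price" would have zero probability under the "sports" class and rule that class out entirely; with smoothing, the two sports words outweigh the single finance word.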

