Results 1 - 10 of 62,379
SRILM -- An extensible language modeling toolkit
- In Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002)
, 2002
"... SRILM is a collection of C++ libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and other applications. SRILM is freely available for noncommercial purposes. The toolkit supports creation ..."
Cited by 1218 (21 self)
Time-Based Language Models
, 2003
"... We explore the relationship between time and relevance using TREC ad-hoc queries. A type of query is identified that favors very recent documents. We propose a time-based language model approach to retrieval for these queries. We show how time can be incorporated into both query-likelihood models an ..."
Cited by 440 (36 self)
An Empirical Study of Smoothing Techniques for Language Modeling
, 1998
"... We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Br ..."
Cited by 1224 (21 self)
A Language Modeling Approach to Information Retrieval
, 1998
"... Models of document indexing and document retrieval have been extensively studied. The integration of these two classes of models has been the goal of several researchers but it is a very difficult problem. We argue that much of the reason for this is the lack of an adequate indexing model. This sugg ..."
Cited by 1154 (42 self)
an approach to retrieval based on probabilistic language modeling. We estimate models for each document individually. Our approach to modeling is non-parametric and integrates document indexing and document retrieval into a single model. One advantage of our approach is that collection statistics which
A Neural Probabilistic Language Model
- Journal of Machine Learning Research
, 2003
"... A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen ..."
Cited by 447 (19 self)
Statistical Language Modeling Using the CMU-Cambridge Toolkit
, 1997
"... The CMU Statistical Language Modeling toolkit was released in 1994 in order to facilitate the construction and testing of bigram and trigram language models. It is currently in use in over 40 academic, government and industrial laboratories in over 12 countries. This paper presents a new version of ..."
Cited by 387 (4 self)
Self-organized language modeling for speech recognition
- Readings in Speech Recognition
, 1990
"... In the case of a trigram language model, the probability of the next word conditioned on the previous two words is estimated from a large corpus of text. The resulting static trigram language model (STLM) has fixed probabilities that are independent of the document being dictated. To improve the l ..."
Cited by 394 (6 self)
Estimation of probabilities from sparse data for the language model component of a speech recognizer
- IEEE Transactions on Acoustics, Speech and Signal Processing
, 1987
"... The description of a novel type of m-gram language model is given. The model offers, via a nonlinear recursive procedure, a computation and space efficient solution to the problem of estimating probabilities from sparse data. This solution compares favorably to other proposed methods. Wh ..."
Cited by 799 (2 self)
Parsimonious Language Models for Information Retrieval
- In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, 2004
"... We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly address the relation between levels of language models that are typically used for smoothing. As such, ..."
Cited by 322 (41 self)