Results 1 - 10 of 62,379

SRILM -- An extensible language modeling toolkit

by Andreas Stolcke - In Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), 2002
"... SRILM is a collection of C++ libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and other applications. SRILM is freely available for noncommercial purposes. The toolkit supports creation ..."
Cited by 1218 (21 self)

Time-Based Language Models

by Xiaoyan Li, W. Bruce Croft, 2003
"... We explore the relationship between time and relevance using TREC ad-hoc queries. A type of query is identified that favors very recent documents. We propose a time-based language model approach to retrieval for these queries. We show how time can be incorporated into both query-likelihood models an ..."
Cited by 440 (36 self)

An Empirical Study of Smoothing Techniques for Language Modeling

by Stanley F. Chen, 1998
"... We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Br ..."
Cited by 1224 (21 self)

A Language Modeling Approach to Information Retrieval

by Jay M. Ponte, W. Bruce Croft, 1998
"... Models of document indexing and document retrieval have been extensively studied. The integration of these two classes of models has been the goal of several researchers but it is a very difficult problem. We argue that much of the reason for this is the lack of an adequate indexing model. This sugg ..."
Cited by 1154 (42 self)
an approach to retrieval based on probabilistic language modeling. We estimate models for each document individually. Our approach to modeling is non-parametric and integrates document indexing and document retrieval into a single model. One advantage of our approach is that collection statistics which

A Neural Probabilistic Language Model

by Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Jauvin - Journal of Machine Learning Research, 2003
"... A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen ..."
Cited by 447 (19 self)

Statistical Language Modeling Using the CMU-Cambridge Toolkit

by Philip Clarkson, Ronald Rosenfeld, 1997
"... The CMU Statistical Language Modeling toolkit was released in 1994 in order to facilitate the construction and testing of bigram and trigram language models. It is currently in use in over 40 academic, government and industrial laboratories in over 12 countries. This paper presents a new version of ..."
Cited by 387 (4 self)

Self-organized language modeling for speech recognition

by F. Jelinek, B. Merialdo, S. Roukos, M. Strauss - Readings in Speech Recognition, 1990
"... In the case of a trigram language model, the probability of the next word conditioned on the previous two words is estimated from a large corpus of text. The resulting static trigram language model (STLM) has fixed probabilities that are independent of the document being dictated. To improve the l ..."
Cited by 394 (6 self)
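
The static trigram estimate described in the snippet above can be illustrated with a minimal sketch. It assumes plain maximum-likelihood counts with no smoothing; the function names and the toy corpus are hypothetical and are not taken from the paper, whose own adaptive method is not visible in the truncated abstract.

```python
from collections import Counter


def train_trigram(tokens):
    """Count trigrams and the two-word contexts that precede a third word."""
    trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
    trigram_counts = Counter(trigrams)
    # A context is counted only where a following word exists.
    context_counts = Counter((w1, w2) for w1, w2, _ in trigrams)
    return trigram_counts, context_counts


def trigram_prob(w1, w2, w3, trigram_counts, context_counts):
    """Maximum-likelihood estimate of P(w3 | w1, w2); 0.0 for unseen contexts."""
    context = context_counts[(w1, w2)]
    if context == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / context


# Toy corpus (hypothetical); a real model would be trained on a large text collection.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
tri, ctx = train_trigram(corpus)
p_mat = trigram_prob("on", "the", "mat", tri, ctx)
```

Because the counts are fixed after training, the probabilities do not depend on the document being processed, which is the "static" property the snippet refers to. Unseen trigrams receive probability zero here, which is the sparse-data problem that smoothing methods address.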

Estimation of probabilities from sparse data for the language model component of a speech recognizer

by Slava M. Katz - IEEE Transactions on Acoustics, Speech and Signal Processing, 1987
"... The description of a novel type of m-gram language model is given. The model offers, via a nonlinear recursive procedure, a computation and space efficient solution to the problem of estimating probabilities from sparse data. This solution compares favorably to other proposed methods. Wh ..."
Cited by 799 (2 self)

Parsimonious Language Models for Information Retrieval

by Djoerd Hiemstra, Stephen Robertson, Hugo Zaragoza - In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004
"... We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly address the relation between levels of language models that are typically used for smoothing. As such, ..."
Cited by 322 (41 self)

A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval

by Chengxiang Zhai, John Lafferty
Cited by 961 (40 self)
Abstract not found

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University