• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Advanced Search Include Citations

Tools

Sorted by:
Try your query at:
Semantic Scholar Scholar Academic
Google Bing DBLP
Results 1 - 10 of 62,441
Next 10 →

A comparison of event models for Naive Bayes text classification

by Andrew McCallum, Kamal Nigam , 1998
"... Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey ..."
Abstract - Cited by 1025 (26 self) - Add to MetaCart
.g. Larkey and Croft 1996; Koller and Sahami 1997). Others use a multinomial model, that is, a uni-gram language model with integer word counts (e.g. Lewis and Gale 1994; Mitchell 1997). This paper aims to clarify the confusion by describing the differences and details of these two models, and by empirically

SRILM -- An extensible language modeling toolkit

by Andreas Stolcke - IN PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING (ICSLP 2002 , 2002
"... SRILM is a collection of C++ libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and other applications. SRILM is freely available for noncommercial purposes. The toolkit supports creation ..."
Abstract - Cited by 1218 (21 self) - Add to MetaCart
creation and evaluation of a variety of language model types based on N-gram statistics, as well as several related tasks, such as statistical tagging and manipulation of N-best lists and word lattices. This paper summarizes the functionality of the toolkit and discusses its design and implementation

A Language Modeling Approach to Information Retrieval

by Jay M. Ponte, W. Bruce Croft , 1998
"... Models of document indexing and document retrieval have been extensively studied. The integration of these two classes of models has been the goal of several researchers but it is a very difficult problem. We argue that much of the reason for this is the lack of an adequate indexing model. This sugg ..."
Abstract - Cited by 1154 (42 self) - Add to MetaCart
an approach to retrieval based on probabilistic language modeling. We estimate models for each document individually. Our approach to modeling is non-parametric and integrates document indexing and document retrieval into a single model. One advantage of our approach is that collection statistics which

An Empirical Study of Smoothing Techniques for Language Modeling

by Stanley F. Chen , 1998
"... We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Br ..."
Abstract - Cited by 1224 (21 self) - Add to MetaCart
We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e

Class-Based n-gram Models of Natural Language

by Peter F. Brown, Peter V. deSouza, Robert L. Mercer, Vincent J. Della Pietra, Jenifer C. Lai - Computational Linguistics , 1992
"... We address the problem of predicting a word from previous words in a sample of text. In particular we discuss n-gram models based on calsses of words. We also discuss several statistical algoirthms for assigning words to classes based on the frequency of their co-occurrence with other words. We find ..."
Abstract - Cited by 986 (5 self) - Add to MetaCart
We address the problem of predicting a word from previous words in a sample of text. In particular we discuss n-gram models based on calsses of words. We also discuss several statistical algoirthms for assigning words to classes based on the frequency of their co-occurrence with other words. We

Head-Driven Statistical Models for Natural Language Parsing

by Michael Collins , 1999
"... ..."
Abstract - Cited by 1158 (15 self) - Add to MetaCart
Abstract not found

The Lorel Query Language for Semistructured Data

by Serge Abiteboul, Dallan Quass, Jason Mchugh, Jennifer Widom, Janet Wiener - International Journal on Digital Libraries , 1997
"... We present the Lorel language, designed for querying semistructured data. Semistructured data is becoming more and more prevalent, e.g., in structured documents such as HTML and when performing simple integration of data from multiple sources. Traditional data models and query languages are inapprop ..."
Abstract - Cited by 731 (29 self) - Add to MetaCart
We present the Lorel language, designed for querying semistructured data. Semistructured data is becoming more and more prevalent, e.g., in structured documents such as HTML and when performing simple integration of data from multiple sources. Traditional data models and query languages

The Semantics of Predicate Logic as a Programming Language

by M. H. Van Emden, R. A. Kowalski - JOURNAL OF THE ACM , 1976
"... Sentences in first-order predicate logic can be usefully interpreted as programs In this paper the operational and fixpoint semantics of predicate logic programs are defined, and the connections with the proof theory and model theory of logic are investigated It is concluded that operational seman ..."
Abstract - Cited by 808 (18 self) - Add to MetaCart
Sentences in first-order predicate logic can be usefully interpreted as programs In this paper the operational and fixpoint semantics of predicate logic programs are defined, and the connections with the proof theory and model theory of logic are investigated It is concluded that operational

A Maximum Entropy approach to Natural Language Processing

by Adam L. Berger, Stephen A. Della Pietra , Vincent J. Della Pietra - COMPUTATIONAL LINGUISTICS , 1996
"... The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper we des ..."
Abstract - Cited by 1366 (5 self) - Add to MetaCart
describe a method for statistical modeling based on maximum entropy. We present a maximum-likelihood approach for automatically constructing maximum entropy models and describe how to implement this approach efficiently, using as examples several problems in natural language processing.

Logical foundations of object-oriented and frame-based languages

by Michael Kifer, Georg Lausen, James Wu - JOURNAL OF THE ACM , 1995
"... We propose a novel formalism, called Frame Logic (abbr., F-logic), that accounts in a clean and declarative fashion for most of the structural aspects of object-oriented and frame-based languages. These features include object identity, complex objects, inheritance, polymorphic types, query methods, ..."
Abstract - Cited by 876 (65 self) - Add to MetaCart
that come from object-oriented programming have direct representation in F-logic; other, secondary aspects of this paradigm are easily modeled as well. The paper also discusses semantic issues pertaining to programming with a deductive object-oriented language based on a subset of F-logic.
Next 10 →
Results 1 - 10 of 62,441
Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University