Results 1 – 10 of 20,463
Toward a model of text comprehension and production
Psychological Review, 1978
"... The semantic structure of texts can be described both at the local microlevel and at a more global macrolevel. A model for text comprehension based on this notion accounts for the formation of a coherent semantic text base in terms of a cyclical process constrained by limitations of working memory. ..."
Cited by 557 (12 self)
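As a rough illustration of the cyclical, buffer-limited processing the abstract describes, here is a minimal Python sketch; the buffer size, the keep-most-recent retention rule, and the proposition format are simplifying assumptions, not the paper's actual leading-edge strategy.

```python
# Minimal sketch of cyclical processing: propositions arrive in chunks
# (cycles), only a small working-memory buffer is carried forward, and a
# new proposition counts as coherent if it shares an argument with
# something still in the buffer. Buffer size and the keep-most-recent
# selection rule are illustrative simplifications.
BUFFER_SIZE = 4

def process_cycles(cycles):
    buffer, inferences = [], 0
    for cycle in cycles:
        for prop in cycle:                      # prop = (predicate, args)
            args = set(prop[1])
            linked = any(args & set(p[1]) for p in buffer)
            if not linked and buffer:
                inferences += 1                 # reinstatement/inference needed
            buffer.append(prop)
        buffer = buffer[-BUFFER_SIZE:]          # working-memory limitation
    return inferences

cycles = [
    [("own", ("mary", "dog")), ("bark", ("dog",))],
    [("sleep", ("cat",))],                      # no shared argument -> inference
]
print(process_cycles(cycles))
```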
A comparison of event models for Naive Bayes text classification
1998
"... Recent work in text classification has used two different firstorder probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multivariate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey ..."
Cited by 1025 (26 self)
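The two event models the abstract contrasts can be sketched briefly; scikit-learn and the toy corpus are my choices for illustration, not the paper's setup.

```python
# Multivariate Bernoulli model over binary word-presence features versus a
# multinomial model over word counts, both under the naive Bayes assumption.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

docs = ["cheap pills cheap", "meeting agenda attached", "cheap meds now", "see agenda for meeting"]
labels = [1, 0, 1, 0]                      # 1 = spam, 0 = ham

# Multivariate Bernoulli: binary word features, absence is informative.
X_bin = CountVectorizer(binary=True).fit_transform(docs)
print(BernoulliNB().fit(X_bin, labels).predict(X_bin))

# Multinomial: word counts, so document length and repetition matter.
X_cnt = CountVectorizer().fit_transform(docs)
print(MultinomialNB().fit(X_cnt, labels).predict(X_cnt))
```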
Term-weighting approaches in automatic text retrieval
Information Processing and Management, 1988
"... The experimental evidence accumulated over the past 20 years indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results that are superior to those obtainable with other more elaborate text representations. These results depend crucia ..."
Cited by 2189 (10 self)
 Add to MetaCart
The experimental evidence accumulated over the past 20 years indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results that are superior to those obtainable with other more elaborate text representations. These results depend
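One familiar instance of single-term weighting in this tradition is tf-idf; the hand-rolled sketch below is illustrative and is not the specific weighting formula evaluated in the paper.

```python
# Weight each term by its in-document frequency times the log inverse
# document frequency across a toy collection.
import math
from collections import Counter

docs = [
    "retrieval of text documents",
    "weighted terms improve retrieval",
    "single term indexing of text",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)
df = Counter(t for doc in tokenized for t in set(doc))   # document frequency

def tfidf(doc):
    tf = Counter(doc)
    return {t: tf[t] * math.log(N / df[t]) for t in tf}

for doc in tokenized:
    print(tfidf(doc))
```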
Parallel Networks that Learn to Pronounce English Text
Complex Systems, 1987
"... This paper describes NETtalk, a class of massivelyparallel network systems that learn to convert English text to speech. The memory representations for pronunciations are learned by practice and are shared among many processing units. The performance of NETtalk has some similarities with observed h ..."
Cited by 549 (5 self)
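The general setup, a network reading a sliding window of letters and emitting a phoneme for the centre letter, can be sketched as follows; the toy lexicon, window width, and use of scikit-learn's MLPClassifier stand in for the original architecture and training data.

```python
# Map a one-hot-encoded window of characters to a phoneme label for the
# centre letter, using a small feed-forward network.
import numpy as np
from sklearn.neural_network import MLPClassifier

lexicon = {"cat": "k@t", "mat": "m@t", "cot": "kot"}   # letter-aligned toy pronunciations
letters = sorted({c for w in lexicon for c in "_" + w})
idx = {c: i for i, c in enumerate(letters)}

def window_features(word, pos, width=3):
    # one-hot encode a 3-letter window centred on position `pos`
    padded = "_" + word + "_"
    vec = np.zeros(width * len(letters))
    for k, c in enumerate(padded[pos:pos + width]):
        vec[k * len(letters) + idx[c]] = 1.0
    return vec

X = [window_features(w, i) for w, p in lexicon.items() for i in range(len(w))]
y = [p[i] for w, p in lexicon.items() for i in range(len(w))]

net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)
print(net.predict([window_features("cat", i) for i in range(3)]))
```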
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Cited by 1033 (15 self)
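A compact sketch of the labelled-plus-unlabelled EM loop the abstract describes: fit naive Bayes on the labelled documents, then alternate between assigning class posteriors to the unlabelled documents and refitting on everything with those posteriors as weights. The corpus, labels, and fixed iteration count are toy assumptions.

```python
import numpy as np
from scipy.sparse import vstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

labeled = ["cheap pills now", "project meeting agenda"]
labels = np.array([1, 0])
unlabeled = ["cheap meds cheap", "agenda for the project", "pills pills now"]

vec = CountVectorizer().fit(labeled + unlabeled)
Xl, Xu = vec.transform(labeled), vec.transform(unlabeled)

clf = MultinomialNB().fit(Xl, labels)
classes = clf.classes_
for _ in range(5):
    post = clf.predict_proba(Xu)                      # E-step: soft labels for unlabeled docs
    X_all = vstack([Xl] + [Xu] * len(classes))        # one weighted copy per class
    y_all = np.concatenate([labels] + [np.full(Xu.shape[0], c) for c in classes])
    w_all = np.concatenate([np.ones(len(labels))] + [post[:, i] for i in range(len(classes))])
    clf = MultinomialNB().fit(X_all, y_all, sample_weight=w_all)   # M-step

print(clf.predict(Xu))
```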
Latent Dirichlet allocation
Journal of Machine Learning Research, 2003
"... We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a threelevel hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, ..."
Cited by 4365 (92 self)
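The three-level generative story can be written out directly; vocabulary size, topic count, and hyperparameter values below are arbitrary.

```python
# Corpus-level Dirichlet priors, a per-document topic mixture, and a topic
# drawn for each word position.
import numpy as np

rng = np.random.default_rng(0)
V, K, alpha, beta = 6, 2, 0.5, 0.1             # vocab size, topics, priors

topics = rng.dirichlet([beta] * V, size=K)      # per-topic word distributions

def generate_document(n_words=10):
    theta = rng.dirichlet([alpha] * K)          # per-document topic mixture
    words = []
    for _ in range(n_words):
        z = rng.choice(K, p=theta)              # topic assignment for this word
        words.append(rng.choice(V, p=topics[z]))
    return words

print([generate_document() for _ in range(3)])
```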
Using Lexical Chains for Text Summarization
1997
"... We investigate one technique to produce a summary of an original text without requiring its full semantic interpretation, but instead relying on a model of the topic progression in the text derived from lexical chains. We present a new algorithm to compute lexical chains in a text, merging several r ..."
Cited by 451 (9 self)
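A drastically simplified version of the chaining step: each word either joins an existing chain it is related to (here, by repetition or a toy synonym table) or starts a new one. The real algorithm uses WordNet relations and sense disambiguation, which are omitted.

```python
# Scan the words of a text and attach each one to the first chain containing
# a related member; otherwise start a new chain.
SYNONYMS = {"car": {"automobile", "vehicle"}, "automobile": {"car", "vehicle"}}

def related(w1, w2):
    return w1 == w2 or w2 in SYNONYMS.get(w1, set())

def lexical_chains(words):
    chains = []
    for w in words:
        for chain in chains:
            if any(related(w, member) for member in chain):
                chain.append(w)
                break
        else:
            chains.append([w])
    return chains

text = "the car stopped near a tree the automobile was red the tree was tall".split()
print(lexical_chains(text))
```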
The Infinite Hidden Markov Model
Machine Learning, 2002
"... We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. Th ..."
Cited by 637 (41 self)
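The key trick, never materialising an infinite transition matrix, can be caricatured with a Polya-urn style sampler: from the current state, move to a previously used successor in proportion to its count, or open a new state in proportion to a concentration parameter. This single-level urn is a simplification of the paper's hierarchical construction and omits its oracle and self-transition hyperparameters.

```python
import random
from collections import defaultdict

alpha = 1.0                                       # concentration: taste for new states
counts = defaultdict(lambda: defaultdict(int))    # counts[s][s'] = observed transitions
next_state_id = 1

def sample_transition(state):
    global next_state_id
    successors = counts[state]
    total = sum(successors.values()) + alpha
    r = random.uniform(0, total)
    for s, c in successors.items():
        r -= c
        if r <= 0:                                # reuse an existing successor
            counts[state][s] += 1
            return s
    new_state = next_state_id                     # fell through: open a new state
    next_state_id += 1
    counts[state][new_state] += 1
    return new_state

state, trajectory = 0, [0]
for _ in range(20):
    state = sample_transition(state)
    trajectory.append(state)
print(trajectory)
```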
Maximum entropy Markov models for information extraction and segmentation
2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many textrelated tasks, such as partofspeech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Cited by 561 (18 self)
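The modelling move the abstract alludes to, replacing multinomial emissions with conditional maximum-entropy models, can be sketched as one logistic-regression classifier per previous state over arbitrary observation features; the features, states, and tiny training set below are invented, and Viterbi decoding over the resulting conditional distributions is omitted.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# (previous_state, observation_features, next_state) training triples
data = [
    ("OTHER",  {"word": "From:", "has_colon": True},            "HEADER"),
    ("OTHER",  {"word": "Hello", "capitalised": True},          "BODY"),
    ("HEADER", {"word": "alice@example.com", "has_at": True},   "HEADER"),
    ("HEADER", {"word": "Hi", "capitalised": True},             "BODY"),
    ("BODY",   {"word": "there", "capitalised": False},         "BODY"),
    ("BODY",   {"word": "Regards,", "has_comma": True},         "SIG"),
]

# One maximum-entropy classifier per previous state: P(next | previous, observation)
models = {}
for prev in {prev for prev, _, _ in data}:
    rows = [(feats, nxt) for p, feats, nxt in data if p == prev]
    vec = DictVectorizer()
    X = vec.fit_transform([f for f, _ in rows])
    y = [nxt for _, nxt in rows]
    models[prev] = (vec, LogisticRegression(max_iter=1000).fit(X, y))

vec, clf = models["HEADER"]
probs = clf.predict_proba(vec.transform({"word": "Hi", "capitalised": True}))[0]
print(dict(zip(clf.classes_, probs)))
```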
Three Generative, Lexicalised Models for Statistical Parsing
1997
"... In this paper we first propose a new statistical parsing model, which is a generative model of lexicalised contextfree gram mar. We then extend the model to in clude a probabilistic treatment of both subcategorisation and wh~movement. Results on Wall Street Journal text show that the parse ..."
Cited by 570 (8 self)
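The flavour of the head-driven, generative decomposition can be shown with toy probability tables: the expansion of a lexicalised parent is scored by choosing the head child and then generating each dependent conditioned on the parent, head child, and head word. Distance features, subcategorisation frames, and the wh-movement treatment are not shown.

```python
P_head = {("S", "bought"): {"VP": 0.9, "NP": 0.1}}        # P(head child | parent, head word)
P_dep = {                                                 # P(dependent | parent, head child, head word, side)
    ("S", "VP", "bought", "left"):  {("NP", "IBM"): 0.4, "STOP": 0.5},
    ("S", "VP", "bought", "right"): {"STOP": 0.8},
}

def rule_probability(parent, head_word, head_child, left_deps, right_deps):
    # generate the head child, then each side's dependents followed by STOP
    p = P_head[(parent, head_word)][head_child]
    for side, deps in (("left", left_deps), ("right", right_deps)):
        for dep in list(deps) + ["STOP"]:
            p *= P_dep[(parent, head_child, head_word, side)][dep]
    return p

# S(bought) -> NP(IBM) VP(bought), with no right dependents of the head
print(rule_probability("S", "bought", "VP", [("NP", "IBM")], []))
```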