Results 1 - 10 of 775,305
Contrastive estimation: Training log-linear models on unlabeled data
 In Proc. of ACL
, 2005
"... Conditional random fields (Lafferty et al., 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and named-entity extraction (McCallum and Li, 2003). CRFs are log-linear, allowing the incorporation of arbitrary features into the model. To train on unlabele ..."
Cited by 157 (16 self)
on unlabeled data, we require unsupervised estimation methods for log-linear models; few exist. We describe a novel approach, contrastive estimation. We show that the new technique can be intuitively understood as exploiting implicit negative evidence and is computationally efficient. Applied to a sequence
Text Classification from Labeled and Unlabeled Documents using EM
 MACHINE LEARNING
, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Cited by 1033 (19 self)
, and probabilistically labels the unlabeled documents. It then trains a new classifier using the labels for all the documents, and iterates to convergence. This basic EM procedure works well when the data conform to the generative assumptions of the model. However these assumptions are often violated in practice
A framework for learning predictive structures from multiple tasks and unlabeled data
 Journal of Machine Learning Research
, 2005
"... One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semi-supervised learning. Although a number of such methods ar ..."
Cited by 440 (3 self)
Parsing the WSJ using CCG and log-linear models
 In Proceedings of the 42nd Meeting of the ACL
, 2004
"... This paper describes and evaluates log-linear parsing models for Combinatory Categorial Grammar (CCG). A parallel implementation of the L-BFGS optimisation algorithm is described, which runs on a Beowulf cluster allowing the complete Penn Treebank to be used for estimation. We also develop a new eff ..."
Cited by 187 (22 self)
Estimating the Support of a High-Dimensional Distribution
, 1999
"... Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified value between 0 and 1. We propo ..."
Cited by 766 (29 self)
propose a method to approach this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length
Estimating Wealth Effects without Expenditure Data— or Tears
 Policy Research Working Paper 1980, The World
, 1998
"... Abstract: We use the National Family Health Survey (NFHS) data collected in Indian states in 1992 and 1993 to estimate the relationship between household wealth and the probability a child (aged 6 to 14) is enrolled in school. A methodological difficulty to overcome is that the NFHS, modeled closely ..."
Cited by 832 (16 self)
Minimum Error Rate Training in Statistical Machine Translation
, 2003
"... Often, the training procedure for statistical machine translation models is based on maximum likelihood or related criteria. A general problem of this approach is that there is only a loose relation to the final translation quality on unseen text. In this paper, we analyze various training cri ..."
Cited by 663 (7 self)
Log-Linear Models
, 2004
"... This is yet another introduction to log-linear (“maximum entropy”) models for NLP practitioners, in the spirit of Berger (1996) and Ratnaparkhi (1997b). The derivations here are similar to Berger’s, but more details are filled in and some errors are corrected. I do not address iterative scaling (Dar ..."
Cited by 5 (0 self)
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2000
"... Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little ..."
Cited by 557 (28 self)
for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster
Models and issues in data stream systems
 IN PODS
, 2002
"... In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work releva ..."
Cited by 770 (19 self)