Results 1-10 of 18
Generalized expectation criteria for semi-supervised learning of conditional random fields
In Proc. ACL, pages 870-878, 2008
Abstract

Cited by 64 (8 self)
This paper presents a semi-supervised training method for linear-chain conditional random fields that makes use of labeled features rather than labeled instances. This is accomplished by using generalized expectation criteria to express a preference for parameter settings in which the model’s distribution on unlabeled data matches a target distribution. We induce target conditional probability distributions of labels given features from both annotated feature occurrences in context and ad-hoc feature majority label assignment. The use of generalized expectation criteria allows for a dramatic reduction in annotation time by shifting from traditional instance-labeling to feature-labeling, and the methods presented outperform traditional CRF training and other semi-supervised methods when limited human effort is available.
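The heart of the method is easy to sketch: for a labeled feature, the annotator supplies a target label distribution, and training penalizes the divergence between that target and the model's average predicted label distribution over unlabeled instances containing the feature. Below is a minimal illustration of such a GE term with a toy logistic model standing in for a linear-chain CRF; the feature index, target distribution, and numerical-gradient optimizer are illustrative, not the paper's setup.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ge_objective(W, X_unlab, feat_idx, target):
    # KL(target || model's mean predicted label distribution on
    # unlabeled instances that contain feature feat_idx)
    probs = softmax(X_unlab @ W)
    mean = probs[X_unlab[:, feat_idx] == 1].mean(axis=0)
    return float(np.sum(target * np.log(target / mean)))

rng = np.random.default_rng(0)
X_unlab = rng.integers(0, 2, size=(200, 2)).astype(float)  # unlabeled binary features
target = np.array([0.9, 0.1])  # annotator's belief: feature 0 mostly indicates label 0

W = np.zeros((2, 2))
for _ in range(300):  # descend the GE term alone, via a central-difference gradient
    eps, g = 1e-5, np.zeros_like(W)
    for i in range(2):
        for j in range(2):
            Wp, Wm = W.copy(), W.copy()
            Wp[i, j] += eps
            Wm[i, j] -= eps
            g[i, j] = (ge_objective(Wp, X_unlab, 0, target)
                       - ge_objective(Wm, X_unlab, 0, target)) / (2 * eps)
    W -= 0.5 * g
```

After descent, the model's expected label distribution on feature-0 instances has moved from uniform toward the annotator's target; the full method adds this term to a CRF likelihood with regularization.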
Finding Deceptive Opinion Spam by Any Stretch of the Imagination
Abstract

Cited by 43 (7 self)
Consumers increasingly rate, review and research products online (Jansen, 2010; Litvin et al., 2008). Consequently, websites containing consumer reviews are becoming targets of opinion spam. While recent work has focused primarily on manually identifiable instances of opinion spam, in this work we study deceptive opinion spam: fictitious opinions that have been deliberately written to sound authentic. Integrating work from psychology and computational linguistics, we develop and compare three approaches to detecting deceptive opinion spam, and ultimately develop a classifier that is nearly 90% accurate on our gold-standard opinion spam dataset. Based on feature analysis of our learned models, we additionally make several theoretical contributions, including revealing a relationship between deceptive opinions and imaginative writing.
Attacks on privacy and de Finetti’s theorem
In SIGMOD, 2009
Abstract

Cited by 39 (6 self)
In this paper we present a method for reasoning about privacy using the concepts of exchangeability and de Finetti’s theorem. We illustrate the usefulness of this technique by using it to attack a popular data sanitization scheme known as Anatomy. We stress that Anatomy is not the only sanitization scheme that is vulnerable to this attack. In fact, any scheme that uses the random worlds model, i.i.d. model, or tuple-independent model needs to be re-evaluated. The difference between the attack presented here and others that have been proposed in the past is that we do not need extensive background knowledge. An attacker only needs to know the non-sensitive attributes of one individual in the data, and can carry out this attack just by building a machine learning model over the sanitized data. The reason this attack is successful is that it exploits a subtle flaw in the way prior work computed the probability of disclosure of a sensitive attribute. We demonstrate this theoretically, empirically, and with intuitive examples. We also discuss how this generalizes to many other privacy schemes.
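The attack itself is simple to demonstrate. In the hypothetical sketch below (synthetic data, not the paper's experiments), an Anatomy-style release publishes each two-tuple group's non-sensitive attributes together with its unlinked sensitive-value multiset; a frequency model learned from the sanitized groups alone then shifts the within-group posterior well away from the uniform 1/2 that the random-worlds model assumes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Anatomized release: each group of two tuples publishes the tuples'
# non-sensitive attribute and the group's sensitive-value multiset,
# with the tuple-to-value linkage removed.
groups = []
for _ in range(500):
    attrs = ["old" if rng.random() < 0.5 else "young" for _ in range(2)]
    # hidden ground truth, never released: old -> cancer 90%, young -> 10%
    diseases = ["cancer" if rng.random() < (0.9 if a == "old" else 0.1) else "flu"
                for a in attrs]
    groups.append((attrs, diseases))

# Attack step 1: learn P(disease | attr) from the sanitized data alone,
# crediting each tuple with its group's frequency of each sensitive value.
counts = {(a, d): 0.0 for a in ("old", "young") for d in ("cancer", "flu")}
for attrs, diseases in groups:
    for a in attrs:
        for d in diseases:
            counts[(a, d)] += 1.0 / len(diseases)
p_c_old = counts[("old", "cancer")] / (counts[("old", "cancer")] + counts[("old", "flu")])
p_c_young = counts[("young", "cancer")] / (counts[("young", "cancer")] + counts[("young", "flu")])

# Attack step 2: for a mixed group ({old, young} with {cancer, flu}), weigh
# the two consistent assignments by the learned model; the random-worlds
# answer of 1/2 no longer holds.
num = p_c_old * (1 - p_c_young)           # old -> cancer, young -> flu
den = num + (1 - p_c_old) * p_c_young     # plus the swapped assignment
p_old_has_cancer = num / den
```

Even this one-pass frequency model pushes the attacker's belief that the "old" tuple carries "cancer" far above 1/2; iterating (as a real learner would) sharpens it further.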
Learning From Measurements in Exponential Families
Abstract

Cited by 35 (0 self)
Given a model family and a set of unlabeled examples, one could either label specific examples or state general constraints; both provide information about the desired model. In general, what is the most cost-effective way to learn? To address this question, we introduce measurements, a general class of mechanisms for providing information about a target model. We present a Bayesian decision-theoretic framework, which allows us to both integrate diverse measurements and choose new measurements to make. We use a variational inference algorithm, which exploits exponential family duality. The merits of our approach are demonstrated on two sequence labeling tasks.
Hilbert Space Embeddings of Conditional Distributions with Applications to Dynamical Systems
, 2009
Abstract

Cited by 28 (11 self)
In this paper, we extend the Hilbert space embedding approach to handle conditional distributions. We derive a kernel estimate for the conditional embedding, and show its connection to ordinary embeddings. Conditional embeddings largely extend our ability to manipulate distributions in Hilbert spaces, and as an example, we derive a nonparametric method for modeling dynamical systems where the belief state of the system is maintained as a conditional embedding. Our method is very general in terms of both the domains and the types of distributions that it can handle, and we demonstrate the effectiveness of our method in various dynamical systems. We expect that conditional embeddings will have wider applications beyond modeling dynamical systems.
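The kernel estimate mentioned in the abstract has a compact form: with kernel matrix K on the training inputs, the conditional embedding at a query point x is carried by the weight vector beta = (K + n*lambda*I)^(-1) k(x), and any conditional expectation E[f(Y) | x] is then approximated by the beta-weighted sum of f(y_i). A minimal numpy sketch, with an illustrative Gaussian kernel bandwidth and regularization:

```python
import numpy as np

def rbf(A, B, gamma=10.0):
    # Gaussian kernel matrix between the rows of A and B
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
n = 300
X = rng.uniform(-1, 1, size=(n, 1))
Y = np.sin(3 * X) + 0.05 * rng.standard_normal((n, 1))

# conditional-embedding weights at the query point x = 0.4
lam = 1e-3
K = rbf(X, X)
x_query = np.array([[0.4]])
beta = np.linalg.solve(K + n * lam * np.eye(n), rbf(X, x_query))  # shape (n, 1)

# the embedding turns conditional expectations into weighted sums over samples
est_mean = float(beta[:, 0] @ Y[:, 0])  # estimate of E[Y | x = 0.4]
```

With f the identity map this reduces to kernel ridge regression, which is the "connection to ordinary embeddings" in miniature; the same beta estimates E[f(Y) | x] for any f by reweighting f(y_i).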
SVM Classifier Estimation from Group Probabilities
, 2010
Abstract

Cited by 2 (0 self)
A learning problem that has only recently gained attention in the machine learning community is that of learning a classifier from group probabilities. It is a learning task that lies somewhere between the well-known tasks of supervised and unsupervised learning, in the sense that for a set of observations we do not know the labels, but for some groups of observations, the frequency distribution of the label is known. This learning problem has important practical applications, for example in privacy-preserving data mining. This paper presents an approach to learn a classifier from group probabilities based on support vector regression and the idea of inverting a classifier calibration process. A detailed analysis will show that this new approach outperforms existing approaches.
Unsupervised Supervised Learning II: Margin-Based Classification without Labels
Abstract

Cited by 1 (0 self)
Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing margin-based risk functions. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and knowledge of p(y). We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers using exclusively unlabeled data.
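One way to picture the estimator: the unlabeled classifier scores form a mixture of two class-conditional score distributions with known mixing weight p(y), so fitting the mixture recovers the per-class score distributions and hence the risk, with no labels at all. The sketch below uses a two-Gaussian mixture fitted by EM with the mixing weight held fixed at the known prior; the paper's estimator is more general, so treat this only as an illustration of the principle.

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x, mu, sd):
    return 0.5 * (1 + erf((x - mu) / (sd * sqrt(2))))

rng = np.random.default_rng(0)
n, prior = 4000, 0.3                           # p(y = 1) is assumed known
y = (rng.random(n) < prior).astype(int)        # used only to simulate scores
scores = np.where(y == 1, rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n))

# EM for a two-Gaussian mixture with the mixing weight FIXED at the prior
mu, sd = np.array([-0.5, 0.5]), np.array([1.0, 1.0])
w = np.array([1 - prior, prior])
for _ in range(100):
    pdf = np.exp(-0.5 * ((scores[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    r = w * pdf
    r /= r.sum(axis=1, keepdims=True)          # responsibilities
    mu = (r * scores[:, None]).sum(0) / r.sum(0)
    sd = np.sqrt((r * (scores[:, None] - mu) ** 2).sum(0) / r.sum(0))

# estimated 0-1 risk of thresholding the score at 0, computed label-free
risk_est = w[0] * (1 - norm_cdf(0, mu[0], sd[0])) + w[1] * norm_cdf(0, mu[1], sd[1])
true_risk = ((scores > 0) != y).mean()         # held-out check only
```

The label-free estimate tracks the empirical error rate closely here because the mixture is well specified; the same recipe evaluates a classifier on a new unlabeled domain when p(y) is known.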
Learning from label proportions by optimizing cluster model selection
In IEEE Transactions on Neural Networks, 2005
Abstract

Cited by 1 (0 self)
In a supervised learning scenario, we learn a mapping from input to output values, based on labeled examples. Can we learn such a mapping also from groups of unlabeled observations, only knowing, for each group, the proportion of observations with a particular label? Solutions have real-world applications. Here, we consider groups of steel sticks as samples in quality control. Since the steel sticks cannot be marked individually, for each group of sticks it is only known how many sticks of high (low) quality it contains. We want to predict the achieved quality for each stick before it reaches the final production station and quality control, in order to save resources. We define the problem of learning from label proportions and present a solution based on clustering. Our method empirically shows better prediction performance than recent approaches based on probabilistic SVMs, kernel k-means, or conditional exponential models.
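The clustering route can be sketched directly: cluster the observations, then search over cluster-to-label assignments for the one whose implied per-group label proportions best match the known proportions. A minimal version with a hand-rolled k-means and exhaustive search over the 2^k labelings, on synthetic data standing in for the steel-stick groups:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# synthetic "sticks": two hidden quality classes in a 2-D feature space
n_groups, per = 8, 40
X, y, gid = [], [], []
for g in range(n_groups):
    yg = (rng.random(per) < rng.random()).astype(int)
    X.append(rng.standard_normal((per, 2)) + np.where(yg[:, None] == 1, 2.0, -2.0))
    y.append(yg)
    gid.append(np.full(per, g))
X, y, gid = np.vstack(X), np.concatenate(y), np.concatenate(gid)
known_props = np.array([y[gid == g].mean() for g in range(n_groups)])  # only this is given

# plain k-means, written out for self-containment
k = 4
C = X[rng.choice(len(X), k, replace=False)]
for _ in range(50):
    z = ((X[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
    C = np.array([X[z == j].mean(0) if (z == j).any() else C[j] for j in range(k)])

# pick the cluster->label map whose implied group proportions match best
best, best_err = None, np.inf
for lab in product([0, 1], repeat=k):
    pred = np.array(lab)[z]
    props = np.array([pred[gid == g].mean() for g in range(n_groups)])
    err = ((props - known_props) ** 2).sum()
    if err < best_err:
        best, best_err = pred, err
accuracy = (best == y).mean()  # checked against the hidden labels
```

The paper's actual method selects among cluster models rather than fixing k = 4, and the exhaustive labeling search only scales to small k; both choices here are for brevity.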
MFSPFA: An Enhanced Filter-based Feature Selection Algorithm
Abstract
Feature selection is the process of selecting the most informative feature subset from the original features. This technique is frequently used as a preprocessing step in data mining. In this study, a new feature selection algorithm, called Modified Fisher Score Principal Feature Analysis (MFSPFA), is proposed. The new algorithm combines the proposed Modified Fisher Score (MFS) with Principal Feature Analysis (PFA). The proposed algorithm is tested on publicly available datasets. The experimental results show that the proposed algorithm is able to remove irrelevant features and improves classification accuracy.
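For reference, the classical Fisher score that MFS modifies ranks each feature by its between-class scatter over its within-class scatter; the paper's modification is not reproduced here. A minimal sketch:

```python
import numpy as np

def fisher_score(X, y):
    # Classical Fisher score per feature: between-class scatter divided by
    # within-class scatter (higher = more class-discriminative).
    classes = np.unique(y)
    mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - mean) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / den

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 400)
informative = rng.standard_normal(400) + 2.0 * y   # shifts with the class
noise = rng.standard_normal(400)                   # carries no class signal
X = np.column_stack([informative, noise])
scores = fisher_score(X, y)
```

Ranking features by this score and keeping the top ones is the filter step; PFA then further reduces redundancy among the kept features.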
Surrogate Learning: From Feature Independence to Semi-Supervised Classification
Abstract
We consider the task of learning a classifier from the feature space X to the set of classes Y = {0, 1}, when the features can be partitioned into class-conditionally independent feature sets X1 and X2. We show that the class-conditional independence can be used to represent the original learning task in terms of 1) learning a classifier from X2 to X1 (in the sense of estimating the probability P(x1 | x2)) and 2) learning the class-conditional distribution of the feature set X1. This fact can be exploited for semi-supervised learning because the former task can be accomplished purely from unlabeled samples. We present an experimental evaluation of the idea in two real-world applications.
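The reduction rests on a short identity: under x1 independent of x2 given y (with binary x1), P(y=1 | x2) = (P(x1=1 | x2) - P(x1=1 | y=0)) / (P(x1=1 | y=1) - P(x1=1 | y=0)), and the surrogate quantity P(x1=1 | x2) is learnable from unlabeled data alone. The toy computation below verifies the identity exactly on a small hand-built joint distribution (all numbers illustrative):

```python
# a small joint distribution with binary y, x1, x2 and x1 independent of x2 given y
p_y1 = 0.4
p_x1_given_y = {0: 0.2, 1: 0.8}   # P(x1 = 1 | y)
p_x2_given_y = {0: 0.3, 1: 0.7}   # P(x2 = 1 | y)

def joint(y, x1, x2):
    py = p_y1 if y == 1 else 1 - p_y1
    p1 = p_x1_given_y[y] if x1 == 1 else 1 - p_x1_given_y[y]
    p2 = p_x2_given_y[y] if x2 == 1 else 1 - p_x2_given_y[y]
    return py * p1 * p2

# the "supervised" quantity we want: P(y = 1 | x2 = 1), from the full joint
num = joint(1, 0, 1) + joint(1, 1, 1)
den = num + joint(0, 0, 1) + joint(0, 1, 1)
p_y1_given_x2 = num / den

# surrogate route: P(x1 = 1 | x2 = 1) is estimable from UNLABELED data alone;
# combined with the two class-conditional rates of x1 it recovers the same value
p_x1_given_x2 = (joint(0, 1, 1) + joint(1, 1, 1)) / den
surrogate = (p_x1_given_x2 - p_x1_given_y[0]) / (p_x1_given_y[1] - p_x1_given_y[0])
```

The two routes agree exactly, which is the point: only the two class-conditional rates of x1 need labels, while the regression of x1 on x2 can be fit on unlabeled data.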