Results 1 - 7 of 7
Semi-supervised learning with measure propagation
Journal of Machine Learning Research, 2011
Cited by 11 (2 self)
We describe a new objective for graph-based semi-supervised learning based on minimizing the Kullback-Leibler divergence between discrete probability measures that encode class membership probabilities. We show how the proposed objective can be efficiently optimized using alternating minimization. We prove that the alternating minimization procedure converges to the correct optimum and derive a simple test for convergence. In addition, we show how this approach can be scaled to solve the semi-supervised learning problem on very large data sets; in one instance, we use a data set with over 10^8 samples. In this context, we propose a graph node ordering algorithm that is also applicable to other graph-based semi-supervised learning approaches. We compare the proposed approach against other standard semi-supervised learning algorithms on the semi-supervised learning benchmark data sets (Chapelle et al., 2007), and on other real-world tasks such as text classification on Reuters and WebKB, speech phone classification on TIMIT and Switchboard, and linguistic dialog-act tagging on Dihana and Switchboard. In each case, the proposed approach outperforms the state of the art. Lastly, we show that our objective can be generalized into a form that includes the standard squared-error loss, and we prove a geometric rate of convergence in that case.
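As a small illustration of the quantity this abstract builds on, the sketch below computes the Kullback-Leibler divergence between two discrete class-membership distributions. It is not the paper's objective or its alternating minimization procedure; the function name `kl_divergence` and the list-based representation are assumptions made for this example only.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) for two discrete distributions over the same classes.

    p and q are equal-length sequences of probabilities (hypothetical
    representation chosen for this sketch).
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same set of classes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            # By convention, a term with p_i = 0 contributes nothing.
            continue
        if qi == 0.0:
            # p puts mass where q has none: the divergence is infinite.
            return math.inf
        total += pi * math.log(pi / qi)
    return total


# Class-membership distributions of two neighboring vertices (made-up values).
vertex_a = [0.7, 0.2, 0.1]
vertex_b = [0.6, 0.3, 0.1]
print(kl_divergence(vertex_a, vertex_b))
```

The divergence is zero only when the two distributions agree, and it is not symmetric in its arguments, so KL(p || q) and KL(q || p) generally differ.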
Soft-Supervised Learning for Text Classification
Cited by 10 (0 self)
We propose a new graph-based semi-supervised learning (SSL) algorithm and demonstrate its application to document categorization. Each document is represented by a vertex within a weighted undirected graph and our proposed framework minimizes the weighted Kullback-Leibler divergence between distributions that encode the class membership probabilities of each vertex. The proposed objective is convex with guaranteed convergence using an alternating minimization procedure. Further, it generalizes in a straightforward manner to multi-class problems. We present results on two standard tasks, namely Reuters-21578 and WebKB, showing that the proposed algorithm significantly outperforms the state of the art.
Graph-based Learning for Statistical Machine Translation
Cited by 7 (0 self)
Current phrase-based statistical machine translation systems process each test sentence in isolation and do not enforce global consistency constraints, even though the test data is often internally consistent with respect to topic or style. We propose a new consistency model for machine translation in the form of a graph-based semi-supervised learning algorithm that exploits similarities between training and test data and also similarities between different test sentences. The algorithm learns a regression function jointly over training and test data and uses the resulting scores to rerank translation hypotheses. Evaluation on two travel expression translation tasks demonstrates improvements of up to 2.6 BLEU points absolute and 2.8% in PER.
A Graph-based Semi-Supervised Learning for Question-Answering
Cited by 7 (2 self)
We present a graph-based semi-supervised learning approach for the question-answering (QA) task of ranking candidate sentences. Using textual entailment analysis, we obtain entailment scores between a natural language question posed by the user and the candidate sentences returned from a search engine. The textual entailment between two sentences is assessed via features representing high-level attributes of the entailment problem such as sentence structure matching, question-type named-entity matching based on a question classifier, etc. We implement a semi-supervised learning (SSL) approach to demonstrate that utilization of more unlabeled data points can improve the answer-ranking task of QA. We create a graph for labeled and unlabeled data using match-scores of textual entailment features as similarity weights between data points. We apply a summarization method on the graph to make the computations feasible on large datasets. With a new representation of graph-based SSL on QA datasets using only a handful of features, and under limited amounts of labeled data, we show improvement in generalization performance over state-of-the-art QA models.
Classifier Based Graph Construction for Video Segmentation
Cited by 1 (0 self)
Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, enabling top performance on recent benchmarks, consist of three essential components: 1. powerful features account for object appearance and motion similarities; 2. spatio-temporal neighborhoods of pixels or superpixels (the graph edges) are modeled using a combination of those features; 3. video segmentation is formulated as a graph partitioning problem. While a wide variety of features have been explored and various graph partition algorithms have been proposed, there is surprisingly little research on how to construct a graph to obtain the best video segmentation performance. This is the focus of our paper. We propose to combine features by means of a classifier, use calibrated classifier outputs as edge weights and define the graph topology by edge selection. By learning the graph (without changes to the graph partitioning method), we improve the results of the best performing video segmentation algorithm by 6% on the challenging VSB100 benchmark, while reducing its runtime by 55%, as the learnt graph is much sparser.
Using the Mutual k-Nearest Neighbor Graphs for Semi-supervised Classification of Natural Language Data
Cited by 1 (0 self)
The first step in graph-based semi-supervised classification is to construct a graph from input data. While the k-nearest neighbor graphs have been the de facto standard method of graph construction, this paper advocates using the less well-known mutual k-nearest neighbor graphs for high-dimensional natural language data. To compare the performance of these two graph construction methods, we run semi-supervised classification methods on both graphs in word sense disambiguation and document classification tasks. The experimental results show that the mutual k-nearest neighbor graphs, if combined with maximum spanning trees, consistently outperform the k-nearest neighbor graphs. We attribute the better performance of the mutual k-nearest neighbor graph to its being more resistant to producing hub vertices. The mutual k-nearest neighbor graphs also perform equally well or even better in comparison to the state-of-the-art b-matching graph construction, despite their lower computational complexity.
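The two graph constructions compared in this abstract can be sketched as follows. This is a minimal illustration under common textbook definitions, not the paper's implementation: it assumes Euclidean distance, a brute-force neighbor search, and that the ordinary k-nearest neighbor graph links two points when either is among the other's k nearest neighbors, while the mutual variant requires both. The maximum spanning tree step mentioned in the abstract is omitted, and all function names are hypothetical.

```python
import math


def knn_sets(points, k):
    """For each point index, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Brute-force search: sort all other points by Euclidean distance.
        ranked = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append({j for _, j in ranked[:k]})
    return neighbors


def knn_graph_edges(points, k):
    """Undirected edge (i, j) if i is among j's neighbors OR j is among i's."""
    nn = knn_sets(points, k)
    return {(min(i, j), max(i, j)) for i in range(len(points)) for j in nn[i]}


def mutual_knn_graph_edges(points, k):
    """Undirected edge (i, j) only if i and j are among each other's neighbors."""
    nn = knn_sets(points, k)
    return {
        (i, j)
        for i in range(len(points))
        for j in nn[i]
        if i < j and i in nn[j]
    }


# Made-up one-dimensional data embedded in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0)]
print(sorted(knn_graph_edges(points, 1)))
print(sorted(mutual_knn_graph_edges(points, 1)))
```

The mutual graph is always a subgraph of the ordinary one, so it can leave some vertices isolated; that is one reason a connectivity step such as a spanning tree may be combined with it.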
A Graph Regularization Based Approach to Transductive Class-Membership Prediction
Considering the increasing availability of structured machine-processable knowledge in the context of the Semantic Web, relying only on purely deductive inference may be limiting. This work proposes a new method for similarity-based class-membership prediction in Description Logic knowledge bases. The underlying idea is based on the concept of propagating class-membership information among similar individuals; it is non-parametric in nature and characterised by interesting complexity properties, making it a potential candidate for large-scale transductive inference. We also evaluate its effectiveness with respect to other approaches based on inductive inference in the SW literature.