Results 11–20 of 554
Maximum margin semi-supervised learning for structured variables
 Advances in Neural Information Processing Systems 18
, 2005
Abstract

Cited by 65 (0 self)
Abstract Many real-world classification problems involve the prediction of multiple interdependent variables forming some structural dependency. Recent progress in machine learning has mainly focused on supervised classification of such structured variables. In this paper, we investigate structured classification in a semi-supervised setting. We present a discriminative approach that utilizes the intrinsic geometry of input patterns revealed by unlabeled data points, and we derive a maximum-margin formulation of semi-supervised learning for structured variables. Unlike transductive algorithms, our formulation naturally extends to new test points.
Domain Adaptation from Multiple Sources via Auxiliary Classifiers
Abstract

Cited by 59 (11 self)
We propose a multiple source domain adaptation method, referred to as Domain Adaptation Machine (DAM), to learn a robust decision function (referred to as the target classifier) for label prediction of patterns from the target domain by leveraging a set of precomputed classifiers (referred to as auxiliary/source classifiers) independently learned with the labeled patterns from multiple source domains. We introduce a new data-dependent regularizer based on a smoothness assumption into Least-Squares SVM (LS-SVM), which enforces that the target classifier shares similar decision values with the auxiliary classifiers from relevant source domains on the unlabeled patterns of the target domain. In addition, we employ a sparsity regularizer to learn a sparse target classifier. Comprehensive experiments on the challenging TRECVID 2005 corpus demonstrate that DAM outperforms existing multiple source domain adaptation methods for video concept detection in terms of both effectiveness and efficiency.
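The abstract only names DAM's data-dependent smoothness regularizer; a minimal sketch of one plausible instantiation (the weighted squared-difference form is an assumption here, not necessarily the paper's exact objective) is:

```python
import numpy as np

def dam_regularizer(f_target, f_sources, gammas):
    """Assumed form of a smoothness regularizer over source classifiers:
    sum over source domains s of gamma_s * ||f_T(x_u) - f_s(x_u)||^2,
    evaluated on the unlabeled target-domain patterns x_u."""
    return sum(g * np.sum((f_target - fs) ** 2)
               for g, fs in zip(gammas, f_sources))

# decision values of two precomputed source classifiers on 3 unlabeled target points
f1 = np.array([1.0, -1.0, 0.5])
f2 = np.array([0.8, -0.9, 0.1])
f_t = np.array([0.9, -0.95, 0.3])   # candidate target classifier's decision values
print(round(dam_regularizer(f_t, [f1, f2], [0.6, 0.4]), 4))  # → 0.0525
```

Minimizing a term like this (plus the LS-SVM loss) pulls the target classifier toward sources with large relevance weights gamma_s.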
Data fusion and multi-cue data matching by diffusion maps
 IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract

Cited by 56 (5 self)
Abstract—Data fusion and multi-cue data matching are fundamental tasks of high-dimensional data analysis. In this paper, we apply the recently introduced diffusion framework to address these tasks. Our contribution is threefold: First, we present the Laplace-Beltrami approach for computing density-invariant embeddings, which are essential for integrating different sources of data. Second, we describe a refinement of the Nyström extension algorithm called "geometric harmonics." We also explain how to use this tool for data assimilation. Finally, we introduce a multi-cue data matching scheme based on nonlinear spectral graph alignment. The effectiveness of the presented schemes is validated by applying them to the problems of lipreading and image sequence alignment. Index Terms—Pattern matching, graph theory, graph algorithms, Markov processes, machine learning, data mining, image databases.
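The Laplace-Beltrami step the abstract mentions corresponds to the standard alpha = 1 kernel normalization in the diffusion-maps framework; a compact sketch (parameter choices here are illustrative, not the paper's):

```python
import numpy as np

def diffusion_map(X, eps=1.0, n_components=2, t=1):
    """Density-invariant diffusion map (alpha = 1, Laplace-Beltrami normalization)."""
    # Gaussian affinity kernel
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / eps)
    # alpha = 1 normalization removes the influence of the sampling density
    q = W.sum(axis=1)
    W_tilde = W / np.outer(q, q)
    # row-normalize to a Markov transition matrix
    d = W_tilde.sum(axis=1)
    P = W_tilde / d[:, None]
    # symmetric conjugate of P for a stable eigendecomposition
    S = P * np.sqrt(d)[:, None] / np.sqrt(d)[None, :]
    vals, vecs = np.linalg.eigh(S)
    idx = np.argsort(vals)[::-1]
    vals, vecs = vals[idx], vecs[:, idx]
    # recover right eigenvectors of P; drop the trivial constant one
    psi = vecs / np.sqrt(d)[:, None]
    return psi[:, 1:n_components + 1] * (vals[1:n_components + 1] ** t)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
emb = diffusion_map(X, eps=2.0, n_components=2)
print(emb.shape)  # (40, 2)
```

Dividing the kernel by the density estimate q on both sides is what makes the embedding depend on the underlying geometry rather than on how densely each region was sampled.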
Least Squares Linear Discriminant Analysis
Abstract

Cited by 51 (6 self)
Linear Discriminant Analysis (LDA) is a well-known method for dimensionality reduction and classification. LDA in the binary-class case has been shown to be equivalent to linear regression with the class label as the output. This implies that LDA for binary-class classification can be formulated as a least squares problem. Previous studies have shown a certain relationship between multivariate linear regression and LDA for the multi-class case. Many of these studies show that multivariate linear regression with a specific class indicator matrix as the output can be applied as a preprocessing step for LDA. However, directly casting LDA as a least squares problem is challenging for the multi-class case. In this paper, a novel formulation for multivariate linear regression is proposed. The equivalence relationship between the proposed least squares formulation and LDA for multi-class classification is rigorously established under a mild condition, which is shown empirically to hold in many applications involving high-dimensional data. Several LDA extensions based on the equivalence relationship are discussed.
Large Graph Construction for Scalable Semi-Supervised Learning
Abstract

Cited by 51 (15 self)
In this paper, we address the scalability issue plaguing graph-based semi-supervised learning via a small number of anchor points which adequately cover the entire point cloud. Critically, these anchor points enable nonparametric regression that predicts the label of each data point as a locally weighted average of the labels on anchor points. Because conventional graph construction is inefficient at large scale, we propose to construct a tractable large graph by coupling anchor-based label prediction and adjacency matrix design. Contrary to the Nyström approximation of adjacency matrices, which results in indefinite graph Laplacians and in turn leads to potentially non-convex optimization over graphs, the proposed graph construction approach, based on a unique idea called AnchorGraph, provides nonnegative adjacency matrices that guarantee positive semidefinite graph Laplacians. Our approach scales linearly with the data size and in practice usually produces a large sparse graph. Experiments on large datasets demonstrate the significant accuracy improvement and scalability of the proposed approach.
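The nonnegative-adjacency-implies-PSD-Laplacian claim can be sketched concretely. The construction below (random anchors, Gaussian local weights, W = Z diag(1/lam) Z^T) is an illustrative assumption in the spirit of the abstract, not the paper's exact algorithm:

```python
import numpy as np

def anchor_graph(X, anchors, s=2):
    """AnchorGraph-style adjacency sketch: W = Z diag(1/lam) Z^T (assumed form)."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)  # squared dists to anchors
    Z = np.zeros_like(d2)
    for i, row in enumerate(d2):
        nn = np.argsort(row)[:s]          # keep only the s closest anchors
        w = np.exp(-row[nn])              # kernel-defined local weights
        Z[i, nn] = w / w.sum()            # each row of Z sums to one
    lam = Z.sum(axis=0)                   # lam = Z^T 1
    W = Z @ np.diag(1.0 / lam) @ Z.T      # nonnegative by construction
    return Z, W

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
anchors = X[rng.choice(50, 5, replace=False)]  # stand-in for k-means anchor centers
Z, W = anchor_graph(X, anchors)
L = np.diag(W.sum(axis=1)) - W                 # graph Laplacian
eigs = np.linalg.eigvalsh(L)
print(eigs.min() >= -1e-8)                     # → True: the Laplacian is PSD, as claimed
```

Because W factors as M M^T with M = Z diag(1/sqrt(lam)), it is automatically positive semidefinite and nonnegative, which is exactly what rules out the indefinite Laplacians the abstract attributes to the Nyström route.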
Exploiting Social Context for Review Quality Prediction
Abstract

Cited by 49 (4 self)
Online reviews, in which users publish detailed commentary about their experiences and opinions with products, services, or events, are extremely valuable to users who rely on them to make informed decisions. However, reviews vary greatly in quality and are constantly increasing in number; therefore, automatic assessment of review helpfulness is of growing importance. Previous work has addressed the problem by treating a review as a stand-alone document, extracting features from the review text, and learning a function based on these features for predicting the review quality. In this work, we exploit contextual information about authors' identities and social networks to improve review quality prediction. We propose a generic framework for incorporating social context information by adding regularization constraints to the text-based predictor. Our approach can effectively use the social context information available for a large amount of unlabeled reviews. It also has the advantage that the resulting predictor is usable even when social context is unavailable. We validate our framework on a real commerce portal and experimentally demonstrate that using social context information can help improve the accuracy of review quality prediction, especially when the available training data is sparse.
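One hypothetical instantiation of "regularization constraints added to the text-based predictor" is a graph-smoothness penalty over socially linked reviews; everything below (the quadratic form, the variable names) is an illustrative assumption, not the paper's objective:

```python
import numpy as np

def social_regularized_loss(w, X, y, A, lam=1.0):
    """Hypothetical objective: squared text-based loss plus a social-context
    regularizer pulling predicted qualities of socially linked reviews together."""
    q = X @ w                               # text-feature-based quality predictions
    fit = np.sum((q - y) ** 2)              # supervised term on labeled reviews
    i, j = np.nonzero(np.triu(A))           # review pairs linked in the social graph
    social = np.sum((q[i] - q[j]) ** 2)     # smoothness over the social graph
    return fit + lam * social

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))                  # text features of six reviews
y = rng.normal(size=6)                       # observed quality labels
A = np.zeros((6, 6))
A[0, 1] = A[1, 0] = A[2, 3] = A[3, 2] = 1.0  # social links between reviews
w = rng.normal(size=3)
print(social_regularized_loss(w, X, y, A) >= social_regularized_loss(w, X, y, A, lam=0.0))
```

Since the social term only references the predictions q, it can be evaluated on unlabeled reviews too, which matches the abstract's point about exploiting large amounts of unlabeled data.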
Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning
 In Proc. Int. Conf. Machine Learning
, 2005
Abstract

Cited by 48 (2 self)
Graph-based methods for semi-supervised learning have recently been shown to be promising for combining labeled and unlabeled data in classification problems. However, inference for graph-based methods often does not scale well to very large data sets, since it requires inversion of a large matrix or solution of a large linear program. Moreover, such approaches are inherently transductive, giving predictions for only those points in the unlabeled set, and not for an arbitrary test point. In this paper a new approach is presented that preserves the strengths of graph-based semi-supervised learning while overcoming the limitations of scalability and non-inductive inference, through a combination of generative mixture models and discriminative regularization using the graph Laplacian. Experimental results show that this approach preserves the accuracy of purely graph-based transductive methods when the data has "manifold structure," and at the same time achieves inductive learning with significantly reduced computational cost.
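The purely graph-based transductive baseline that this work builds on is the standard harmonic function: labeled values are clamped and unlabeled values solve a Laplacian linear system. A minimal sketch (the chain-graph example is illustrative):

```python
import numpy as np

def harmonic_solution(W, labeled_idx, y_labeled):
    """Harmonic function on a weighted graph: labeled values are clamped and
    unlabeled values solve the linear system L_uu f_u = -L_ul f_l."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    u = np.setdiff1d(np.arange(n), labeled_idx)
    f = np.zeros(n)
    f[labeled_idx] = y_labeled
    f[u] = np.linalg.solve(L[np.ix_(u, u)], -L[np.ix_(u, labeled_idx)] @ y_labeled)
    return f

# chain graph 0-1-2-3-4 with the two endpoints labeled 0 and 1
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
f = harmonic_solution(W, np.array([0, 4]), np.array([0.0, 1.0]))
print(np.round(f, 2))  # linear interpolation between the clamped endpoints
```

The linear solve over all unlabeled points is precisely the scalability bottleneck the abstract describes, and it yields no prediction for a point outside the graph, which motivates the harmonic-mixture combination.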
Manifold Discriminant Analysis
, 2009
Abstract

Cited by 48 (12 self)
This paper presents a novel discriminative learning method, called Manifold Discriminant Analysis (MDA), to solve the problem of image set classification. By modeling each image set as a manifold, we formulate the problem as classification-oriented multi-manifold learning. Aiming at maximizing the "manifold margin," MDA seeks to learn an embedding space where manifolds with different class labels are better separated and local data compactness within each manifold is enhanced. As a result, a new testing manifold can be more reliably classified in the learned embedding space. The proposed method is evaluated on the tasks of object recognition with image sets, including face recognition and object categorization. Comprehensive comparisons and extensive experiments demonstrate the effectiveness of our method.
Nonnegative matrix factorization on manifold
 In ICDM
, 2008
Abstract

Cited by 44 (7 self)
Recently, Nonnegative Matrix Factorization (NMF) has received a lot of attention in information retrieval, computer vision, and pattern recognition. NMF aims to find two nonnegative matrices whose product can well approximate the original matrix. The sizes of these two matrices are usually smaller than that of the original matrix. This results in a compressed version of the original data matrix. The solution of NMF yields a natural parts-based representation for the data. When NMF is applied for data representation, a major disadvantage is that it fails to consider the geometric structure in the data. In this paper, we develop a graph-based approach for parts-based data representation in order to overcome this limitation. We construct an affinity graph to encode the geometrical information and seek a matrix factorization which respects the graph structure. We demonstrate the success of this novel algorithm by applying it to real-world problems.
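A factorization that "respects the graph structure" is commonly obtained by adding a Laplacian penalty Tr(V^T L V) to the NMF objective and using multiplicative updates; the sketch below assumes that standard formulation and is illustrative rather than the paper's exact algorithm:

```python
import numpy as np

def gnmf(X, W, k=2, lam=0.1, iters=300, seed=0):
    """Graph-regularized NMF sketch: X (features x samples) ~ U @ V.T, minimizing
    ||X - U V^T||_F^2 + lam * Tr(V^T L V) with L = D - W, via the usual
    multiplicative updates for this kind of objective."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    D = np.diag(W.sum(axis=1))
    eps = 1e-9                                # guards against division by zero
    for _ in range(iters):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

# toy data: two clusters of columns; the affinity graph links samples within a cluster
rng = np.random.default_rng(1)
X = np.abs(np.hstack([rng.normal(5, 0.5, (4, 5)), rng.normal(1, 0.5, (4, 5))]))
W = np.zeros((10, 10))
W[:5, :5] = W[5:, 5:] = 1.0
np.fill_diagonal(W, 0.0)
U, V = gnmf(X, W, k=2, lam=0.5)
print(U.min() >= 0 and V.min() >= 0)  # → True: multiplicative updates keep factors nonnegative
```

The graph term enters the V update as a ratio of the attracting part (W @ V) to the degree part (D @ V), so nonnegativity is preserved while rows of V belonging to linked samples are pulled together.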
Xu-Dong Zhang,
Abstract

Cited by 44 (4 self)
This paper studies the global ranking problem using learning to rank methods. Conventional learning to rank methods are usually designed for 'local ranking', in the sense that the ranking model is defined on a single object, for example, a document in information retrieval. For many applications, this is a very loose approximation. Relations always exist between objects, and it is better to define the ranking model as a function on all the objects to be ranked (i.e., so that the relations are also included). This paper refers to this problem as global ranking and proposes employing Continuous Conditional Random Fields (CRFs) for conducting the learning task. The Continuous CRF model is defined as a conditional probability distribution over the ranking scores of objects, conditioned on the objects. It can naturally represent the content information of objects as well as the relation information between objects, both necessary for global ranking. Taking two specific information retrieval tasks as examples, the paper shows how the Continuous CRF method can perform global ranking better than baselines.
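The snippet describes the Continuous CRF only abstractly; in a common Gaussian-form instantiation (an assumption here, not necessarily the paper's exact feature functions), MAP inference over the ranking scores reduces to a single linear solve:

```python
import numpy as np

def continuous_crf_map(scores, R, alpha=1.0, beta=1.0):
    """MAP inference for a Gaussian-form Continuous CRF over ranking scores:
    p(y|x) ~ exp(-alpha * ||y - s||^2 - beta * y^T L y), with s the per-object
    content scores and L the Laplacian of the relation graph R. The quadratic
    form makes the MAP a linear solve: (alpha*I + beta*L) y = alpha*s."""
    L = np.diag(R.sum(axis=1)) - R
    n = len(scores)
    return np.linalg.solve(alpha * np.eye(n) + beta * L, alpha * scores)

# local content scores for four documents, plus one "similar content" relation (0, 1)
s = np.array([2.0, 0.0, 1.0, -1.0])
R = np.zeros((4, 4))
R[0, 1] = R[1, 0] = 1.0
y = continuous_crf_map(s, R)
print(np.round(y, 3))  # the related pair is pulled toward a common score
```

Objects without relations keep their content scores unchanged, while related objects trade off their local evidence against the relation term, which is the "global" behavior the abstract argues for.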