Results 1–10 of 50
Relational Learning via Collective Matrix Factorization
2008
Cited by 127 (4 self)
Abstract
Relational learning is concerned with predicting unknown values of a relation, given a database of entities and observed relations among entities. An example of relational learning is movie rating prediction, where entities could include users, movies, genres, and actors. Relations would then encode users' ratings of movies, movies' genres, and actors' roles in movies. A common prediction technique given one pairwise relation, for example a #users × #movies ratings matrix, is low-rank matrix factorization. In domains with multiple relations, represented as multiple matrices, we may improve predictive accuracy by exploiting information from one relation while predicting another. To this end, we propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations. Each relation can have a different value type and error distribution; so, we allow nonlinear relationships between the parameters and outputs, using Bregman divergences to measure error. We extend standard alternating projection algorithms to our model, and derive an efficient Newton update for the projection. Furthermore, we propose stochastic optimization methods to deal with large, sparse matrices. Our model generalizes several existing matrix factorization methods, and therefore yields new large-scale optimization algorithms for these problems. Our model can handle any pairwise relational schema and a …
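The shared-parameter idea above can be sketched in a few lines: two toy relations (a users × movies matrix X and a users × genres matrix Y; all names, dimensions, and data here are invented for illustration) are factored jointly by alternating least squares, sharing the user factor U across both factorizations. This sketch uses squared error only; the paper's framework generalizes to other Bregman divergences and Newton projections.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy relations sharing the "user" entity: a users x movies
# matrix X and a users x genres matrix Y, both exactly rank k.
n_users, n_movies, n_genres, k = 20, 15, 5, 3
U_true = rng.normal(size=(n_users, k))
X = U_true @ rng.normal(size=(k, n_movies))
Y = U_true @ rng.normal(size=(k, n_genres))

# Collective factorization: X ~ U V^T and Y ~ U Z^T with a shared U,
# fit by alternating least squares on a squared-error objective.
U = rng.normal(size=(n_users, k))
V = rng.normal(size=(n_movies, k))
Z = rng.normal(size=(n_genres, k))
for _ in range(50):
    # U sees both relations: stack [V; Z] and solve for all user rows.
    A = np.vstack([V, Z])                  # (n_movies + n_genres, k)
    B = np.hstack([X, Y]).T                # one column per user
    U = np.linalg.lstsq(A, B, rcond=None)[0].T
    V = np.linalg.lstsq(U, X, rcond=None)[0].T
    Z = np.linalg.lstsq(U, Y, rcond=None)[0].T

# Joint reconstruction error over both relations.
err = np.linalg.norm(X - U @ V.T) + np.linalg.norm(Y - U @ Z.T)
```

Because the toy data is exactly rank k, the alternating updates recover both relations to numerical precision; with noisy or partially observed matrices one would minimize the loss over observed entries only.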
Mining multi-label data
In Data Mining and Knowledge Discovery Handbook, 2010
Cited by 88 (9 self)
Abstract
A large body of research in supervised learning deals with the analysis of single-label data, where training examples are associated with a single label λ from a set of disjoint labels L. However, training examples in several application domains are often associated with a set of labels Y ⊆ L. Such data are called multi-label.
Combining Content and Link for Classification using Matrix Factorization
2007
Cited by 67 (8 self)
Abstract
The World Wide Web contains rich textual content interconnected via complex hyperlinks. This huge database violates the assumption, held by most conventional statistical methods, that each web page is an independent and identically distributed sample. It is thus difficult to apply traditional mining or learning methods to web mining problems, e.g., web page classification, while exploiting both the content and the link structure. Research in this direction has recently received considerable attention but is still at an early stage. Of the few methods that exploit both the link structure and the content information, some combine only the authority information with the content, while others first decompose the link structure into hub and authority features and then apply them as additional document features. Aiming for practical simplicity, this paper designs an algorithm that exploits both the content and linkage information by carrying out a joint factorization of the linkage adjacency matrix and the document-term matrix, deriving a new representation for web pages in a low-dimensional factor space without explicitly separating content, hub, or authority factors. Further analysis can be performed on this compact representation of web pages. In the experiments, the proposed method is compared with state-of-the-art methods and demonstrates excellent accuracy in hypertext classification on the WebKB and Cora benchmarks.
Correlated label propagation with application to multi-label learning
In CVPR '06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006
Cited by 59 (0 self)
Abstract
Many computer vision applications, such as scene analysis and medical image interpretation, are ill-suited for traditional classification where each image can only be associated with a single class. This has stimulated recent work in multi-label learning where a given image can be tagged with multiple class labels. A serious problem with existing approaches is that they are unable to exploit correlations between class labels. This paper presents a novel framework for multi-label learning termed Correlated Label Propagation (CLP) that explicitly models interactions between labels in an efficient manner. As in standard label propagation, labels attached to training data points are propagated to test data points; however, unlike standard algorithms that treat each label independently, CLP simultaneously co-propagates multiple labels. Existing work eschews such an approach since naive algorithms for label co-propagation are intractable. We present an algorithm based on properties of submodular functions that efficiently finds an optimal solution. Our experiments demonstrate that CLP leads to significant gains in precision/recall against standard techniques on two real-world computer vision tasks involving several hundred labels.
A Unified View of Matrix Factorization Models
"... Abstract. We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, EPCA, MMMF, pLSI, pLSIpHITS, Bregman coclustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as m ..."
Abstract

Cited by 57 (0 self)
 Add to MetaCart
(Show Context)
We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, E-PCA, MMMF, pLSI, pLSI-pHITS, Bregman co-clustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as minimizing a generalized Bregman divergence, and we show that (i) a straightforward alternating projection algorithm can be applied to almost any model in our unified view; (ii) the Hessian for each projection has special structure that makes a Newton projection feasible, even when there are equality constraints on the factors, which allows for matrix co-clustering; and (iii) alternating projections can be generalized to simultaneously factor a set of matrices that share dimensions. These observations immediately yield new optimization algorithms for the above factorization methods, and suggest novel generalizations of these methods such as incorporating row and column biases, and adding or relaxing clustering constraints.
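As one concrete instance of the "matching loss under a Bregman divergence" modeling choice, here is a hypothetical sketch (not code from the paper; all data and parameters are invented) that factors a binary matrix under a logistic link by gradient descent on the Bernoulli log-loss:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, k = 25, 15, 3

# Toy binary data generated through a logistic link (Bernoulli model).
U_true = rng.normal(size=(n, k))
V_true = rng.normal(size=(m, k))
P = 1.0 / (1.0 + np.exp(-(U_true @ V_true.T)))
X = (rng.random((n, m)) < P).astype(float)

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def mean_logloss(U, V):
    S = sigmoid(U @ V.T)
    return -np.mean(X * np.log(S + 1e-12) + (1 - X) * np.log(1 - S + 1e-12))

# Gradient descent on the total Bernoulli log-loss; for a matching
# loss, (sigmoid(U V^T) - X) is the gradient with respect to the logits.
U = 0.1 * rng.normal(size=(n, k))
V = 0.1 * rng.normal(size=(m, k))
lr = 0.01
initial = mean_logloss(U, V)
for _ in range(300):
    G = sigmoid(U @ V.T) - X
    U, V = U - lr * (G @ V), V - lr * (G.T @ U)
final = mean_logloss(U, V)
```

Swapping the link and loss (identity link with squared error, exponential link with Poisson loss, and so on) recovers other members of the family the abstract enumerates.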
Semi-supervised Multi-label Learning by Constrained Non-negative Matrix Factorization
2006
Cited by 55 (1 self)
Abstract
We present a novel framework for multi-label learning that explicitly addresses the challenge arising from a large number of classes and a small amount of training data. The key assumption behind this work is that two examples tend to have large overlap in their assigned class memberships if they share high similarity in their input patterns. We capitalize on this assumption by first computing two sets of similarities, one based on the input patterns of examples, and the other based on the class memberships of the examples. We then search for the optimal assignment of class memberships to the unlabeled data that minimizes the difference between these two sets of similarities. The optimization problem is formulated as a constrained Non-negative Matrix Factorization (NMF) problem, and an algorithm is presented to efficiently find the solution. Compared to existing approaches to multi-label learning, the proposed approach is advantageous in that it is able to explore both the unlabeled data and the correlation among different classes simultaneously. Experiments with text categorization show that our approach performs significantly better than several state-of-the-art classification techniques when the number of classes is large and the size of the training data is small.
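The constrained NMF at the heart of this method builds on standard multiplicative NMF updates. A minimal, self-contained sketch of the unconstrained building block (Lee-Seung multiplicative updates on invented toy data, not the authors' full constrained algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy non-negative data matrix A that is exactly rank k.
n, m, k = 30, 20, 4
A = rng.random((n, k)) @ rng.random((k, m))

# Lee-Seung multiplicative updates for min ||A - W H||_F^2
# subject to W >= 0 and H >= 0; eps guards against division by zero.
W = rng.random((n, k))
H = rng.random((k, m))
eps = 1e-12
for _ in range(500):
    H *= (W.T @ A) / (W.T @ W @ H + eps)
    W *= (A @ H.T) / (W @ H @ H.T + eps)

rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
```

The paper's variant adds constraints so that the factorization matches the two similarity sets described above; the non-negativity and the multiplicative update structure carry over.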
Extracting Shared Subspace for Multi-label Classification
Cited by 54 (2 self)
Abstract
Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is non-convex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multi-topic web page categorization tasks, and the results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.
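A generalized eigenvalue problem A v = λ B v of the kind this abstract reduces to can be solved numerically by whitening with a Cholesky factor of B. A sketch on stand-in matrices (A and B here are random placeholders, not the paper's data-dependent matrices):

```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 10, 3

# Stand-ins: a symmetric A and a symmetric positive-definite B.
M = rng.normal(size=(d, d))
A = (M + M.T) / 2
N = rng.normal(size=(d, d))
B = N @ N.T + d * np.eye(d)

# Reduce A v = lambda B v to an ordinary symmetric eigenproblem via
# B = L L^T: solve (L^-1 A L^-T) u = lambda u, then map back v = L^-T u.
L = np.linalg.cholesky(B)
Linv = np.linalg.inv(L)
vals, Umat = np.linalg.eigh(Linv @ A @ Linv.T)  # ascending eigenvalues
V = Linv.T @ Umat
W = V[:, -k:]  # basis for the k directions with the largest eigenvalues

# Check the defining identity for the top eigenpair.
residual = np.linalg.norm(A @ V[:, -1] - vals[-1] * (B @ V[:, -1]))
```

For the high-dimensional case the abstract mentions, one would avoid forming and inverting B explicitly; this direct route is only practical for small d.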
Semi-supervised Multi-label Learning by Solving a Sylvester Equation
Cited by 45 (0 self)
Abstract
Multi-label learning refers to problems where an instance can be assigned to more than one category. In this paper, we present a novel semi-supervised algorithm for multi-label learning that works by solving a Sylvester equation (SMSE). Two graphs are first constructed, at the instance level and the category level respectively. At the instance level, a graph is defined over both labeled and unlabeled instances, where each node represents one instance and each edge weight reflects the similarity between the corresponding pair of instances. Similarly, at the category level, a graph is also built based on …
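A Sylvester equation A X + X B = C, the algebraic core the title refers to, can be solved directly for small problems by vectorization. A toy sketch with invented matrices (dedicated solvers such as the Bartels-Stewart algorithm scale better):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 3
A = rng.normal(size=(n, n))
B = rng.normal(size=(m, m))
C = rng.normal(size=(n, m))

# Solve A X + X B = C via vectorization: with column-major vec,
# vec(A X) = (I_m kron A) vec(X) and vec(X B) = (B^T kron I_n) vec(X),
# so (I_m kron A + B^T kron I_n) vec(X) = vec(C).
K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
x = np.linalg.solve(K, C.flatten(order="F"))
X = x.reshape((n, m), order="F")

residual = np.linalg.norm(A @ X + X @ B - C)
```

The Kronecker system is n·m × n·m, so this approach is only viable at toy scale; it exists here to make the structure of the equation concrete.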
Multi-instance multi-label learning
Artificial Intelligence
Cited by 38 (16 self)
Abstract
In this paper, we propose the MIML (Multi-Instance Multi-Label learning) framework, where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicated objects which have multiple semantic meanings. To learn from MIML examples, we propose the MimlBoost and MimlSvm algorithms based on a simple degeneration strategy, and experiments show that solving problems involving complicated objects with multiple semantic meanings in the MIML framework can lead to good performance. Considering that the degeneration process may lose information, we propose the D-MimlSvm algorithm, which tackles MIML problems directly in a regularization framework. Moreover, we show that even when we do not have access to the real objects, and thus cannot capture more information from real objects by using the MIML representation, MIML is still useful. We propose the InsDif and SubCod algorithms. InsDif works by transforming single instances into the MIML representation for learning, while SubCod works by transforming single-label examples into the MIML representation for learning. Experiments show that in some tasks they are able to achieve better performance than learning from the single instances or single-label examples directly.
Multi-label dimensionality reduction via dependence maximization
In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2008
Cited by 35 (6 self)
Abstract
Multi-label learning deals with data associated with multiple labels simultaneously. Like other machine learning and data mining tasks, multi-label learning also suffers from the curse of dimensionality. Although dimensionality reduction has been studied for many years, multi-label dimensionality reduction remains almost untouched. In this paper, we propose a multi-label dimensionality reduction method, MDDM, which attempts to project the original data into a lower-dimensional feature space that maximizes the dependence between the original feature description and the associated class labels. Based on the Hilbert-Schmidt Independence Criterion, we derive a closed-form solution which enables the dimensionality reduction process to be efficient. Experiments validate the performance of MDDM.
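With linear kernels, an HSIC-maximizing projection of this kind has a simple closed form: take the top eigenvectors of X^T H L H X, where H is the centering matrix and L = Y Y^T is a label kernel. A toy sketch (data and dimensions invented here, and simplified relative to the paper's full formulation):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, q, k = 50, 8, 3, 2          # samples, features, labels, target dim
X = rng.normal(size=(n, d))        # feature matrix, one row per sample
Y = (rng.random((n, q)) > 0.5).astype(float)  # multi-label indicator

# Empirical HSIC with linear kernels is proportional to
# trace(P^T X^T H L H X P), with H the centering matrix and L = Y Y^T.
# Maximizing over orthonormal P selects the top eigenvectors of M.
H = np.eye(n) - np.ones((n, n)) / n
L = Y @ Y.T
M = X.T @ H @ L @ H @ X            # symmetric positive semi-definite

vals, vecs = np.linalg.eigh(M)     # eigenvalues in ascending order
P = vecs[:, -k:]                   # the k most label-dependent directions

X_low = X @ P                      # reduced representation
```

Since M is only d × d, the eigendecomposition is cheap, which matches the abstract's claim of an efficient closed-form solution.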