
A review on multi-label learning algorithms. (2014)

by M Zhang, Z Zhou
Results 1 - 10 of 41

Active learning by querying informative and representative examples

by Sheng-jun Huang, Rong Jin, Zhi-hua Zhou - in Advances in Neural Information Processing Systems (NIPS'10), 2010
"... Most active learning approaches select either informative or representative unla-beled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usu-ally ad hoc in finding unlabeled instances that are bot ..."
Abstract - Cited by 34 (4 self) - Add to MetaCart
Most active learning approaches select either informative or representative unlabeled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usually ad hoc in finding unlabeled instances that are both informative and representative. We address this challenge by a principled approach, termed QUIRE, based on the min-max view of active learning. The proposed approach provides a systematic way for measuring and combining the informativeness and representativeness of an instance. Extensive experimental results show that the proposed QUIRE approach outperforms several state-of-the-art active learning approaches.

Citation Context

...dentifying queries that are both informative and representative, which is verified by our empirical study. The second contribution of this work is to extend the QUIRE approach to multi-label learning [53], a setting that is much less studied in active learning. Unlike single-label learning where one instance is assumed to be associated with only one label, in multi-label learning, instances can be assi...

Active Query Driven by Uncertainty and Diversity for Incremental Multi-Label Learning

by Sheng-jun Huang, Zhi-hua Zhou
"... Abstract—In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the la-beling cost by actively querying the labels of the most valuable data, becomes particularly important for multi- ..."
Abstract - Cited by 3 (3 self) - Add to MetaCart
In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A strong multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of a queried label, and an effective classification model, based on whose prediction the criterion can be accurately computed. In this paper, we first introduce an effective multi-label classification model by combining label ranking with threshold learning, which is incrementally trained to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and actively query the instance-label pairs which can improve the classification model most. Experimental results demonstrate the superiority of the proposed approach to state-of-the-art methods. Keywords: active learning; multi-label learning; uncertainty; diversity.

Citation Context

...object can have multiple labels simultaneously. Multi-label learning is a framework dealing with such objects [32]. To label the multi-label examples, it should be decided for each of the multiple labels whether it is a proper one for an instance. Obviously, the labeling cost is even higher than that of single-label learning,...

Multi-label Classification via Feature-aware Implicit Label Space Encoding

by Guiguang Ding, Mingqing Hu, Jianmin Wang
"... To tackle a multi-label classification problem with many classes, recently label space dimen-sion reduction (LSDR) is proposed. It encodes the original label space to a low-dimensional la-tent space and uses a decoding process for recov-ery. In this paper, we propose a novel method termed FaIE to pe ..."
Abstract - Cited by 3 (0 self) - Add to MetaCart
To tackle a multi-label classification problem with many classes, recently label space dimension reduction (LSDR) is proposed. It encodes the original label space to a low-dimensional latent space and uses a decoding process for recovery. In this paper, we propose a novel method termed FaIE to perform LSDR via Feature-aware Implicit label space Encoding. Unlike most previous work, the proposed FaIE makes no assumptions about the encoding process and directly learns a code matrix, i.e. the encoding result of some implicit encoding function, and a linear decoding matrix. To learn both matrices, FaIE jointly maximizes the recoverability of the original label space from the latent space, and the predictability of the latent space from the feature space, thus making itself feature-aware. FaIE can also be specified to learn an explicit encoding function, and extended with kernel tricks to handle non-linear correlations between the feature space and the latent space. Extensive experiments conducted on benchmark datasets well demonstrate its effectiveness.
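
To make the LSDR setting concrete, here is a minimal Python sketch of the generic encode/decode pipeline that FaIE belongs to. It is not FaIE itself: it uses a plain SVD of the label matrix for encoding and ridge regression for the feature-to-latent mapping (closer in spirit to simpler LSDR schemes), and the function names and the ridge penalty are illustrative assumptions.

    import numpy as np

    def lsdr_fit(X, Y, k, lam=1.0):
        # Generic label space dimension reduction sketch (not FaIE's objective):
        # encode the binary label matrix Y into a k-dimensional latent space via SVD,
        # then learn a ridge regressor from the features X to the latent codes.
        Y_mean = Y.mean(axis=0)
        U, S, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)
        V = Vt[:k].T                      # (n_labels, k) linear decoding matrix
        Z = (Y - Y_mean) @ V              # (n_samples, k) latent codes
        W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Z)
        return W, V, Y_mean

    def lsdr_predict(X, W, V, Y_mean, threshold=0.5):
        # Map features to the latent space, decode back to label scores, then threshold.
        scores = X @ W @ V.T + Y_mean
        return (scores >= threshold).astype(int)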

Protein Function Prediction Using Dependence Maximization

by Guoxian Yu, Carlotta Domeniconi, Huzefa Rangwala, Guoji Zhang
"... Abstract. Protein function prediction is one of the fundamental tasks in the post genomic era. The vast amount of available proteomic data makes it possible to computationally annotate proteins. Most computa-tional approaches predict protein functions by using the labeled proteins and assuming that ..."
Abstract - Cited by 3 (1 self) - Add to MetaCart
Protein function prediction is one of the fundamental tasks in the post-genomic era. The vast amount of available proteomic data makes it possible to computationally annotate proteins. Most computational approaches predict protein functions by using the labeled proteins and assuming that the annotation of labeled proteins is complete, without any missing functions. However, partially annotated proteins are common in real-world scenarios; that is, a protein may have some confirmed functions, and whether it has other functions is unknown. In this paper, we make use of partially annotated proteomic data, and propose an approach called Protein Function Prediction using Dependency Maximization (ProDM). ProDM works by leveraging the correlation between different function labels, the 'guilt by association' rule between proteins, and maximizes the dependency between function labels and feature expression of proteins. ProDM can replenish the missing functions of partially annotated proteins (a seldom studied problem), and can predict functions for completely unlabeled proteins using partially annotated ones. An empirical study on publicly available protein-protein interaction (PPI) networks shows that, when the number of missing functions is large, ProDM performs significantly better than other related methods with respect to various evaluation criteria.

Citation Context

...ion can be viewed as a multi-label learning problem and evaluated using multi-label learning metrics [10,22]. Various evaluation metrics have been developed for evaluating multi-label learning methods [23]. Here we use five metrics: MicroF1, MacroF1, HammingLoss, RankingLoss and adapted AUC [4]. These metrics were also used to evaluate WELL [19], MLR-GL [4], and ProWL [22]. In addition, we design RAccu...
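
Since several of these metrics recur throughout this listing, a small numpy sketch of three of them may help. The definitions follow the usual conventions (in this ranking-loss variant, ties between relevant and irrelevant label scores are counted as errors), and the function names are my own.

    import numpy as np

    def hamming_loss(Y_true, Y_pred):
        # Fraction of instance-label pairs that are misclassified.
        return float(np.mean(Y_true != Y_pred))

    def micro_f1(Y_true, Y_pred):
        # F1 with true/false positives pooled over all labels and instances.
        tp = np.sum((Y_true == 1) & (Y_pred == 1))
        fp = np.sum((Y_true == 0) & (Y_pred == 1))
        fn = np.sum((Y_true == 1) & (Y_pred == 0))
        return 2.0 * tp / (2 * tp + fp + fn) if tp else 0.0

    def ranking_loss(Y_true, scores):
        # Average fraction of (relevant, irrelevant) label pairs ordered wrongly.
        per_instance = []
        for y, s in zip(Y_true, scores):
            rel, irr = s[y == 1], s[y == 0]
            if len(rel) == 0 or len(irr) == 0:
                continue  # undefined when all or none of the labels are relevant
            per_instance.append(np.mean(rel[:, None] <= irr[None, :]))
        return float(np.mean(per_instance))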

Learning from Label and Feature Heterogeneity

by Pei Yang, Jingrui He, Hongxia Yang, Haoda Fu, 2014
"... Abstract—Multiple types of heterogeneity, such as label het-erogeneity and feature heterogeneity, often co-exist in many real-world data mining applications, such as news article catego-rization, gene functionality prediction. To effectively leverage such heterogeneity, in this paper, we propose a n ..."
Abstract - Cited by 2 (2 self) - Add to MetaCart
Multiple types of heterogeneity, such as label heterogeneity and feature heterogeneity, often co-exist in many real-world data mining applications, such as news article categorization and gene functionality prediction. To effectively leverage such heterogeneity, in this paper, we propose a novel graph-based framework for Learning with both Label and Feature heterogeneities, namely L2F. It models the label correlation by requiring that any two label-specific classifiers behave similarly on the same views if the associated labels are similar, and imposes the view consistency by requiring that view-based classifiers generate similar predictions on the same examples. To solve the resulting optimization problem, we propose an iterative algorithm, which is guaranteed to converge to the global optimum. Furthermore, we analyze its generalization performance based on Rademacher complexity, which sheds light on the benefits of jointly modeling the label and feature heterogeneity. Experimental results on various data sets show the effectiveness of the proposed approach. Keywords: multi-label learning; multi-view learning; heterogeneity; Rademacher complexity.

Citation Context

...umber of labels per instance. Accordingly, label density normalizes label cardinality by the number of labels. Label diversity is the number of distinct label combinations observed in the dataset [22]. The first dataset is the Medical dataset [11]. The Computational Medical Center organized the Medical NLP Challenge with a rich set of medical text corpora. This dataset is actually a collection of pat...
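
The three dataset statistics mentioned in this excerpt are easy to compute directly from a binary label matrix; a small sketch under the definitions given above (the function name is illustrative):

    import numpy as np

    def label_statistics(Y):
        # Y is a binary label matrix of shape (n_instances, n_labels).
        cardinality = float(Y.sum(axis=1).mean())    # average number of labels per instance
        density = cardinality / Y.shape[1]           # cardinality normalized by the number of labels
        diversity = len({tuple(row) for row in Y})   # number of distinct label combinations
        return cardinality, density, diversity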

Towards Class-Imbalance Aware Multi-Label Learning

by Min-ling Zhang, Yu-kun Li, Xu-ying Liu
"... In multi-label learning, each object is represented by a single instance while associated with a set of class labels. Due to the huge (exponential) num-ber of possible label sets for prediction, existing ap-proaches mainly focus on how to exploit label cor-relations to facilitate the learning proces ..."
Abstract - Cited by 1 (1 self) - Add to MetaCart
In multi-label learning, each object is represented by a single instance while associated with a set of class labels. Due to the huge (exponential) number of possible label sets for prediction, existing approaches mainly focus on how to exploit label correlations to facilitate the learning process. Nevertheless, an intrinsic characteristic of learning from multi-label data, i.e. the widely-existing class-imbalance among labels, has not been well investigated. Generally, the number of positive training instances w.r.t. each class label is far less than its negative counterparts, which may lead to performance degradation for most multi-label learning techniques. In this paper, a new multi-label learning approach named Cross-Coupling Aggregation (COCOA) is proposed, which aims at leveraging the exploitation of label correlations as well as the exploration of class-imbalance. Briefly, to induce the predictive model on each class label, one binary-class imbalance learner corresponding to the current label and several multi-class imbalance learners coupling with other labels are aggregated for prediction. Extensive experiments clearly validate the effectiveness of the proposed approach, especially in terms of imbalance-specific evaluation metrics such as F-measure and area under the ROC curve.

Model Multiple Heterogeneity via Hierarchical Multi-Latent Space Learning

by Pei Yang, Jingrui He
"... In many real world applications such as satellite image anal-ysis, gene function prediction, and insider threat detection, the data collected from heterogeneous sources often exhib-it multiple types of heterogeneity, such as task heterogene-ity, view heterogeneity, and label heterogeneity. To addres ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
In many real-world applications such as satellite image analysis, gene function prediction, and insider threat detection, the data collected from heterogeneous sources often exhibit multiple types of heterogeneity, such as task heterogeneity, view heterogeneity, and label heterogeneity. To address this problem, we propose a Hierarchical Multi-Latent Space (HiMLS) learning approach to jointly model the triple types of heterogeneity. The basic idea is to learn a hierarchical multi-latent space by which we can simultaneously leverage the task relatedness, view consistency and the label correlations to improve the learning performance. We first propose a multi-latent space framework to model the complex heterogeneity, which is used as a building block to stack up a multi-layer structure so as to learn the hierarchical multi-latent space. In such a way, we can gradually learn the more abstract concepts in the higher level. Then, a deep learning algorithm is proposed to solve the optimization problem. The experimental results on various data sets show the effectiveness of the proposed approach.

Solving the Partial Label Learning Problem: An Instance-based Approach

by Min-ling Zhang, Fei Yu
"... In partial label learning, each training example is associated with a set of candidate labels, among which only one is valid. An intuitive strategy to learn from partial label examples is to treat all can-didate labels equally and make prediction by av-eraging their modeling outputs. Nonetheless, th ..."
Abstract - Cited by 1 (1 self) - Add to MetaCart
In partial label learning, each training example is associated with a set of candidate labels, among which only one is valid. An intuitive strategy to learn from partial label examples is to treat all candidate labels equally and make prediction by averaging their modeling outputs. Nonetheless, this strategy may suffer from the problem that the modeling output from the valid label is overwhelmed by those from the false positive labels. In this paper, an instance-based approach named IPAL is proposed by directly disambiguating the candidate label set. Briefly, IPAL tries to identify the valid label of each partial label example via an iterative label propagation procedure, and then classifies the unseen instance based on minimum error reconstruction from its nearest neighbors. Extensive experiments show that IPAL compares favorably against the existing instance-based as well as other state-of-the-art partial label learning approaches.

Reliable Multi-Label Learning via Conformal Predictor and Random Forest for Syndrome Differentiation of Chronic Fatigue in Traditional Chinese Medicine

by Huazhen Wang, Xin Liu, Bing Lv, Fan Yang, Yanzhu Hong
"... Objective: Chronic Fatigue (CF) still remains unclear about its etiology, pathophysiology, nomenclature and diagnostic criteria in the medical community. Traditional Chinese medicine (TCM) adopts a unique diagnostic method, namely ‘bian zheng lun zhi ’ or syndrome differentiation, to diagnose the CF ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
Objective: Chronic Fatigue (CF) still remains unclear about its etiology, pathophysiology, nomenclature and diagnostic criteria in the medical community. Traditional Chinese medicine (TCM) adopts a unique diagnostic method, namely 'bian zheng lun zhi' or syndrome differentiation, to diagnose the CF with a set of syndrome factors, which can be regarded as the Multi-Label Learning (MLL) problem in the machine learning literature. To obtain an effective and reliable diagnostic tool, we use Conformal Predictor (CP), Random Forest (RF) and Problem Transformation method (PT) for the syndrome differentiation of CF. Methods and Materials: In this work, using the PT method, CP-RF is extended to handle the MLL problem. CP-RF applies RF to measure the confidence level (p-value) of each label being the true label, and then selects multiple labels whose p-values are larger than the pre-defined significance level as the region prediction. In this paper, we compare the proposed CP-RF with typical CP-NBC (Naïve Bayes Classifier), CP-KNN (K-Nearest Neighbors) and ML-KNN on the CF dataset, which consists of 736 cases. Specifically, 95 symptoms are used to identify CF, and four syndrome factors are employed in the syndrome differentiation, including 'spleen deficiency', 'heart deficiency', 'liver stagnation' and 'qi deficiency'. Results: CP-RF demonstrates an outstanding performance beyond CP-NBC, CP-KNN and ML-KNN under the general metrics of subset accuracy, hamming loss, one-error, coverage, ranking loss and average precision. Furthermore, the ...
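
The abstract does not spell out CP-RF beyond this summary, but the basic recipe it describes (problem transformation plus an inductive conformal predictor per label, with RF probabilities as the nonconformity source) can be sketched as follows. This is an assumption-laden illustration, not the authors' code: the forest settings, the nonconformity score (1 minus the predicted class probability), and the unsmoothed p-value are all simplifications.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def conformal_multilabel(X_train, Y_train, X_cal, Y_cal, X_test, epsilon=0.05):
        # Binary relevance + inductive conformal prediction, one forest per label.
        # A label enters the region prediction when the p-value of the hypothesis
        # "this label is relevant" exceeds the significance level epsilon.
        n_labels = Y_train.shape[1]
        region = np.zeros((X_test.shape[0], n_labels), dtype=int)
        for j in range(n_labels):
            rf = RandomForestClassifier(n_estimators=100, random_state=0)
            rf.fit(X_train, Y_train[:, j])
            classes = list(rf.classes_)
            if 0 not in classes or 1 not in classes:
                continue  # degenerate label in the training split; skipped in this sketch
            # Nonconformity of calibration examples w.r.t. their true class.
            cal_prob = rf.predict_proba(X_cal)
            cal_cols = np.array([classes.index(c) for c in Y_cal[:, j]])
            cal_alpha = 1.0 - cal_prob[np.arange(len(X_cal)), cal_cols]
            # Nonconformity of test examples under the hypothesis "label j is relevant".
            test_alpha = 1.0 - rf.predict_proba(X_test)[:, classes.index(1)]
            # Unsmoothed p-value: fraction of calibration scores at least as nonconforming.
            p = (cal_alpha[None, :] >= test_alpha[:, None]).mean(axis=1)
            region[:, j] = (p > epsilon).astype(int)
        return region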

Citation Context

... the multi-label examples would be transformed before modeling or not, the MLL algorithms can be divided into two categories: Problem Transformation methods (PT) and Algorithm Adaptation methods (AA) [17,19]. A PT method splits the multi-label examples straightforwardly into single-label examples and then applies single-label machine learning algorithms to tackle the multi-pattern recognition problem. Genera...
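
As a concrete example of the PT idea in this excerpt, binary relevance is the simplest transformation: decompose the label matrix into one binary problem per label. The sketch below assumes every label column contains both classes and uses logistic regression purely as a placeholder base learner.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def binary_relevance_fit(X, Y):
        # One independent binary classifier per label column of Y.
        models = []
        for j in range(Y.shape[1]):
            clf = LogisticRegression(max_iter=1000)
            clf.fit(X, Y[:, j])   # assumes column j has both positive and negative examples
            models.append(clf)
        return models

    def binary_relevance_predict(models, X):
        # Stack the per-label predictions back into a binary label matrix.
        return np.column_stack([clf.predict(X) for clf in models])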

Multi-Target Regression via Random Linear Target Combinations

by Grigorios Tsoumakas, Eleftherios Spyromitros-xioufis, Aikaterini Vrekou, Ioannis Vlahavas
"... Abstract. Multi-target regression is concerned with the simultaneous prediction of multiple continuous target variables based on the same set of input variables. It arises in several interesting industrial and envi-ronmental application domains, such as ecological modelling and energy forecasting. T ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
Multi-target regression is concerned with the simultaneous prediction of multiple continuous target variables based on the same set of input variables. It arises in several interesting industrial and environmental application domains, such as ecological modelling and energy forecasting. This paper presents an ensemble method for multi-target regression that constructs new target variables via random linear combinations of existing targets. We discuss the connection of our approach with multi-label classification algorithms, in particular RAkEL, which originally inspired this work, and a family of recent multi-label classification algorithms that involve output coding. Experimental results on 12 multi-target datasets show that it performs significantly better than a strong baseline that learns a single model for each target using gradient boosting, and compares favourably to the multi-objective random forest approach, a state-of-the-art method. The experiments further show that our approach improves more when stronger unconditional dependencies exist among the targets.
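
The fit/decode pipeline the abstract describes can be sketched roughly as follows; this is not the paper's exact construction (in particular, the way each random combination is drawn and the gradient-boosting base learner with default settings are illustrative assumptions).

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    def rltc_fit(X, Y, n_combinations=100, k=2, seed=None):
        # Build derived targets as random positive combinations of k original targets,
        # then fit one single-target regressor per derived target.
        rng = np.random.default_rng(seed)
        q = Y.shape[1]
        A = np.zeros((q, n_combinations))          # column i defines derived target Y @ A[:, i]
        for i in range(n_combinations):
            idx = rng.choice(q, size=k, replace=False)
            A[idx, i] = rng.random(k)
        models = [GradientBoostingRegressor().fit(X, Y @ A[:, i])
                  for i in range(n_combinations)]
        return models, A

    def rltc_predict(models, A, X):
        # Predict every derived target, then recover the original targets by solving
        # the overdetermined system A^T y = z for each sample in a least-squares sense.
        Z = np.column_stack([m.predict(X) for m in models])    # (n_samples, n_combinations)
        Y_hat, *_ = np.linalg.lstsq(A.T, Z.T, rcond=None)      # (n_targets, n_samples)
        return Y_hat.T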

Citation Context

...ntly energy-related forecasting (http://www.gefcom.org), such as wind and solar energy production forecasting and load/price forecasting. Multi-target regression can be considered as a sibling of multi-label classification [5,6], the latter dealing with multiple binary target variables instead of continuous ones. Recent work [7] stressed the close connection among these ...
