Results 1–10 of 34
A Variance Minimization Criterion to Active Learning on Graphs
Cited by 17 (3 self)
We consider the problem of active learning over the vertices in a graph, without feature representation. Our study is based on the common graph smoothness assumption, which is formulated in a Gaussian random field model. We analyze the probability distribution over the unlabeled vertices conditioned on the label information, which is a multivariate normal with its mean given by the harmonic solution over the field. We then select the nodes to label such that the total variance of the distribution on the unlabeled data, and with it the expected prediction error, is minimized. In this way, the classifier we obtain is theoretically more robust. Compared with existing methods, our algorithm has the advantage of selecting data in a batch offline mode with solid theoretical support. We show improved performance over existing label selection criteria on several real-world data sets.
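The criterion this abstract describes can be sketched concretely: under a Gaussian random field with graph Laplacian L, the conditional covariance over the unlabeled set U is the inverse of the principal submatrix L_UU, so a greedy selector picks the node whose labeling leaves the smallest total variance (trace of the remaining covariance). This is an illustrative sketch under that model, not the paper's implementation; the function name and the small regularizer `delta` are my own additions.

```python
import numpy as np

def greedy_variance_minimization(W, budget, delta=1e-6):
    """Greedily pick nodes to label on a Gaussian random field.

    W: symmetric adjacency matrix (n x n).
    After labeling a set S, the conditional distribution over the
    unlabeled nodes U has covariance (L_UU + delta*I)^{-1}; we pick
    the node whose labeling minimizes the total remaining variance.
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W          # combinatorial graph Laplacian
    selected = []
    unlabeled = list(range(n))
    for _ in range(budget):
        best_v, best_var = None, np.inf
        for v in unlabeled:
            U = [u for u in unlabeled if u != v]
            # covariance over the remaining unlabeled nodes if v is labeled
            cov = np.linalg.inv(L[np.ix_(U, U)] + delta * np.eye(len(U)))
            total_var = np.trace(cov)
            if total_var < best_var:
                best_var, best_v = total_var, v
        selected.append(best_v)
        unlabeled.remove(best_v)
    return selected
```

On a 5-node path graph, for example, the first query lands on the center vertex, which cuts the residual variance most; this matches the intuition that labeling well-connected nodes constrains the harmonic solution best.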
Bayesian active learning for classification and preference learning
- arXiv preprint arXiv:1112.5745
, 2011
Cited by 11 (1 self)
Information-theoretic active learning has been widely studied for probabilistic models. For simple regression an optimal myopic policy is easily tractable. However, for other tasks and with more complex models, such as classification with nonparametric models, the optimal solution is harder to compute, and current approaches make approximations to achieve tractability. We propose an approach that expresses information gain in terms of predictive entropies, and apply this method to the Gaussian process classifier (GPC). Our approach makes minimal approximations to the full information-theoretic objective. Its empirical performance compares favourably to many popular active learning algorithms at equal or lower computational complexity, and it also compares well to decision-theoretic approaches, which are privy to more information and require much more computation time. Finally, by further developing a reformulation of binary preference learning as a classification problem, we extend our algorithm to Gaussian process preference learning.
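The "information gain in terms of predictive entropies" idea above (often called BALD) reduces to a short computation once predictive probabilities are available from posterior samples: score each candidate by the entropy of the mean prediction minus the mean entropy of the individual predictions. The sketch below assumes Monte Carlo samples from some posterior; the function name is my own, and a real GPC would obtain these probabilities from its approximate posterior rather than an explicit sample array.

```python
import numpy as np

def bald_scores(probs):
    """BALD acquisition from Monte Carlo predictive probabilities.

    probs: array of shape (S, N, C) -- S posterior samples, N candidate
    points, C classes.  Returns, per point, the information gain
        I[y; theta | x, D] = H[E_s p_s(y|x)] - E_s H[p_s(y|x)],
    i.e. entropy of the mean prediction minus mean entropy: high when
    the samples individually are confident but mutually disagree.
    """
    eps = 1e-12  # guard against log(0)
    mean_p = probs.mean(axis=0)                        # (N, C)
    H_mean = -(mean_p * np.log(mean_p + eps)).sum(-1)  # entropy of mean
    H_each = -(probs * np.log(probs + eps)).sum(-1)    # (S, N)
    return H_mean - H_each.mean(axis=0)
```

A point where all posterior samples agree scores near zero even if the prediction is uncertain, while a point where confident samples disagree scores near log C; this is exactly the distinction from plain entropy-based uncertainty sampling.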
Dual active feature and sample selection for graph classification
- in KDD
, 2011
Cited by 6 (4 self)
Graph classification has become an important and active research topic in the last decade. Current research on graph classification focuses on mining discriminative subgraph features under supervised settings, with the basic assumption that a large number of labeled graphs are available. However, labeling graph data is quite expensive and time consuming for many real-world applications. In order to reduce the labeling cost for graph data, we address the problem of how to select the most important graph to query for a label. This problem is challenging and different from conventional active learning problems because there is no predefined feature vector; moreover, the subgraph enumeration problem is NP-hard. The active sample selection problem and the feature selection problem are correlated for graph data: before we can solve the active sample selection problem, we need to find a set of optimal subgraph features. To address this challenge, we demonstrate how one can simultaneously estimate the usefulness of a query graph and of a set of subgraph features. The idea is to maximize the dependency between subgraph features and graph labels within an active learning framework. We propose a branch-and-bound algorithm to search for the optimal query graph and optimal features simultaneously. Empirical studies on nine real-world tasks demonstrate that the proposed method obtains better accuracy on graph data than alternative approaches.
Active Learning with Hinted Support Vector Machine
Cited by 4 (2 self)
The abundance of real-world data and limited labeling budgets call for active learning, an important learning paradigm for reducing human labeling effort. Many recently developed active learning algorithms consider both uncertainty and representativeness when making querying decisions. However, exploiting representativeness and uncertainty concurrently usually requires tackling sophisticated and challenging learning tasks, such as clustering. In this paper, we propose a new active learning framework, called hinted sampling, which takes both uncertainty and representativeness into account in a simpler way. We design a novel active learning algorithm within the hinted sampling framework with an extended support vector machine. Experimental results validate that the novel active learning algorithm achieves better and more stable performance than state-of-the-art algorithms.
Active Query Driven by Uncertainty and Diversity for Incremental Multi-Label Learning
Cited by 3 (3 self)
In multi-label learning, it is rather expensive to label instances since each is simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A strong multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of a queried label, and an effective classification model on whose predictions the criterion can be accurately computed. In this paper, we first introduce an effective multi-label classification model that combines label ranking with threshold learning and is trained incrementally to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and to actively query the instance-label pairs that can improve the classification model most. Experimental results demonstrate the superiority of the proposed approach over state-of-the-art methods.
Keywords: active learning; multi-label learning; uncertainty; diversity
Batch-Mode Active Learning via Error Bound Minimization
Cited by 3 (0 self)
Active learning has proven quite effective at reducing human labeling effort by actively selecting the most informative examples to label. In this paper, we present a batch-mode active learning method based on logistic regression. Our key motivation is an out-of-sample bound on the estimation error of the class distribution in logistic regression, conditioned on any fixed training sample. Unlike a typical PAC-style passive learning error bound, it does not rely on the i.i.d. assumption on example-label pairs; in addition, it does not contain the class labels of the training sample. It can therefore be used immediately to design an active learning algorithm that minimizes this bound iteratively. We also discuss the connections between the proposed method and some existing active learning approaches. Experiments on benchmark UCI datasets and text datasets demonstrate that the proposed method significantly outperforms state-of-the-art active learning methods.
Large-Scale Machine Learning for Classification and Search
, 2012
Cited by 2 (1 self)
With the rapid development of the Internet, tremendous amounts of data, including millions or billions of images and videos, can now be collected for training machine learning models. Inspired by this trend, this thesis is dedicated to developing large-scale machine learning techniques that make classification and nearest neighbor search practical on gigantic databases. Our first approach explores data graphs to aid classification and nearest neighbor search. A graph offers an attractive way of representing data and discovering essential information such as the neighborhood structure. However, both the graph construction process and graph-based learning techniques become computationally prohibitive at large scale. To this end, we present an efficient large graph construction approach and subsequently apply it to develop scalable semi-supervised learning and unsupervised hashing algorithms. Our unique contributions on the graph-related topics include: 1. Large Graph Construction: conventional neighborhood graphs such as kNN graphs require quadratic construction time, which is inadequate for the large-scale applications mentioned above. To overcome this bottleneck, we present a novel graph construction approach,
Online active constraint selection for semi-supervised clustering
- in European Conference on Artificial Intelligence, Active and Incremental Learning Workshop
, 2012
Cited by 2 (2 self)
Due to strong demand for the ability to enforce top-down structure on clustering results, semi-supervised clustering methods that use pairwise constraints as side information have received increasing attention in recent years. However, most current methods are passive in the sense that the side information is provided beforehand and selected randomly. This may lead to the use of constraints that are redundant, unnecessary, or even harmful to the clustering results. To overcome this, we present an active clustering framework which selects pairwise constraints online as clustering proceeds, and propose an online constraint selection method that actively selects pairwise constraints by identifying uncertain nodes in the data. We also propose two novel methods for computing node uncertainty: one global and parametric, the other local and nonparametric. We evaluate our active constraint selection method with two different semi-supervised clustering algorithms on UCI, digits, gene, and image datasets, and achieve results superior to current state-of-the-art active techniques.
ActNeT: Active Learning for Networked Texts in Microblogging
Cited by 1 (1 self)
Supervised learning, e.g., classification, plays an important role in processing and organizing microblogging data. In microblogging, it is easy to amass vast quantities of unlabeled data, but costly to obtain the labels that supervised learning algorithms require. To reduce the labeling cost, active learning is an effective way to select representative and informative instances to query for labels, improving the learned model. Unlike traditional data, in which instances are assumed to be independent and identically distributed (i.i.d.), instances in microblogging are networked with each other. This presents both opportunities and challenges for applying active learning to microblogging data. Inspired by social correlation theories, we investigate whether social relations can help perform effective active learning on networked data. In this paper, we propose a novel Active learning framework for the classification of Networked Texts in microblogging (ActNeT). In particular, we study how to incorporate network information into text content modeling, and design strategies that select the most representative and informative instances from microblogging for labeling by taking advantage of the social network structure. Experimental results on Twitter datasets show the benefit of incorporating network information in active learning and that the proposed framework outperforms existing state-of-the-art methods.
Active Learning with Support Vector Machines
Cited by 1 (0 self)
In machine learning, active learning refers to algorithms that autonomously select the data points from which they will learn. There are many data mining applications in which large amounts of unlabeled data are readily available, but labels (e.g., human annotations or results from complex experiments) are costly to obtain. In such scenarios, an active learning algorithm aims to identify data points that, if labeled and used for training, would most improve the learned model. Labels are then obtained only for the most promising data points. This speeds up learning and reduces labeling costs. Support vector machine (SVM) classifiers are particularly well suited to active learning due to their convenient mathematical properties. They perform linear classification, typically in a kernel-induced feature space, which makes measuring the distance of a data point from the decision boundary straightforward. Furthermore, heuristics can efficiently estimate how strongly learning from a data point influences the current model. This information can be used to actively select training samples. After a brief introduction to the active learning problem, we discuss different query strategies for selecting informative data points and review how these strategies give rise to different variants of active learning with SVMs.
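The "distance from the decision boundary" query strategy this abstract mentions is simple enough to state in a few lines: given the current linear model, score each pooled point by its geometric margin and query the closest ones. A minimal sketch, assuming a plain linear classifier in feature space; `margin_query` is a hypothetical helper name, and a real SVM active learner would refit the model after each query and typically work in a kernel-induced space.

```python
import numpy as np

def margin_query(w, b, X_pool, k=1):
    """Uncertainty sampling for a linear (SVM-style) classifier.

    Scores each pooled point by its distance |w.x + b| / ||w|| to the
    current decision boundary and returns the indices of the k closest
    points -- the ones the current model is least certain about.
    """
    dist = np.abs(X_pool @ w + b) / np.linalg.norm(w)
    return np.argsort(dist)[:k]
```

In an active learning loop this would alternate with retraining: query the k nearest-to-boundary points, obtain their labels, refit the SVM, and repeat until the labeling budget is exhausted.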