Results

**1 - 2** of **2**

### An Aggressive Graph-based Selective Sampling Algorithm for Classification

Abstract

Traditional online learning algorithms are designed for vector data only and assume that the labels of all training examples are provided. In this paper, we study graph classification where only a limited number of nodes are chosen for labelling by selective sampling. In particular, we first adapt a spectral-based graph regularization technique to derive a novel online linear learning algorithm that can handle graph data. This algorithm still queries the labels of all nodes, which is undesirable because labelling is typically time-consuming. To address this issue, we then propose a new confidence-based query method for selective sampling. The theoretical result shows that our online learning algorithm, with only a fraction of the labels queried, can achieve a mistake bound comparable to that of the algorithm learning on the labels of all nodes. In addition, the algorithm based on our proposed query strategy can achieve a better mistake bound than those based on other query methods. However, our algorithm is conservative: it updates the model only when an error occurs, which wastes training labels that are valuable for the model. To take advantage of these labels, we further propose a novel aggressive algorithm, which can update the model even if no error occurs. The theoretical analysis shows that our aggressive approach can achieve a better mistake bound than its conservative and fully-supervised counterparts, with substantially fewer label queries. We empirically evaluate our algorithm on several real-world graph datasets, and the experimental results demonstrate that our method is highly effective.
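
The confidence-based querying and the conservative versus aggressive updates described in this abstract can be illustrated with a generic margin-based selective sampler. This is a minimal sketch, not the paper's algorithm: it assumes nodes have already been mapped to feature vectors (the spectral graph regularization step is omitted), and the query rule `b / (b + |margin|)`, the passive-aggressive step size, and all names (`selective_sampling`, `b`, `aggressive`) are illustrative assumptions.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists of floats."""
    return sum(wi * xi for wi, xi in zip(w, x))


def selective_sampling(stream, dim, b=1.0, aggressive=True, seed=0):
    """Online linear learner that queries a label with probability b / (b + |margin|).

    stream: iterable of (x, y), x a list of floats, y in {-1, +1}.
    Returns (weights, number_of_queries, number_of_mistakes).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    queries = 0
    mistakes = 0
    for x, y in stream:
        margin = dot(w, x)
        pred = 1 if margin >= 0 else -1
        if pred != y:
            mistakes += 1  # bookkeeping only; y drives learning only if queried
        # A small |margin| means low confidence, so a query is more likely.
        if rng.random() < b / (b + abs(margin)):
            queries += 1
            loss = max(0.0, 1.0 - y * margin)  # hinge loss
            # Conservative: update only on a prediction mistake.
            # Aggressive: update whenever the hinge loss is positive.
            should_update = (loss > 0) if aggressive else (pred != y)
            sq_norm = dot(x, x)
            if should_update and sq_norm > 0:
                tau = loss / sq_norm  # passive-aggressive step size (assumed)
                w = [wi + tau * y * xi for wi, xi in zip(w, x)]
    return w, queries, mistakes
```

With `aggressive=False`, a queried label that was predicted correctly leaves the model unchanged, which is the wasted-label situation the abstract points out. With `aggressive=True`, the same label still moves the model whenever the margin is below 1.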

### Learning Relative Similarity from Data Streams: Active Online Learning Approaches

Abstract

Relative similarity learning, an important learning scheme for information retrieval, aims to learn a bi-linear similarity function from a collection of labeled instance-pairs; the learned function assigns a high similarity value to a similar instance-pair and a low value to a dissimilar pair. Existing algorithms usually assume that the labels of all the pairs in data streams are always available for learning. However, this is not always realistic in practice, since the number of possible pairs is quadratic in the number of instances in the database, and manually labeling the pairs can be very costly and time-consuming. To overcome this limitation, we propose a novel framework of active online similarity learning. Specifically, we propose two new algorithms: (i) PAAS: Passive-Aggressive Active Similarity learning; (ii) CWAS: Confidence-Weighted Active Similarity learning, and we prove their mistake bounds in theory. We have conducted extensive experiments on a variety of real-world data sets and find encouraging results that validate the empirical effectiveness of the proposed algorithms.
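
To make the idea of actively learning a bi-linear similarity function concrete, the sketch below combines a bilinear score `x^T M z` with a passive-aggressive update and a margin-based query rule. It is a hedged illustration rather than the PAAS or CWAS algorithm from the paper: the triplet input format `(x, x1, x2, y)`, the query probability `delta / (delta + |margin|)`, the identity initialization, and the names `active_pa_similarity`, `delta`, and `C` are all assumptions made for this example.

```python
import random


def similarity(M, x, z):
    """Bilinear similarity S_M(x, z) = x^T M z."""
    return sum(x[i] * M[i][j] * z[j]
               for i in range(len(x)) for j in range(len(z)))


def active_pa_similarity(stream, dim, delta=1.0, C=1.0, seed=0):
    """Learn M online, querying labels only for low-confidence triplets.

    stream yields (x, x1, x2, y); y = +1 if x should be more similar
    to x1 than to x2, and y = -1 otherwise.
    Returns (M, number_of_queries).
    """
    rng = random.Random(seed)
    # Start from the identity matrix, i.e. the plain inner product (assumed).
    M = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    queries = 0
    for x, x1, x2, y in stream:
        diff = [a - b for a, b in zip(x1, x2)]
        margin = similarity(M, x, diff)  # equals S(x, x1) - S(x, x2)
        # Query the label with higher probability when |margin| is small.
        if rng.random() < delta / (delta + abs(margin)):
            queries += 1
            loss = max(0.0, 1.0 - y * margin)  # hinge loss on the triplet
            # Squared Frobenius norm of the rank-one matrix x diff^T.
            sq_norm = sum(a * a for a in x) * sum(d * d for d in diff)
            if loss > 0 and sq_norm > 0:
                tau = min(C, loss / sq_norm)  # step size capped at C
                for i in range(dim):
                    for j in range(dim):
                        M[i][j] += tau * y * x[i] * diff[j]
    return M, queries
```

Triplets whose margin is already large are rarely queried, so most of them cost no label, which addresses the labeling cost that motivates the active setting.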