Adaptation Regularization: A General Framework for Transfer Learning
"... Abstract—Domain transfer learning, which learns a target classifier using labeled data from a different distribution, has shown promising value in knowledge discovery yet still been a challenging problem. Most previous works designed adaptive classifiers by exploring two learning strategies independ ..."
Abstract
Abstract—Domain transfer learning, which learns a target classifier using labeled data drawn from a different distribution, has shown promising value in knowledge discovery yet remains a challenging problem. Most previous work designed adaptive classifiers by exploring two learning strategies independently: distribution adaptation and label propagation. In this paper, we propose a novel transfer learning framework, referred to as Adaptation Regularization based Transfer Learning (ARTL), to model them in a unified way based on the structural risk minimization principle and regularization theory. Specifically, ARTL learns the adaptive classifier by simultaneously optimizing the structural risk functional, the joint distribution matching between domains, and the manifold consistency underlying the marginal distribution. Based on this framework, we propose two novel methods using Regularized Least Squares (RLS) and Support Vector Machines (SVMs), respectively, and use the Representer theorem in reproducing kernel Hilbert space to derive the corresponding solutions. Comprehensive experiments verify that ARTL can significantly outperform state-of-the-art learning methods on several public text and image datasets.
Index Terms—Transfer learning, adaptation regularization, distribution adaptation, manifold regularization, generalization error
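To make the RLS instantiation concrete, here is a minimal sketch of the closed-form solution that the Representer theorem yields, restricted for simplicity to matching the marginal distributions with a single MMD term (the full framework also matches class-conditional distributions); the function name and parameter defaults are our own illustration, not the authors' code.

```python
import numpy as np

def artl_rls(K, y, n_s, lam=1.0, gamma=1.0, sigma=0.1, L=None):
    """Sketch of an ARTL-style RLS solver.

    K   : (n, n) kernel matrix over source + target points (source first)
    y   : (n,) labels; target entries are ignored via the mask E
    n_s : number of labeled source points
    L   : optional (n, n) graph Laplacian for the manifold term
    """
    n = K.shape[0]
    n_t = n - n_s

    # E restricts the squared loss to the labeled source examples.
    E = np.diag(np.r_[np.ones(n_s), np.zeros(n_t)])

    # MMD matrix: penalizes the distance between the source and target
    # marginal means in the RKHS (marginal adaptation only).
    e = np.r_[np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)]
    M = np.outer(e, e)

    if L is None:
        L = np.zeros((n, n))

    # Representer theorem: f(x) = sum_i alpha_i K(x, x_i), with alpha
    # solving ((E + lam*M + gamma*L) K + sigma*I) alpha = E y.
    A = (E + lam * M + gamma * L) @ K + sigma * np.eye(n)
    return np.linalg.solve(A, E @ y)
```

Predictions for any new point then follow from the kernel expansion, e.g. `np.sign(K_new @ alpha)` for binary labels.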
On Handling Negative Transfer and Imbalanced Distributions in Multiple Source Transfer Learning
"... Transfer learning has beneted many real-world applications where labeled data are abundant in source domains but scarce in the target domain. As there are usually multi-ple relevant domains where knowledge can be transferred, multiple source transfer learning (MSTL) has recently at-tracted much atte ..."
Abstract
Transfer learning has benefited many real-world applications where labeled data are abundant in source domains but scarce in the target domain. As there are usually multiple relevant domains from which knowledge can be transferred, multiple source transfer learning (MSTL) has recently attracted much attention. However, we face two major challenges when applying MSTL. First, without knowledge about the difference between the source and target domains, negative transfer occurs when knowledge is transferred from highly irrelevant sources. Second, imbalanced class distributions, where examples in one class dominate, can lead to improper judgment of a source domain's relevance to the target task. Since existing MSTL methods are usually designed to transfer from relevant sources with balanced distributions, they fail in applications where these two challenges persist. In this paper, we propose a novel two-phase framework to effectively transfer knowledge from multiple sources even when there are irrelevant sources and imbalanced class distributions. First, an effective Supervised Local Weight (SLW) scheme is proposed to assign a proper weight to each source domain's classifier based on its ability to predict accurately on each local region of the target domain. The second phase then learns a classifier for the target domain by solving an optimization problem that balances training error minimization against consistency with the weighted predictions gained from the source domains. A theoretical analysis shows that as the number of source domains increases, the probability that the proposed approach has an error greater than a bound becomes exponentially small. Extensive experiments on disease prediction, spam filtering and intrusion detection datasets demonstrate the significant improvement in classification performance gained by the proposed method over existing MSTL approaches.
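As a rough illustration of the first phase, the sketch below weights each source classifier by its accuracy on the labeled target examples nearest a query point, so that irrelevant sources receive weights near zero; the helper names, the k-nearest-neighbour notion of "local region", and the {-1, +1} label convention are our assumptions, not the paper's exact SLW formulation.

```python
import numpy as np

def local_weights(X_lab, y_lab, x, classifiers, k=5):
    """Weight each source classifier by its local accuracy around x.

    X_lab, y_lab : the few labeled target examples
    classifiers  : trained source-domain models exposing .predict
    """
    # k nearest labeled target neighbours of x (Euclidean distance).
    nn = np.argsort(np.linalg.norm(X_lab - x, axis=1))[:k]

    # Accuracy of each source classifier on that neighbourhood.
    acc = np.array([np.mean(c.predict(X_lab[nn]) == y_lab[nn])
                    for c in classifiers])

    # Normalize; a source that fails locally gets weight ~0,
    # which is what curbs negative transfer.
    total = acc.sum()
    return acc / total if total > 0 else np.full(len(classifiers),
                                                 1.0 / len(classifiers))

def weighted_prediction(x, classifiers, w):
    """Combine source predictions for x (labels in {-1, +1})."""
    votes = np.array([c.predict(x.reshape(1, -1))[0] for c in classifiers])
    return np.sign(w @ votes)
```

The second phase would then fit a target classifier whose objective trades off its own training error against agreement with these weighted predictions.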
Semi-Supervised Domain Adaptation with Good Similarity Functions
"... Abstract. In this paper, we address the problem of domain adaptation for binary classification. This problem arises when the distributions generating the source learning data and target test data are somewhat different. From a theoretical standpoint, a classifier has better generalization guarantees ..."
Abstract
Abstract. In this paper, we address the problem of domain adaptation for binary classification. This problem arises when the distributions generating the source learning data and the target test data are somewhat different. From a theoretical standpoint, a classifier has better generalization guarantees when the two domains' marginal distributions over the input space are close. Classical approaches mainly try to build new projection spaces or to reweight the source data with the objective of bringing the two distributions closer. We study an original direction based on a recent framework introduced by Balcan et al. that enables one to learn linear classifiers in an explicit projection space based on a similarity function, which need be neither symmetric nor positive semi-definite. We propose a well-founded general method for learning a low-error classifier on target data, made effective by an iterative procedure compatible with Balcan et al.'s framework. A reweighting scheme for the similarity function is then introduced in order to bring the distributions closer in a new projection space. The hyperparameters and the reweighting quality are controlled by a reverse validation procedure. Our approach is based on a linear programming formulation and shows good adaptation performance with very sparse models. We first consider the challenging unsupervised case where no target label is accessible, which is helpful when no manual annotation is possible. We also propose a generalisation to the semi-supervised case, allowing us to use a few target labels when available. Finally, we evaluate our method on a synthetic problem and on a real image annotation task.
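The linear programming formulation can be sketched as follows: examples are mapped into the explicit similarity space S[i, j] = K(x_i, x'_j) over d landmark points, and a sparse linear separator is found by minimizing hinge-style slacks under an L1 budget. The variable layout and the budget parameter B are our illustrative choices; the paper's exact program, iterative procedure, and reweighting scheme differ in detail.

```python
import numpy as np
from scipy.optimize import linprog

def similarity_lp(S, y, B=10.0):
    """Sparse linear classifier in an explicit similarity space.

    S : (n, d) similarities K(x_i, x'_j) to d landmarks; K need not
        be symmetric nor positive semi-definite.
    y : (n,) labels in {-1, +1}
    B : L1 budget on the weights (drives sparsity).

    Minimize sum(xi) subject to y_i * (S @ alpha)_i >= 1 - xi_i,
    xi >= 0, and ||alpha||_1 <= B, with alpha = a_plus - a_minus.
    """
    n, d = S.shape
    # Variable layout: [a_plus (d), a_minus (d), xi (n)], all >= 0.
    c = np.r_[np.zeros(2 * d), np.ones(n)]

    # Margin constraints, rewritten as A_ub @ v <= b_ub.
    Ys = y[:, None] * S
    A_margin = np.hstack([-Ys, Ys, -np.eye(n)])
    b_margin = -np.ones(n)

    # L1 budget: sum(a_plus) + sum(a_minus) <= B.
    A_l1 = np.r_[np.ones(2 * d), np.zeros(n)][None, :]

    res = linprog(c,
                  A_ub=np.vstack([A_margin, A_l1]),
                  b_ub=np.r_[b_margin, [B]],
                  bounds=(0, None), method="highs")
    alpha = res.x[:d] - res.x[d:2 * d]
    return alpha  # predict new points with sign(S_new @ alpha)
```

Because the objective only pays for slack while the L1 budget caps the total weight mass, optimal vertices tend to zero out most landmark weights, which is consistent with the very sparse models reported.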
Yahoo Labs
"... In this paper, we propose to study the problem of heterogeneous transfer ranking, a transfer learning problem with heterogeneous features in order to utilize the rich large-scale labeled data in popular languages to help the ranking task in less popular languages. We develop a large-margin algorithm ..."
Abstract
In this paper, we study the problem of heterogeneous transfer ranking, a transfer learning problem with heterogeneous features, in order to utilize the rich large-scale labeled data in popular languages to help the ranking task in less popular languages. We develop a large-margin algorithm, namely LM-HTR, to solve the problem by mapping the input features of both the source domain and the target domain into a shared latent space while simultaneously minimizing the feature reconstruction loss and the prediction loss. We analyze the theoretical bound on the prediction loss and develop fast algorithms via stochastic gradient descent so that our model scales to large applications. Experimental results on two application datasets demonstrate the advantages of our algorithm over other state-of-the-art methods.
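A minimal sketch of one such stochastic gradient step is given below, with squared losses standing in for the paper's large-margin ranking loss and a per-domain encoder/decoder pair (P, D) around a shared predictor w; all names and the simplified losses are our assumptions, not the LM-HTR implementation.

```python
import numpy as np

def htr_sgd_step(x, y, P, D, w, lr=0.01, mu=0.1):
    """One SGD step on L = (w.z - y)^2 + mu * ||D z - x||^2, z = P x.

    P : (k, d) encoder mapping domain features into the latent space
    D : (d, k) decoder used for the feature-reconstruction term
    w : (k,) predictor shared by the source and target domains
    """
    z = P @ x                   # shared latent representation
    e = w @ z - y               # prediction residual
    r = D @ z - x               # reconstruction residual

    gz = 2 * e * w + 2 * mu * (D.T @ r)   # dL/dz
    w -= lr * 2 * e * z                   # dL/dw
    P -= lr * np.outer(gz, x)             # dL/dP = (dL/dz) x^T
    D -= lr * 2 * mu * np.outer(r, z)     # dL/dD
    return P, D, w
```

Alternating such steps over source and target examples (with the respective (P, D) pairs but the same w) couples the two domains through the shared latent space, which is what lets labeled data in one language inform ranking in another.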