Results 1 - 10
of
49
A Survey on Transfer Learning
"... A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task i ..."
Abstract
-
Cited by 59 (8 self)
- Add to MetaCart
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as co-variate shift. We also explore some potential future issues in transfer learning research.
Learning bounds for domain adaptation
- In Advances in Neural Information Processing Systems
, 2008
"... Empirical risk minimization offers well-known learning guarantees when training and test data come from the same domain. In the real world, though, we often wish to adapt a classifier from a source domain with a large amount of training data to different target domain with very little training data. ..."
Abstract
-
Cited by 41 (6 self)
- Add to MetaCart
Empirical risk minimization offers well-known learning guarantees when training and test data come from the same domain. In the real world, though, we often wish to adapt a classifier from a source domain with a large amount of training data to different target domain with very little training data. In this work we give uniform convergence bounds for algorithms that minimize a convex combination of source and target empirical risk. The bounds explicitly model the inherent trade-off between training on a large but inaccurate source data set and a small but accurate target training set. Our theory also gives results when we have multiple source domains, each of which may have a different number of instances, and we exhibit cases in which minimizing a non-uniform combination of source risks can achieve much lower target error than standard empirical risk minimization. 1
Knowledge transfer via multiple model local structure mapping
- In International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV
, 2008
"... The effectiveness of knowledge transfer using classification algorithms depends on the difference between the distribution that generates the training examples and the one from which test examples are to be drawn. The task can be especially difficult when the training examples are from one or severa ..."
Abstract
-
Cited by 20 (5 self)
- Add to MetaCart
The effectiveness of knowledge transfer using classification algorithms depends on the difference between the distribution that generates the training examples and the one from which test examples are to be drawn. The task can be especially difficult when the training examples are from one or several domains different from the test domain. In this paper, we propose a locally weighted ensemble framework to combine multiple models for transfer learning, where the weights are dynamically assigned according to a model’s predictive power on each test example. It can integrate the advantages of various learning algorithms and the labeled information from multiple training domains into one unified classification model, which can then be applied on a different domain. Importantly, different from many previously proposed methods, none of the base learning method is required to be specifically designed for transfer learning. We show the optimality of a locally weighted ensemble framework as a general approach to combine multiple models for domain transfer. We then propose an implementation of the local weight assignments by mapping the structures of a model onto the structures of the test domain, and then weighting each model locally according to its consistency with the neighborhood structure around the test example. Experimental results on text classification, spam filtering and intrusion detection data sets demonstrate significant improvements in classification accuracy gained by the framework. On a transfer learning task of newsgroup message categorization, the proposed locally weighted ensemble framework achieves 97 % accuracy when the best single model predicts correctly only on 73 % of the test examples. In summary, the improvement in accuracy is over 10 % and up to 30 % across different problems.
Topic-bridged PLSA for Cross-Domain Text Classification
"... In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain is expensive and time consuming, while there may be plenty of labeled data in a related but different domain. Traditional ..."
Abstract
-
Cited by 14 (2 self)
- Add to MetaCart
In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain is expensive and time consuming, while there may be plenty of labeled data in a related but different domain. Traditional text classification approaches are not able to cope well with learning across different domains. In this paper, we propose a novel cross-domain text classification algorithm which extends the traditional probabilistic latent semantic analysis (PLSA) algorithm to integrate labeled and unlabeled data, which come from different but related domains, into a unified probabilistic model. We call this new model Topic-bridged PLSA, or TPLSA. By exploiting the common topics between two domains, we transfer knowledge across different domains through a topic-bridge to help the text classification in the target domain. A unique advantage of our method is its ability to maximally mine knowledge that can be transferred between domains, resulting in superior performance when compared to other state-of-the-art text classification approaches. Experimental evaluation on different kinds of datasets shows that our proposed algorithm can improve the performance of cross-domain text classification significantly.
Actively transfer domain knowledge
- In European Conference on Machine Learning
, 2008
"... Abstract. When labeled examples are not readily available, active learning and transfer learning are separate efforts to obtain labeled examples for inductive learning. Active learning asks domain experts to label a small set of examples, but there is a cost incurred for each answer. While transfer ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
Abstract. When labeled examples are not readily available, active learning and transfer learning are separate efforts to obtain labeled examples for inductive learning. Active learning asks domain experts to label a small set of examples, but there is a cost incurred for each answer. While transfer learning could borrow labeled examples from a different domain without incurring any labeling cost, there is no guarantee that the transferred examples will actually help improve the learning accuracy. To solve both problems, we propose a framework to actively transfer the knowledge across domains, and the key intuition is to use the knowledge transferred from other domain as often as possible to help learn the current domain, and query experts only when necessary. To do so, labeled examples from the other domain (out-of-domain) are examined on the basis of their likelihood to correctly label the examples of the current domain (in-domain). When this likelihood is low, these out-of-domain examples will not be used to label the in-domain example, but domain experts are consulted to provide class label. We derive a sampling error bound and a querying bound to demonstrate that the proposed method can effectively mitigate risk of domain difference by transferring domain knowledge only when they are useful, and query domain experts only when necessary. Experimental studies have employed synthetic datasets and two types of real world datasets, including remote sensing and text classification problems. The proposed method is compared with previously proposed transfer learning and active learning methods. Across all comparisons, the proposed approach can evidently outperform the transfer learning model in classification accuracy given different out-of-domain datasets. For example, upon the remote sensing dataset, the proposed approach achieves an accuracy around 94.5%, while the comparable transfer learning model drops to less than 89 % in most cases. The software and datasets are available from the authors. 1
Spectral Domain-Transfer Learning
- KDD'08
, 2008
"... Traditional spectral classification has been proved to be effective in dealing with both labeled and unlabeled data when these data are from the same domain. In many real world applications, however, we wish to make use of the labeled data from one domain (called in-domain) to classify the unlabeled ..."
Abstract
-
Cited by 9 (3 self)
- Add to MetaCart
Traditional spectral classification has been proved to be effective in dealing with both labeled and unlabeled data when these data are from the same domain. In many real world applications, however, we wish to make use of the labeled data from one domain (called in-domain) to classify the unlabeled data in a different domain (out-of-domain). This problem often happens when obtaining labeled data in one domain is difficult while there are plenty of labeled data from a related but different domain. In general, this is a transfer learning problem where we wish to classify the unlabeled data through the labeled data even though these data are not from the same domain. In this paper, we formulate this domain-transfer learning problem under a novel spectral classification framework, where the objective function is introduced to seek consistency between the in-domain supervision and the out-of-domain intrinsic structure. Through optimization of the cost function, the label information from the in-domain data is effectively transferred to help classify the unlabeled data from the out-of-domain. We conduct extensive experiments to evaluate our method and show that our algorithm achieves significant improvements on classification performance over many state-of-the-art algorithms.
3D Laser Scan Classification Using Web Data and Domain Adaptation
"... Abstract — Over the last years, object recognition has become a more and more active field of research in robotics. An important problem in object recognition is the need for sufficient labeled training data to learn good classifiers. In this paper we show how to significantly reduce the need for ma ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
Abstract — Over the last years, object recognition has become a more and more active field of research in robotics. An important problem in object recognition is the need for sufficient labeled training data to learn good classifiers. In this paper we show how to significantly reduce the need for manually labeled training data by leveraging data sets available on the World Wide Web. Specifically, we show how to use objects from Google’s 3D Warehouse to train classifiers for 3D laser scans collected by a robot navigating through urban environments. In order to deal with the different characteristics of the web data and the real robot data, we additionally use a small set of labeled 3D laser scans and perform domain adaptation. Our experiments demonstrate that additional data taken from the 3D Warehouse along with our domain adaptation greatly improves the classification accuracy on real laser scans. I.
Sub-kilometer crater discovery with boosting and transfer learning
- ACM TIST
"... Counting craters in remotely sensed images is the only tool that provides relative dating of remote planetary surfaces. Surveying craters requires counting a large amount of small subkilometer craters, which calls for highly efficient automatic crater detection. In this paper, we present an integrat ..."
Abstract
-
Cited by 5 (3 self)
- Add to MetaCart
Counting craters in remotely sensed images is the only tool that provides relative dating of remote planetary surfaces. Surveying craters requires counting a large amount of small subkilometer craters, which calls for highly efficient automatic crater detection. In this paper, we present an integrated framework on auto-detection of sub-kilometer craters with boosting and transfer learning. The framework contains three key components. First, we utilize mathematical morphology to efficiently identify crater candidates, the regions of an image that can potentially contain craters. Only those regions, occupying relatively small portions of the original image, are the subjects of further processing. Second, we extract and select image texture features, in combination with supervised boosting ensemble learning algorithms, to accurately classify crater candidates into craters and non-craters. Third, we integrate transfer learning into boosting, to enhance detection performance in the regions where surface morphology differs from what is characterized by the training set. Our framework is evaluated on a large test image of 37, 500 × 56, 250 m2 on Mars, which exhibits a heavily cratered Martian terrain characterized by nonuniform surface morphology. Empirical studies demonstrate that the proposed crater detection framework can achieve an F1 score above 0.85, a significant improvement over the other crater detection algorithms.
Transfer Learning
"... Abstract. Transfer learning is the improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned. While most machine learning algorithms are designed to address single tasks, the development of algorithms that facilitate transfer learning i ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Abstract. Transfer learning is the improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned. While most machine learning algorithms are designed to address single tasks, the development of algorithms that facilitate transfer learning is a topic of ongoing interest in the machine-learning community. This chapter provides an introduction to the goals, formulations, and challenges of transfer learning. It surveys current research in this area, giving an overview of the state of the art and outlining the open problems. The survey covers transfer in both inductive learning and reinforcement learning, and discusses the issues of negative transfer and task mapping in depth.
A rich feature vector for protein-protein interaction extraction from multiple corpora
- In: EMNLP
, 2009
"... Because of the importance of proteinprotein interaction (PPI) extraction from text, many corpora have been proposed with slightly differing definitions of proteins and PPI. Since no single corpus is large enough to saturate a machine learning system, it is necessary to learn from multiple different ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Because of the importance of proteinprotein interaction (PPI) extraction from text, many corpora have been proposed with slightly differing definitions of proteins and PPI. Since no single corpus is large enough to saturate a machine learning system, it is necessary to learn from multiple different corpora. In this paper, we propose a solution to this challenge. We designed a rich feature vector, and we applied a support vector machine modified for corpus weighting (SVM-CW) to complete the task of multiple corpora PPI extraction. The rich feature vector, made from multiple useful kernels, is used to express the important information for PPI extraction, and the system with our feature vector was shown to be both faster and more accurate than the original kernelbased system, even when using just a single corpus. SVM-CW learns from one corpus, while using other corpora for support. SVM-CW is simple, but it is more effective than other methods that have been successfully applied to other NLP tasks earlier. With the feature vector and SVM-CW, our system achieved the best performance among all state-of-the-art PPI extraction systems reported so far. 1

