A comparative study of methods for transductive transfer learning
In ICDM Workshop on Mining and Management of Biological Data, 2007
Cited by 13 (3 self)
The problem of transfer learning, where information gained in one learning task is used to improve performance in another related task, is an important new area of research. In this paper we address the subproblem of domain adaptation, in which a model trained over a source domain is generalized to perform well on a related target domain, where these two domains' data are distributed similarly, but not identically. Previous work has studied the supervised version of this problem in which labeled data from both source and target domains are available for training. In this work, however, we study the more challenging problem of unsupervised transductive transfer learning, where no labeled data from the target domain are available at training time, but instead, unlabeled target test data are available during training. We describe some current state-of-the-art inductive and transductive approaches involving three popular learning models, namely the maximum entropy, support vector machines and naive Bayes models. We then adapt these models to the problem of transfer learning for protein name extraction. In the process, we introduce a novel maximum entropy based technique, Iterative Feature Transformation (IFT), and show that it achieves comparable performance with state-of-the-art transductive SVMs. Finally, we compare the relative strengths and weaknesses of these models across the various learning settings, shedding light both on the algorithms examined and the difficulty of the respective problems. In addition, we show how simple relaxations, such as providing additional information like the proportion of positive examples in the test data, can significantly improve the performance of some of the transductive transfer learners.
Information Extraction as Link Prediction: Using Curated Citation Networks to Improve Gene Detection
Cited by 5 (4 self)
In this paper we explore the usefulness of various types of publication-related metadata, such as citation networks and curated databases, for the task of identifying genes in academic biomedical publications. Specifically, we examine whether knowing something about which genes an author has previously written about, combined with information about previous coauthors and citations, can help us predict which new genes the author is likely to write about in the future. Framed in this way, the problem becomes one of predicting links between authors and genes in the publication network. We show that this solely social-network based link prediction technique outperforms various baselines, including those relying only on non-social biological information.
A Comparison of Methods for Transductive Transfer Learning
In this paper we examine the problem of domain adaptation for protein name extraction. First we define the general problem of transfer learning and the particular subproblem of domain adaptation. We then describe some current state-of-the-art supervised and transductive approaches involving support vector machines and maximum entropy models. Using these as inspiration, we turn to the unsupervised version of the problem and introduce a novel maximum entropy based technique, pseudo-label based rescaling (PLR), that achieves comparable performance with no labeled target data. We present the results of experimental comparisons between all the methods described and conclude with a discussion of trends observed and promising routes for future work.

