Results 1  10
of
139
A framework for learning predictive structures from multiple tasks and unlabeled data
 Journal of Machine Learning Research
, 2005
"... One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semisupervised learning. Although a number of such methods ar ..."
Abstract

Cited by 320 (3 self)
 Add to MetaCart
One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semisupervised learning. Although a number of such methods are proposed, at the current stage, we still don’t have a complete understanding of their effectiveness. This paper investigates a closely related problem, which leads to a novel approach to semisupervised learning. Specifically we consider learning predictive structures on hypothesis spaces (that is, what kind of classifiers have good predictive power) from multiple learning tasks. We present a general framework in which the structural learning problem can be formulated and analyzed theoretically, and relate it to learning with unlabeled data. Under this framework, algorithms for structural learning will be proposed, and computational issues will be investigated. Experiments will be given to demonstrate the effectiveness of the proposed algorithms in the semisupervised learning setting. 1.
A Survey on Transfer Learning
"... A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many realworld applications, this assumption may not hold. For example, we sometimes have a classification task i ..."
Abstract

Cited by 181 (19 self)
 Add to MetaCart
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many realworld applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.
Learning Multiple Tasks with Kernel Methods
 Journal of Machine Learning Research
, 2005
"... Editor: John ShaweTaylor We study the problem of learning many related tasks simultaneously using kernel methods and regularization. The standard singletask kernel methods, such as support vector machines and regularization networks, are extended to the case of multitask learning. Our analysis sh ..."
Abstract

Cited by 157 (10 self)
 Add to MetaCart
Editor: John ShaweTaylor We study the problem of learning many related tasks simultaneously using kernel methods and regularization. The standard singletask kernel methods, such as support vector machines and regularization networks, are extended to the case of multitask learning. Our analysis shows that the problem of estimating many task functions with regularization can be cast as a single task learning problem if a family of multitask kernel functions we define is used. These kernels model relations among the tasks and are derived from a novel form of regularizers. Specific kernels that can be used for multitask learning are provided and experimentally tested on two real data sets. In agreement with past empirical work on multitask learning, the experiments show that learning multiple related tasks simultaneously using the proposed approach can significantly outperform standard singletask learning particularly when there are many related tasks but few data per task.
MultiTask Learning for HIV Therapy Screening
"... We address the problem of learning classifiers for a large number of tasks. We derive a solution that produces resampling weights which match the pool of all examples to the target distribution of any given task. Our work is motivated by the problem of predicting the outcome of a therapy attempt for ..."
Abstract

Cited by 36 (4 self)
 Add to MetaCart
We address the problem of learning classifiers for a large number of tasks. We derive a solution that produces resampling weights which match the pool of all examples to the target distribution of any given task. Our work is motivated by the problem of predicting the outcome of a therapy attempt for a patient who carries an HIV virus with a set of observed genetic properties. Such predictions need to be made for hundreds of possible combinations of drugs, some of which use similar biochemical mechanisms. Multitask learning enables us to make predictions even for drug combinations with few or no training examples and substantially improves the overall prediction accuracy. 1.
MultiTask Feature Learning Via Efficient L2,1Norm Minimization
, 2009
"... The problem of joint feature selection across a group of related tasks has applications in many areas including biomedical informatics and computer vision. We consider the ℓ2,1norm regularized regression model for joint feature selection from multiple tasks, which can be derived in the probabilisti ..."
Abstract

Cited by 29 (3 self)
 Add to MetaCart
The problem of joint feature selection across a group of related tasks has applications in many areas including biomedical informatics and computer vision. We consider the ℓ2,1norm regularized regression model for joint feature selection from multiple tasks, which can be derived in the probabilistic framework by assuming a suitable prior from the exponential family. One appealing feature of the ℓ2,1norm regularization is that it encourages multiple predictors to share similar sparsity patterns. However, the resulting optimization problem is challenging to solve due to the nonsmoothness of the ℓ2,1norm regularization. In this paper, we propose to accelerate the computation by reformulating it as two equivalent smooth convex optimization problems which are then solved via the Nesterov’s method—an optimal firstorder blackbox method for smooth convex optimization. A key building block in solving the reformulations is the Euclidean projection. We show that the Euclidean projection for the first reformulation can be analytically computed, while the Euclidean projection for the second one can be computed in linear time. Empirical evaluations on several data sets verify the efficiency of the proposed algorithms.
SemiSupervised Multitask Learning
"... A semisupervised multitask learning (MTL) framework is presented, in which M parameterized semisupervised classifiers, each associated with one of M partially labeled data manifolds, are learned jointly under the constraint of a softsharing prior imposed over the parameters of the classifiers. The ..."
Abstract

Cited by 23 (5 self)
 Add to MetaCart
A semisupervised multitask learning (MTL) framework is presented, in which M parameterized semisupervised classifiers, each associated with one of M partially labeled data manifolds, are learned jointly under the constraint of a softsharing prior imposed over the parameters of the classifiers. The unlabeled data are utilized by basing classifier learning on neighborhoods, induced by a Markov random walk over a graph representation of each manifold. Experimental results on real data sets demonstrate that semisupervised MTL yields significant improvements in generalization performance over either semisupervised singletask learning (STL) or supervised MTL. 1
Bounds for linear multitask learning
 Journal of Machine Learning Research
, 2006
"... Abstract. We give dimensionfree and datadependent bounds for linear multitask learning where a common linear operator is chosen to preprocess data for a vector of task speci…c linearthresholding classi…ers. The complexity penalty of multitask learning is bounded by a simple expression involvin ..."
Abstract

Cited by 23 (2 self)
 Add to MetaCart
Abstract. We give dimensionfree and datadependent bounds for linear multitask learning where a common linear operator is chosen to preprocess data for a vector of task speci…c linearthresholding classi…ers. The complexity penalty of multitask learning is bounded by a simple expression involving the margins of the taskspeci…c classi…ers, the HilbertSchmidt norm of the selected preprocessor and the HilbertSchmidt norm of the covariance operator for the total mixture of all task distributions, or, alternatively, the Frobenius norm of the total Gramian matrix for the datadependent version. The results can be compared to stateoftheart results on linear singletask learning. 1
Transfer Learning in Sign language
"... We build word models for American Sign Language (ASL) that transfer between different signers and different aspects. This is advantageous because one could use large amounts of labelled avatar data in combination with a smaller amount of labelled human data to spot a large number of words in human d ..."
Abstract

Cited by 21 (2 self)
 Add to MetaCart
We build word models for American Sign Language (ASL) that transfer between different signers and different aspects. This is advantageous because one could use large amounts of labelled avatar data in combination with a smaller amount of labelled human data to spot a large number of words in human data. Transfer learning is possible because we represent blocks of video with novel intermediate discriminative features based on splits of the data. By constructing the same splits in avatar and human data and clustering appropriately, our features are both discriminative and semantically similar: across signers similar features imply similar words. We demonstrate transfer learning in two scenarios: from avatar to a frontally viewed human signer and from an avatar to human signer in a 3/4 view. 1.
Learning to Share Visual Appearance for Multiclass Object Detection
"... We present a hierarchical classification model that allows rare objects to borrow statistical strength from related objects that have many training examples. Unlike many of the existing object detection and recognition systems that treat different classes as unrelated entities, our model learns both ..."
Abstract

Cited by 21 (1 self)
 Add to MetaCart
We present a hierarchical classification model that allows rare objects to borrow statistical strength from related objects that have many training examples. Unlike many of the existing object detection and recognition systems that treat different classes as unrelated entities, our model learns both a hierarchy for sharing visual appearance across 200 object categories and hierarchical parameters. Our experimental results on the challenging object localization and detection task demonstrate that the proposed model substantially improves the accuracy of the standard single object detectors that ignore hierarchical structure altogether. 1.
A machine learning approach to conjoint analysis
 Neural Information Processing Systems 17
, 2005
"... Choicebased conjoint analysis builds models of consumer preferences over products with answers gathered in questionnaires. Our main goal is to bring tools from the machine learning community to solve this problem more efficiently. Thus, we propose two algorithms to quickly and accurately estimate c ..."
Abstract

Cited by 20 (1 self)
 Add to MetaCart
Choicebased conjoint analysis builds models of consumer preferences over products with answers gathered in questionnaires. Our main goal is to bring tools from the machine learning community to solve this problem more efficiently. Thus, we propose two algorithms to quickly and accurately estimate consumer preferences. 1