A framework for learning predictive structures from multiple tasks and unlabeled data
- Journal of Machine Learning Research, 2005
Cited by 202 (2 self)
One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semi-supervised learning. Although a number of such methods have been proposed, we still do not have a complete understanding of their effectiveness. This paper investigates a closely related problem, which leads to a novel approach to semi-supervised learning. Specifically, we consider learning predictive structures on hypothesis spaces (that is, what kind of classifiers have good predictive power) from multiple learning tasks. We present a general framework in which the structural learning problem can be formulated and analyzed theoretically, and relate it to learning with unlabeled data. Under this framework, algorithms for structural learning are proposed and computational issues are investigated. Experiments demonstrate the effectiveness of the proposed algorithms in the semi-supervised learning setting.
Learning Multiple Tasks with Kernel Methods
- Journal of Machine Learning Research, 2005
Cited by 96 (5 self)
Editor: John Shawe-Taylor
We study the problem of learning many related tasks simultaneously using kernel methods and regularization. The standard single-task kernel methods, such as support vector machines and regularization networks, are extended to the case of multi-task learning. Our analysis shows that the problem of estimating many task functions with regularization can be cast as a single-task learning problem if a family of multi-task kernel functions we define is used. These kernels model relations among the tasks and are derived from a novel form of regularizers. Specific kernels that can be used for multi-task learning are provided and experimentally tested on two real data sets. In agreement with past empirical work on multi-task learning, the experiments show that learning multiple related tasks simultaneously using the proposed approach can significantly outperform standard single-task learning, particularly when there are many related tasks but few data per task.
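The idea of casting multi-task learning as a single-task problem through a kernel on (input, task) pairs can be illustrated with a minimal sketch. The kernel below is an assumption chosen for illustration, not necessarily one of the kernels defined in the paper: it multiplies a plain linear kernel by a task-similarity factor, and the names `multitask_kernel` and `coupling` are hypothetical.

```python
def linear_kernel(x, y):
    """Ordinary dot product between two input vectors."""
    return sum(a * b for a, b in zip(x, y))


def multitask_kernel(x, s, y, t, coupling=0.5):
    """Kernel on (input, task) pairs (illustrative assumption).

    Pairs from the same task get full weight; pairs from different
    tasks are down-weighted by `coupling`, assumed to lie in [0, 1].
    """
    task_similarity = 1.0 if s == t else coupling
    return task_similarity * linear_kernel(x, y)


# Same inputs, compared within one task and across two tasks.
same_task = multitask_kernel([1.0, 2.0], 0, [3.0, 4.0], 0)
cross_task = multitask_kernel([1.0, 2.0], 0, [3.0, 4.0], 1)
```

In this sketch, `coupling=0` treats the tasks as unrelated (each is learned separately), while `coupling=1` pools all tasks into a single one; values in between share information partially. Any standard kernel method can then be run on the pooled (input, task) data.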
A Survey on Transfer Learning
Cited by 58 (8 self)
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data-labeling effort. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multi-task learning, sample selection bias and covariate shift. We also explore some potential future issues in transfer learning research.
Bounds for linear multi-task learning
- Journal of Machine Learning Research, 2006
Cited by 15 (2 self)
We give dimension-free and data-dependent bounds for linear multi-task learning where a common linear operator is chosen to preprocess data for a vector of task-specific linear-thresholding classifiers. The complexity penalty of multi-task learning is bounded by a simple expression involving the margins of the task-specific classifiers, the Hilbert-Schmidt norm of the selected preprocessor and the Hilbert-Schmidt norm of the covariance operator for the total mixture of all task distributions, or, alternatively, the Frobenius norm of the total Gramian matrix for the data-dependent version. The results can be compared to state-of-the-art results on linear single-task learning.
Multi-Task Learning for HIV Therapy Screening
Cited by 15 (2 self)
We address the problem of learning classifiers for a large number of tasks. We derive a solution that produces resampling weights which match the pool of all examples to the target distribution of any given task. Our work is motivated by the problem of predicting the outcome of a therapy attempt for a patient who carries an HIV virus with a set of observed genetic properties. Such predictions need to be made for hundreds of possible combinations of drugs, some of which use similar biochemical mechanisms. Multi-task learning enables us to make predictions even for drug combinations with few or no training examples and substantially improves the overall prediction accuracy.
Supervised probabilistic principal component analysis
- In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), Lyle Ungar, Mark Craven, Dimitrios Gunopulos, and Tina Eliassi-Rad, Eds., 2006
Cited by 13 (0 self)
Principal component analysis (PCA) has been extensively applied in data mining, pattern recognition and information retrieval for unsupervised dimensionality reduction. When labels of data are available, e.g., in a classification or regression task, PCA is, however, not able to use this information. The problem is more interesting if only part of the input data is labeled, i.e., in a semi-supervised setting. In this paper we propose a supervised PCA model called SPPCA and a semi-supervised PCA model called S²PPCA, both of which are extensions of a probabilistic PCA model. The proposed models are able to incorporate the label information into the projection phase, and can naturally handle multiple outputs (i.e., in multi-task learning problems). We derive an efficient EM learning algorithm for both models, and also provide theoretical justifications of the model behaviors. SPPCA and S²PPCA are compared with other supervised projection methods on various learning tasks, and show not only promising performance but also good scalability.
A review of recent research in metareasoning and metalearning
- AI Magazine, 2007
Cited by 12 (3 self)
Recent years have seen a resurgence of interest in the use of metacognition in intelligent systems. This essay is part of a small section meant to give interested researchers an overview and sampling of the kinds of work currently being pursued in this broad area. The current essay offers a review of recent research in two main topic areas: the monitoring and control of reasoning (metareasoning) and the monitoring and control of learning (metalearning). What is metacognition in computation? Rosie (the robot maid from the TV show The Jetsons) spends her days cooking, cleaning, ironing, and attending to the usual household tasks of late 21st-century life. Because of a bug in one of her memory chips, however, she almost always forgets to buy dog food when she goes out. She has an adequate recovery plan for this: she simply feeds Astro some of the Jetsons' dinner. But 21st-century human food is expensive, so this strategy is wasteful. Realizing this, and recognizing that she has forgotten several times, Rosie adopts a special strategy to help her remember: she sticks the spare dog collar in her
Multi-Task Feature Learning Via Efficient L2,1-Norm Minimization
- 2009
Cited by 11 (1 self)
The problem of joint feature selection across a group of related tasks has applications in many areas including biomedical informatics and computer vision. We consider the ℓ2,1-norm regularized regression model for joint feature selection from multiple tasks, which can be derived in the probabilistic framework by assuming a suitable prior from the exponential family. One appealing feature of the ℓ2,1-norm regularization is that it encourages multiple predictors to share similar sparsity patterns. However, the resulting optimization problem is challenging to solve due to the non-smoothness of the ℓ2,1-norm regularization. In this paper, we propose to accelerate the computation by reformulating it as two equivalent smooth convex optimization problems, which are then solved via Nesterov's method, an optimal first-order black-box method for smooth convex optimization. A key building block in solving the reformulations is the Euclidean projection. We show that the Euclidean projection for the first reformulation can be analytically computed, while the Euclidean projection for the second one can be computed in linear time. Empirical evaluations on several data sets verify the efficiency of the proposed algorithms.
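To make the shared-sparsity effect of the ℓ2,1 norm concrete, here is a minimal sketch. It assumes the weight matrix stores one row per feature and one column per task (a layout chosen for illustration), and it shows row-wise soft-thresholding, the standard proximal step for the ℓ2,1 norm. This is a related textbook operation, not the smooth reformulations, Euclidean projections, or Nesterov-based solver described in the abstract; the function names are hypothetical.

```python
import math


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W.

    Assumed layout: one row per feature, one column per task.
    """
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def row_soft_threshold(W, tau):
    """Shrink each row's Euclidean norm by tau.

    Rows whose norm is at most tau become all-zero, which removes
    that feature from every task at once.
    """
    out = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * w for w in row])
    return out


# Three features, two tasks: one strong feature, one weak, one unused.
W = [[3.0, 4.0], [0.1, 0.0], [0.0, 0.0]]
penalty = l21_norm(W)
shrunk = row_soft_threshold(W, 0.5)
```

Because the threshold acts on whole rows, the weak second feature is zeroed for both tasks together while the strong first feature is only scaled down, which is the sense in which the penalty encourages predictors to share a sparsity pattern.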
A machine learning approach to conjoint analysis
- Neural Information Processing Systems 17, 2005
Cited by 11 (0 self)
Choice-based conjoint analysis builds models of consumer preferences over products with answers gathered in questionnaires. Our main goal is to bring tools from the machine learning community to solve this problem more efficiently. Thus, we propose two algorithms to quickly and accurately estimate consumer preferences.
Transfer Learning in Sign Language
Cited by 10 (0 self)
We build word models for American Sign Language (ASL) that transfer between different signers and different aspects. This is advantageous because one could use large amounts of labelled avatar data in combination with a smaller amount of labelled human data to spot a large number of words in human data. Transfer learning is possible because we represent blocks of video with novel intermediate discriminative features based on splits of the data. By constructing the same splits in avatar and human data and clustering appropriately, our features are both discriminative and semantically similar: across signers, similar features imply similar words. We demonstrate transfer learning in two scenarios: from an avatar to a frontally viewed human signer and from an avatar to a human signer in a 3/4 view.

