Semi-Supervised Learning Literature Survey
, 2006
Cited by 268 (7 self)
We review the literature on semi-supervised learning, which is an area in machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions
, 2009
Cited by 69 (17 self)
Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multicue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches for solving the problem of machine understanding of human affective behavior and discuss important issues like the collection and availability of training and test data. We finally outline some of the scientific and engineering challenges to advancing human affect sensing technology.
Semi-supervised regression with co-training style algorithms
, 2007
Cited by 19 (4 self)
The traditional setting of supervised learning requires a large amount of labeled training examples in order to achieve good generalization. However, in many practical applications, unlabeled training examples are readily available but labeled ones are fairly expensive to obtain. Therefore, semi-supervised learning has attracted much attention. Previous research on semi-supervised learning mainly focuses on semi-supervised classification. Although regression is almost as important as classification, semi-supervised regression is largely understudied. In particular, although co-training is a main paradigm in semi-supervised learning, few works have been devoted to co-training style semi-supervised regression algorithms. In this paper, a co-training style semi-supervised regression algorithm, i.e. COREG, is proposed. This algorithm uses two regressors, each of which labels the unlabeled data for the other regressor, where the confidence in labeling an unlabeled example is estimated through the amount of reduction in mean square error over the labeled neighborhood of that example. Analysis and experiments show that COREG can effectively exploit unlabeled data to improve regression estimates.
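The labeling scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two k-nearest-neighbour regressors that differ only in Minkowski distance order, and the function names (`knn_predict`, `confidence`, `coreg_sketch`) and the default parameters are hypothetical choices made for illustration. Each regressor picks the unlabeled example whose pseudo-label most reduces squared error on that example's labeled neighbours, and hands it to the other regressor.

```python
import numpy as np

def knn_predict(X, y, x, k, p):
    """Mean target of the k nearest neighbours of x under Minkowski order p."""
    dist = np.sum(np.abs(X - x) ** p, axis=1) ** (1.0 / p)
    nbrs = np.argsort(dist)[:k]
    return float(y[nbrs].mean()), nbrs

def confidence(X, y, x_u, k, p):
    """Drop in squared error on x_u's labeled neighbours if (x_u, y_u) is added."""
    y_u, nbrs = knn_predict(X, y, x_u, k, p)
    X_aug, y_aug = np.vstack([X, x_u]), np.append(y, y_u)
    delta = 0.0
    for i in nbrs:
        before, _ = knn_predict(X, y, X[i], k, p)
        after, _ = knn_predict(X_aug, y_aug, X[i], k, p)
        delta += (y[i] - before) ** 2 - (y[i] - after) ** 2
    return delta, y_u

def coreg_sketch(X_lab, y_lab, X_unl, k=3, orders=(2, 5), rounds=10):
    """Two regressors (assumed: kNN with different orders) label data for each other."""
    sets = [(X_lab.copy(), y_lab.copy()) for _ in orders]
    pool = set(range(len(X_unl)))
    for _ in range(rounds):
        picks = []
        for j, p in enumerate(orders):
            X_j, y_j = sets[j]
            best = None
            for u in sorted(pool):
                delta, y_u = confidence(X_j, y_j, X_unl[u], k, p)
                # Only examples that reduce the error are candidates.
                if delta > 0 and (best is None or delta > best[0]):
                    best = (delta, u, y_u)
            if best is not None:
                # The pseudo-labeled example goes to the OTHER regressor.
                picks.append((1 - j, best[1], best[2]))
        if not picks:
            break
        for target, u, y_u in picks:
            X_t, y_t = sets[target]
            sets[target] = (np.vstack([X_t, X_unl[u]]), np.append(y_t, y_u))
            pool.discard(u)

    def predict(x):
        # Final estimate: average of the two regressors' predictions.
        preds = [knn_predict(X_s, y_s, x, k, p)[0]
                 for (X_s, y_s), p in zip(sets, orders)]
        return float(np.mean(preds))

    return predict
```

Calling `coreg_sketch(X_lab, y_lab, X_unl)` returns a `predict` function that averages the two regressors. The stopping rule (no example reduces the error, or a fixed number of rounds) is likewise an assumption of this sketch.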
Semi-supervised learning with very few labeled training examples
- Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)
, 2007
Cited by 7 (1 self)
In semi-supervised learning, a number of labeled examples are usually required for training an initial weakly useful predictor which is in turn used for exploiting the unlabeled examples. However, in many real-world applications there may exist very few labeled training examples, which makes the weakly useful predictor difficult to generate, and therefore these semi-supervised learning methods cannot be applied. This paper proposes a method working under a two-view setting. By taking advantage of the correlations between the views using canonical correlation analysis, the proposed method can perform semi-supervised learning with only one labeled training example. Experiments and an application to content-based image retrieval validate the effectiveness of the proposed method.
Improve Computer-Aided Diagnosis with Machine Learning Techniques Using Undiagnosed Samples
Cited by 4 (2 self)
In computer-aided diagnosis, machine learning techniques have been widely applied to learn hypotheses from diagnosed samples in order to assist medical experts in making diagnoses. To learn a well-performing hypothesis, a large number of diagnosed samples is required. Although the samples can be easily collected from routine medical examinations, it is usually impossible for the medical experts to make a diagnosis for each of the collected samples. If a hypothesis could be learned in the presence of a large number of undiagnosed samples, the heavy burden on the medical experts could be relieved. In this paper, a new semi-supervised learning algorithm named Co-Forest is proposed. It extends the co-training paradigm by using a well-known ensemble method named Random Forest, which enables Co-Forest to estimate the labeling confidence of undiagnosed samples and produce the final hypothesis easily. Experiments on benchmark data sets verify the effectiveness of the proposed algorithm. Case studies on three medical data sets and a successful application to microcalcification detection for breast cancer diagnosis show that undiagnosed samples are helpful in building computer-aided diagnosis systems, and that Co-Forest is able to enhance the performance of a hypothesis learned on only a small number of diagnosed samples by utilizing the available undiagnosed samples.
On Multi-View Active Learning and the Combination with Semi-Supervised Learning
Cited by 2 (1 self)
Multi-view learning has become a hot topic during the past few years. In this paper, we first characterize the sample complexity of multi-view active learning. Under the α-expansion assumption, we get an exponential improvement in the sample complexity from usual Õ( ...
Learning with Unlabeled Data and Its Application to Image Retrieval
Cited by 1 (0 self)
In many practical machine learning or data mining applications, unlabeled training examples are readily available but labeled ones are fairly expensive to obtain because labeling the examples requires human effort. So, learning with unlabeled data has attracted much attention during the past few years. This paper shows how such techniques can be helpful in a difficult task, content-based image retrieval, for improving the retrieval performance by exploiting images existing in the database.
1 Learning with Unlabeled Data
In the traditional setting of supervised learning, a large amount of training examples should be available for building a model with good generalization ability. It is noteworthy that these training examples should be labeled, that is, their ground-truth labels are known to the learner. Unfortunately, in many practical machine learning or data mining applications such as web page classification, although a large number of unlabeled training examples can be easily ...
Top Ontology B
User interaction is an important factor that affects the success of an ontology matching system but receives little consideration. We study the effect of user interaction on the performance of the matching system through an adaptive machine learning framework. Experimental results show that user interaction can help to improve the matching system’s performance, with little manual annotation cost.
When Does Co-Training Work in Real Data?
, 2009
Co-training, a paradigm of semi-supervised learning, promises to effectively alleviate the shortage of labeled examples in supervised learning. The standard two-view co-training requires the dataset to be described by two views of features, and previous studies have shown that co-training works well if the two views satisfy the sufficiency and independence assumptions. In practice, however, these two assumptions are often not known or ensured (even when the two views are given). More commonly, most supervised datasets are described by one set of attributes (one view). Thus, they need to be split into two views in order to apply the standard two-view co-training. In this paper, we first propose a novel approach to empirically verify the two assumptions of co-training given two views. Then, we design several methods to split single-view datasets into two views, in order to make co-training work reliably well. Our empirical results show that, given a whole or a large labeled training set, our view verification and splitting methods are quite effective. Unfortunately, co-training is called for precisely when the labeled training set is small. However, given small labeled training sets, we show that the two co-training assumptions are difficult to verify, and view splitting is unreliable. Our conclusions for co-training’s effectiveness are mixed. If two views are given, and known to satisfy the two assumptions, co-training works well. Otherwise, based on small labeled training sets, verifying the assumptions or splitting a single view into two views is unreliable, so it is uncertain whether standard co-training will work.
Relevance Feature Mapping for Content-Based Image Retrieval
This paper presents a ranking framework for content-based image retrieval using relevance feature mapping. Each relevance feature measures the relevance of an image to some profile underlying the image database. The framework is a two-stage process. In the off-line modeling stage, it constructs a collection of models that map all images in the database to the relevance feature space. In the on-line retrieval stage, it assigns a weight to every relevance feature based on the query image, and then ranks images in the database according to their weighted average feature values. The framework also incorporates relevance feedback, which modifies the ranking based on the feedback through reweighted features. We show that the power of the proposed framework comes from the relevance features. Experiments on a large image database validate the efficacy and efficiency of the proposed framework.
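The on-line ranking step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the weights are derived from the query image, so they are taken here as given, and the function name `rank_images` and the array layout (one row per image, one column per relevance feature) are hypothetical.

```python
import numpy as np

def rank_images(features, weights):
    """Rank images by the weighted average of their relevance feature values.

    features: array of shape (n_images, n_features), assumed precomputed off-line.
    weights:  array of shape (n_features,), assumed derived from the query image.
    Returns (order, scores), where order lists image indices from best to worst.
    """
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    scores = features @ weights / weights.sum()
    # Stable sort on negated scores gives descending order with ties kept in index order.
    order = np.argsort(-scores, kind="stable")
    return order, scores
```

Under this reading, relevance feedback would amount to recomputing `weights` and calling `rank_images` again; the feature matrix itself stays fixed.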

