Soft Margins for AdaBoost
, 1998
Cited by 199 (22 self)
Recently ensemble methods like AdaBoost were successfully applied to character recognition tasks, seemingly defying the problems of overfitting. This paper shows that although AdaBoost rarely overfits in the low noise regime, it clearly does so for higher noise levels. Central to understanding this fact is the margin distribution: we find that AdaBoost, which performs gradient descent on an error function with respect to the margin, asymptotically achieves a hard margin distribution, i.e. the algorithm concentrates its resources on a few hard-to-learn patterns (here an interesting overlap with Support Vectors emerges). This is clearly a sub-optimal strategy in the noisy case, and regularization, i.e. a mistrust in the data, must be introduced in the algorithm to alleviate the distortions that a difficult pattern (e.g. an outlier) can cause to the margin distribution. We propose several regularization methods and generalizations of the original AdaBoost algorithm to achieve a soft margin -- a ...
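A minimal sketch of the margin quantities the abstract above refers to. It assumes binary labels in {-1, +1}, base hypotheses that output -1 or +1, and the standard AdaBoost weighting in which a pattern's weight is proportional to exp(-y f(x)). The function names and the toy numbers are illustrative and are not taken from the paper.

```python
import math

def combined_outputs(alphas, predictions):
    # predictions[t][i] is the output (-1 or +1) of hypothesis t on pattern i.
    n = len(predictions[0])
    return [sum(a * h[i] for a, h in zip(alphas, predictions)) for i in range(n)]

def normalized_margins(alphas, predictions, labels):
    # Margin of pattern i: y_i * sum_t alpha_t h_t(x_i) / sum_t alpha_t, in [-1, 1].
    total = sum(alphas)
    outputs = combined_outputs(alphas, predictions)
    return [y * f / total for y, f in zip(labels, outputs)]

def pattern_weights(alphas, predictions, labels):
    # AdaBoost-style weights: proportional to exp(-y_i f(x_i)), normalized to sum to 1.
    outputs = combined_outputs(alphas, predictions)
    raw = [math.exp(-y * f) for y, f in zip(labels, outputs)]
    z = sum(raw)
    return [r / z for r in raw]

# Toy ensemble: three hypotheses, two patterns, both with label +1.
alphas = [1.0, 1.0, 2.0]
predictions = [[1, -1], [1, 1], [-1, 1]]
labels = [1, 1]
margins = normalized_margins(alphas, predictions, labels)
weights = pattern_weights(alphas, predictions, labels)
```

In this toy case the first pattern has the smaller margin and therefore receives most of the weight, which is the concentration on hard-to-learn patterns that the abstract describes.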
Weighted-support vector machines for predicting membrane protein types based on pseudo-amino acid composition. Protein
- Sel
, 2004
Cited by 9 (1 self)
Membrane proteins are generally classified into the following five types: (1) type I membrane protein, (2) type II membrane protein, (3) multipass transmembrane proteins, (4) lipid chain-anchored membrane proteins, and (5) GPI-anchored membrane proteins. Prediction of membrane protein types has become one of the growing hot topics in bioinformatics. Currently, we are facing two critical challenges in this area. One is how to take into account the extremely complicated sequence-order effects; the other is how to deal with the highly uneven sizes of the subsets in a training dataset. In this paper, stimulated by the concept of using the pseudo-amino-acid composition (Chou, K.C.: PROTEINS: Structure, Function, and Genetics, 43: 246-255, 2001; ibid. 2001, 44, 60) to incorporate the sequence-order effects, the spectral analysis technique is introduced to represent the statistical sample of a protein. Based on such a framework, the weighted support vector machine (SVM) algorithm is applied. The new approach has a remarkable power in dealing with the bias caused by the situation when one subset in the training dataset
Image Replica Detection based on Support Vector Classifier
- IN PROC. SPIE APPLICATIONS OF DIGITAL IMAGE PROCESSING XXVIII
, 2005
Cited by 5 (2 self)
In this paper, we propose a technique for image replica detection. By replica, we mean equivalent versions of a given reference image, e.g. after it has undergone operations such as compression, filtering or resizing. Applications of this technique include discovery of copyright infringement or detection of illicit content. The technique
A Linear Classification Model Based on Conditional Geometric Score
- Pacific Journal of Optimization
Cited by 2 (2 self)
Abstract. We propose a two-class linear classification model by taking into account the Euclidean distance from each data point to the discriminant hyperplane and introducing a risk measure which is known as the conditional value-at-risk in financial risk management. It is formulated as a nonconvex programming problem and we present a solution method for obtaining either a globally or a locally optimal solution by examining the special structure of the problem. Also, this model is proved to be equivalent to the ν-support vector classification under some parameter setting, and numerical experiments show that the proposed model has better predictive accuracy in general. Key words. classification model, discriminant hyperplane, conditional value-at-risk, nonconvex programming.
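A rough sketch of the two ingredients named in the abstract above: the signed Euclidean distance of each point to a hyperplane, and the conditional value-at-risk (CVaR) of the resulting losses. The abstract does not give the exact formulation, so this assumes the loss of a point is its negated signed distance and uses the Rockafellar-Uryasev minimization formula for the empirical CVaR at level beta. All names are hypothetical.

```python
import math

def geometric_losses(w, b, points, labels):
    # Loss of point i: -y_i (w . x_i + b) / ||w||, so misclassified points have positive loss.
    norm = math.sqrt(sum(wj * wj for wj in w))
    losses = []
    for x, y in zip(points, labels):
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        losses.append(-y * score / norm)
    return losses

def empirical_cvar(losses, beta):
    # CVaR_beta = min over alpha of alpha + sum_i max(loss_i - alpha, 0) / ((1 - beta) m).
    # The objective is piecewise linear and convex in alpha with breakpoints at the
    # sample losses, so it is enough to evaluate it at those values.
    m = len(losses)

    def objective(alpha):
        excess = sum(max(l - alpha, 0.0) for l in losses)
        return alpha + excess / ((1.0 - beta) * m)

    return min(objective(a) for a in losses)

# With beta = 0.5 the CVaR is the mean of the worst half of the losses.
cvar_half = empirical_cvar([1.0, 2.0, 3.0, 4.0], 0.5)
single_loss = geometric_losses([3.0, 4.0], 0.0, [[3.0, 4.0]], [1])
```

Minimizing such a CVaR over w and b is what would make the problem nonconvex in this sketch, because the norm of w appears in the denominator of every loss.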
Adaptive Sampling Based Large-Scale Stochastic Resource Control
, 2006
Cited by 2 (2 self)
We consider closed-loop solutions to stochastic optimization problems of resource allocation type. They concern the dynamic allocation of reusable resources over time to non-preemptive interconnected tasks with stochastic durations. The aim is to minimize the expected value of a regular performance measure. First, we formulate the problem as a stochastic shortest path problem and argue that our formulation has favorable properties, e.g., it has finite horizon, it is acyclic, thus, all policies are proper, and moreover, the space of control policies can be safely restricted. Then, we propose an iterative solution. Essentially, we apply a reinforcement learning based adaptive sampler to compute a suboptimal control policy. We suggest several approaches to enhance this solution and make it applicable to large-scale problems. The main improvements are: (1) the value function is maintained by feature-based support vector regression; (2) the initial exploration is guided by rollout algorithms; (3) the state space is partitioned by clustering the tasks while keeping the precedence constraints satisfied; (4) the action space is decomposed and, consequently, the number of available actions in a state is decreased; and, finally, (5) we argue that the sampling can be effectively distributed among several processors. The effectiveness of the approach is demonstrated by experimental results on both artificial (benchmark) and real-world (industry related) data.
Identification of Image Variations based on Equivalence Classes
- IN VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP) 2005, SPIE
, 2005
Cited by 1 (1 self)
This paper presents a fingerprinting method based on equivalence classes. An equivalence class is composed of a reference image and all its variations (or replicas). For each reference image, a decision function is built. The latter determines if a given image belongs to its corresponding equivalence class. This function is built in three steps: synthesis, projection, and analysis. In the first step, the reference image is replicated using different image operators (like JPEG compression, average filtering, etc). During the projection step, the replicas are projected onto a distance space. In the final step, the distance space is analyzed, using machine learning algorithms, and the decision function is built. In this study, three machine learning approaches are compared: orthotope, support vector machine (SVM), and support vector data description (SVDD). The orthotope is a computationally efficient ad-hoc method. It consists in building a generalized rectangle in the distance space. The SVM and SVDD are two more general learning algorithms.
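The orthotope idea in the abstract above can be sketched as an axis-aligned bounding box in the distance space. This assumes each replica is already represented by a vector of distances to the reference image and that the box is simply the per-dimension minimum and maximum over the replicas; the abstract does not say how the authors fit the box, and the function names and numbers are illustrative.

```python
def fit_orthotope(replica_vectors):
    # Per-dimension lower and upper bounds over all replica distance vectors.
    dims = range(len(replica_vectors[0]))
    lower = [min(v[d] for v in replica_vectors) for d in dims]
    upper = [max(v[d] for v in replica_vectors) for d in dims]
    return lower, upper

def is_replica(box, vector):
    # Decision function: accept the image if its distance vector lies inside the box.
    lower, upper = box
    return all(lo <= x <= hi for lo, x, hi in zip(lower, vector, upper))

# Hypothetical distance vectors for three synthesized replicas of one reference image.
box = fit_orthotope([[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]])
inside = is_replica(box, [0.2, 0.3])
outside = is_replica(box, [0.5, 0.2])
```

Fitting and testing both cost one pass over the dimensions, which is consistent with the abstract calling the method computationally efficient.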
Incremental Sparsification for Real-time Online Model Learning
Cited by 1 (1 self)
Online model learning in real-time is required by many applications such as in robot tracking control. It poses a difficult problem, as fast and incremental online regression with large data sets is the essential component which cannot be achieved by straightforward usage of off-the-shelf machine learning methods (such as Gaussian process regression or support vector regression). In this paper, we propose a framework for online, incremental sparsification with a fixed budget designed for large scale real-time model learning. The proposed approach combines a sparsification method based on an independence measure with a large scale database. In combination with an incremental learning approach such as sequential support vector regression, we obtain a regression method which is applicable in real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques. Implementation on a real robot emphasizes the applicability of the proposed approach in real-time online model learning for real world systems.
Incremental Online Sparsification for Model Learning in Real-time Robot Control
Cited by 1 (0 self)
For many applications such as compliant, accurate robot tracking control, dynamics models learned from data can help to achieve both compliant control performance and high tracking quality. Online learning of these dynamics models allows the robot controller to adapt itself to changes in the dynamics (e.g., due to time-variant nonlinearities or unforeseen loads). However, online learning in real-time applications – as required in control – cannot be realized by straightforward usage of off-the-shelf machine learning methods such as Gaussian process regression or support vector regression. In this paper, we propose a framework for online, incremental sparsification with a fixed budget designed for fast real-time model learning. The proposed approach employs a sparsification method based on an independence measure. In combination with an incremental learning approach such as incremental Gaussian process regression, we obtain a model approximation method which is applicable in real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques. Implementation on a real Barrett WAM robot demonstrates the applicability of the approach in real-time online model learning for real world systems.
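One common way to realize an independence-based sparsification with a fixed budget is a kernel linear-independence test, sketched below. This is an assumption about the general technique, not the authors' exact measure: the independence of a new point x from a dictionary D is taken to be k(x, x) - k^T K^{-1} k with a Gaussian kernel, a point is stored only if that value exceeds a threshold, and when the budget is exceeded the oldest point is dropped as a placeholder removal rule. All names and parameter values are hypothetical.

```python
import math

def gaussian_kernel(a, b, width=1.0):
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * width ** 2))

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system A x = b.
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def independence(dictionary, x, width=1.0):
    # Residual of x after projecting onto the span of the dictionary in feature space.
    if not dictionary:
        return gaussian_kernel(x, x, width)
    K = [[gaussian_kernel(a, b, width) for b in dictionary] for a in dictionary]
    k = [gaussian_kernel(a, x, width) for a in dictionary]
    coeffs = solve(K, k)
    return gaussian_kernel(x, x, width) - sum(c * ki for c, ki in zip(coeffs, k))

def maybe_insert(dictionary, x, threshold=0.1, budget=2, width=1.0):
    # Store x only if it is sufficiently independent of the current dictionary.
    if independence(dictionary, x, width) <= threshold:
        return False
    dictionary.append(list(x))
    if len(dictionary) > budget:
        dictionary.pop(0)  # placeholder removal rule (assumption): drop the oldest point
    return True

dictionary = []
decisions = [maybe_insert(dictionary, p) for p in ([0.0], [0.0], [5.0], [10.0])]
```

The duplicate point is rejected because it adds no new information, while the budget keeps the kernel matrix small enough that each update stays cheap, which is the property a real-time setting needs.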

