Results 1 - 10 of 47
A tutorial on support vector regression
, 2004
Cited by 828 (3 self)
Abstract:
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
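The ε-insensitive loss at the heart of SV regression can be illustrated with a deliberately small numpy sketch: a linear SVR fitted by primal subgradient descent on synthetic data. The data, constants, and the subgradient solver are all illustrative choices, not from the tutorial (which treats the quadratic-programming formulation):

```python
import numpy as np

def fit_linear_svr(X, y, C=10.0, eps=0.05, lr=0.005, epochs=3000):
    """Primal epsilon-insensitive SVR via subgradient descent --
    a toy linear stand-in for the QP training the tutorial surveys."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        r = X @ w + b - y                               # residuals
        g = np.where(np.abs(r) > eps, np.sign(r), 0.0)  # loss active outside the eps-tube
        w -= lr * (w + C * (X.T @ g) / n)               # ridge term + loss term
        b -= lr * C * g.mean()
    return w, b

# noisy samples of y = 2x + 1 (synthetic)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0.0, 0.05, size=200)
w, b = fit_linear_svr(X, y)
print(w, b)  # w should land near [2.], b near 1.
```

Residuals inside the ε-tube contribute no gradient, which is what makes the solution depend only on the "support" examples outside the tube.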
The Sample Complexity of Pattern Classification With Neural Networks: The Size of the Weights is More Important Than the Size of the Network
, 1997
Cited by 211 (15 self)
Abstract:
Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A and the input dimension is n. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus A³ √((log n)/m) (ignori...
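The bound in the abstract is garbled by text extraction; up to constants, and reconstructing hedged-ly from the abstract's own notation, it has the form:

```latex
\Pr[\text{misclassification}] \;\le\; \hat{\varepsilon}_m \;+\; O\!\left(A^{3}\sqrt{\frac{\log n}{m}}\right)
```

where ε̂_m is the error estimate related to squared error on the m training examples, A bounds the sum of weight magnitudes per unit, and n is the input dimension. The point of the result is that the network size (number of weights) does not appear in the second term at all.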
Building Text Classifiers using Positive and Unlabeled Examples
 In Proc. of the ICDM’03
, 2003
Cited by 112 (16 self)
Abstract:
This paper studies the problem of building text classifiers using positive and unlabeled examples. The key feature of this problem is that there is no negative example for learning. Recently, a few techniques for solving this problem were proposed in the literature. These techniques are based on the same idea, which builds a classifier in two steps. Each existing technique uses a different method for each step. In this paper, we first introduce some new methods for the two steps, and perform a comprehensive evaluation of all possible combinations of methods of the two steps. We then propose a more principled approach to solving the problem based on a biased formulation of SVM, and show experimentally that it is more accurate than the existing techniques. 1.
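The biased SVM formulation can be pictured as asymmetric misclassification costs: labeled positives get a large penalty, and unlabeled examples, provisionally treated as negative, get a small one, so mislabeled "negatives" are cheap to misclassify. The numpy toy below is a hedged illustration on synthetic data, not the authors' exact formulation:

```python
import numpy as np

def biased_linear_svm(X, s, C_pos=10.0, C_neg=0.5, lr=0.01, epochs=2000):
    """Biased linear SVM sketch for PU learning: s=+1 marks labeled
    positives (penalty C_pos), s=-1 marks unlabeled examples treated
    as negatives (small penalty C_neg). Primal subgradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    C = np.where(s > 0, C_pos, C_neg)     # per-example cost
    for _ in range(epochs):
        m = s * (X @ w + b)               # margins under the PU labels
        viol = m < 1                      # hinge-active examples
        g = -(C * s * viol)
        w -= lr * (w + (X.T @ g) / n)
        b -= lr * g.mean()
    return w, b

# synthetic PU data: positives near +2, negatives near -2;
# only some positives are labeled, the rest join the unlabeled pool
rng = np.random.default_rng(1)
pos = rng.normal(+2.0, 0.5, size=(100, 2))
neg = rng.normal(-2.0, 0.5, size=(100, 2))
X = np.vstack([pos, neg])
true_y = np.array([1] * 100 + [-1] * 100)
s = np.where((true_y == 1) & (rng.random(200) < 0.4), 1, -1)  # PU labels
w, b = biased_linear_svm(X, s)
acc = (np.sign(X @ w + b) == true_y).mean()
print(acc)
```

Because the labeled positives carry a much larger cost than the unlabeled pool, the boundary settles between the clusters even though many true positives are "labeled" negative during training.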
Robust Decision Trees: Removing Outliers from Databases
 In Knowledge Discovery and Data Mining
, 1995
Cited by 74 (0 self)
Abstract:
Finding and removing outliers is an important problem in data mining. Errors in large databases can be extremely common, so an important property of a data mining algorithm is robustness with respect to errors in the database. Most sophisticated methods in machine learning address this problem to some extent, but not fully, and can be improved by addressing the problem more directly. In this paper we examine C4.5, a decision tree algorithm that is already quite robust; few algorithms have been shown to consistently achieve higher accuracy. C4.5 incorporates a pruning scheme that partially addresses the outlier removal problem. In our RobustC4.5 algorithm we extend the pruning method to fully remove the effect of outliers, and this results in improvement on many databases. In U. M. Fayyad and R. Uthurusamy, editors, Proceedings of the First International Conference on Knowledge Discovery and Data Mining, pages 174-179, AAAI Press, Menlo Park, CA, 1995.
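One way to picture pruning-style outlier removal, in the spirit of the abstract but not the RobustC4.5 algorithm itself, is to drop the training examples that a strongly pruned tree misclassifies and then refit an unpruned tree on the cleaned sample. The scikit-learn sketch below assumes synthetic data where label noise plays the role of outliers:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def make_data(n):
    X = rng.uniform(-1, 1, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)   # clean diagonal concept
    return X, y

X_tr, y_tr = make_data(400)
flip = rng.random(400) < 0.15                  # 15% label noise = "outliers"
y_noisy = np.where(flip, 1 - y_tr, y_tr)
X_te, y_te = make_data(2000)

# naive: an unpruned tree memorizes the noisy labels
plain = DecisionTreeClassifier(random_state=0).fit(X_tr, y_noisy)

# filter: drop training points a strongly pruned tree misclassifies, refit
pruned = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_noisy)
keep = pruned.predict(X_tr) == y_noisy
robust = DecisionTreeClassifier(random_state=0).fit(X_tr[keep], y_noisy[keep])

acc_plain = (plain.predict(X_te) == y_te).mean()
acc_robust = (robust.predict(X_te) == y_te).mean()
print(acc_plain, acc_robust)
```

The pruned model cannot fit isolated flipped labels, so disagreement with it flags mostly the corrupted points; the refit tree then generalizes from a cleaner sample.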
Discovering Informative Patterns and Data Cleaning
, 1996
Cited by 60 (1 self)
Abstract:
We present a method for discovering informative patterns from data. With this method, large databases can be reduced to only a few representative data entries. Our framework also encompasses methods for cleaning databases containing corrupted data. Both online and offline algorithms are proposed and experimentally checked on databases of handwritten images. The generality of the framework makes it an attractive candidate for new applications in knowledge discovery. Keywords: knowledge discovery, machine learning, informative patterns, data cleaning, information gain.
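One hedged reading of "informative patterns": the examples that most surprise a reference predictor, with the extreme end of the ranking being candidates for cleaning. The numpy sketch below uses a simple logistic model on synthetic data as the reference predictor; it illustrates the idea, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, size=(300, 2))
y = (X[:, 0] > 0).astype(float)
y[:10] = 1 - y[:10]                     # corrupt the first 10 labels

# a simple logistic model trained by gradient descent (the reference predictor)
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / len(y)
    b -= 0.1 * (p - y).mean()

# "surprise" = negative log-likelihood of each example's label under the model;
# corrupted entries should dominate the top of the ranking
p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
surprise = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
order = np.argsort(-surprise)           # most surprising first
print(order[:20])
```

Moderately surprising examples sit near the decision boundary and are the informative, representative ones to keep; the most surprising ones are likely corrupted and are handed to the cleaning step.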
Support Vector Machines for Automated Gait Classification
, 2005
Cited by 24 (1 self)
Abstract:
Ageing influences gait patterns, posing constant threats to control of locomotor balance. Automated recognition of gait changes has many advantages, including early identification of at-risk gait and monitoring the progress of treatment outcomes. In this paper, we apply an artificial intelligence technique [support vector machines (SVM)] for the automatic recognition of young/old gait types from their respective gait patterns. Minimum foot clearance (MFC) data of 30 young and 28 elderly participants were analyzed using a PEAK2D motion analysis system during a 20-min continuous walk on a treadmill at self-selected walking speed. Gait features extracted from individual MFC histogram-plot and Poincaré-plot images were used to train the SVM. Cross-validation test results indicate that the generalization performance of the SVM was on average 83.3 % in recognizing young and elderly gait patterns, compared with the accuracy of a neural network. A "hill-climbing" feature selection algorithm demonstrated that a small subset (3-5) of gait features extracted from MFC plots could differentiate the gait patterns with 90 % accuracy. Performance of the gait classifier was evaluated using areas under the receiver operating characteristic plots. Improved performance of the classifier was evident when trained with a reduced set of selected good features and with a radial basis function kernel. These results suggest that SVMs can function as an efficient gait classifier for recognition of young and elderly gait patterns, and have the potential for wider applications in gait identification for falls-risk minimization in the elderly.
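The "hill-climbing" feature selection described in the abstract can be read as greedy forward selection wrapped around the classifier. Below is a hedged scikit-learn sketch on synthetic data; the feature counts, CV folds, and data are illustrative assumptions, not the study's setup:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# synthetic stand-in: features 0 and 1 carry the class signal, 2..7 are noise
rng = np.random.default_rng(3)
informative = rng.normal(0.0, 1.0, size=(200, 2))
y = (informative[:, 0] + informative[:, 1] > 0).astype(int)
X = np.hstack([informative, rng.normal(0.0, 1.0, size=(200, 6))])

def hill_climb(X, y, k):
    """Greedy forward ('hill-climbing') feature selection with an
    RBF-SVM wrapper: repeatedly add the single feature that most
    improves cross-validated accuracy."""
    chosen, scores = [], []
    while len(chosen) < k:
        best_f, best_s = None, -1.0
        for f in range(X.shape[1]):
            if f in chosen:
                continue
            s = cross_val_score(SVC(kernel="rbf"), X[:, chosen + [f]], y, cv=5).mean()
            if s > best_s:
                best_f, best_s = f, s
        chosen.append(best_f)
        scores.append(best_s)
    return chosen, scores

chosen, scores = hill_climb(X, y, 3)
print(chosen, scores)
```

As in the study's finding, a small selected subset can match or beat the full feature set, because the wrapper discards features that only add noise to the kernel distances.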
Reducing communication for distributed learning in neural networks
 In Proc. ICANN'2002
, 2002
Support Vector Machines for Phoneme Classification
, 2001
Cited by 11 (0 self)
Abstract:
In this thesis, Support Vector Machines (SVMs) are applied to the problem of phoneme classification. Given a sequence of acoustic observations and 40 phoneme targets, the task is to classify each observation into one of these targets. Since this task involves multiple classes, one of the main hurdles SVMs must overcome is to extend the inherently binary SVM to the multiclass case. To do this, several methods are proposed, and their generalisation abilities are measured. It is found that even though some generalisation is lost in the transition, this can still lead to effective classifiers. In addition, a refinement to the SVMs is made to derive estimated posterior probabilities from classifications. Since almost all speech recognition systems are based on statistical models, this is necessary if SVMs are to be used in a full speech recognition system. The best accuracy found was 71.4%, which is competitive with the best results found in the literature.
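The multiclass hurdle the thesis describes is commonly handled by a one-vs-rest reduction: one binary SVM per class, predicting by the largest decision value. The numpy sketch below (linear SVMs on synthetic Gaussian blobs standing in for phoneme feature vectors) shows one such extension; it is one of several possible reductions, not necessarily the thesis's chosen method:

```python
import numpy as np

def train_binary_svm(X, y, C=1.0, lr=0.01, epochs=1000):
    """Linear SVM via primal subgradient descent; labels y in {-1,+1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1
        g = -(C * y * viol)
        w -= lr * (w + X.T @ g / len(y))
        b -= lr * g.mean()
    return w, b

def one_vs_rest(X, y, n_classes):
    """One binary SVM per class: class k vs the rest."""
    return [train_binary_svm(X, np.where(y == k, 1, -1)) for k in range(n_classes)]

def predict(models, X):
    # pick the class whose (uncalibrated) decision value is largest
    scores = np.stack([X @ w + b for w, b in models], axis=1)
    return scores.argmax(axis=1)

# synthetic 3-class "phoneme" features
rng = np.random.default_rng(4)
centers = np.array([[0.0, 3.0], [3.0, -2.0], [-3.0, -2.0]])
X = np.vstack([rng.normal(c, 0.7, size=(80, 2)) for c in centers])
y = np.repeat(np.arange(3), 80)
models = one_vs_rest(X, y, 3)
acc = (predict(models, X) == y).mean()
print(acc)
```

The raw decision values compared by `argmax` are uncalibrated; to get the estimated posterior probabilities the abstract mentions, one common (assumed, not thesis-specific) refinement is to fit a sigmoid to each classifier's decision values (Platt scaling).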
Fast Support Vector Machine Classification of very large Datasets
 University of Freiburg, Department of Computer
Cited by 11 (0 self)
Abstract:
In many classification applications, Support Vector Machines (SVMs) have proven to be high-performing, easy-to-handle classifiers with very good generalization abilities. However, one drawback of the SVM is its rather high classification complexity, which scales linearly with the number of Support Vectors (SVs). This is due to the fact that for the classification of one sample, the kernel function has to be evaluated for all SVs. To speed up classification, different approaches have been published, most of which try to reduce the number of SVs. In our work, which is especially suitable for very large datasets, we follow a different approach: as we showed in [12], it is effectively possible to approximate large SVM problems by decomposing the original problem into linear subproblems, where each subproblem can be evaluated in O(1). This approach is especially successful when the assumption holds that a large classification problem can be split into mainly easy and only a few hard subproblems. On standard benchmark datasets, this approach achieved great speedups while suffering only slightly in terms of classification accuracy and generalization ability. In this contribution, we extend the methods introduced in [12], using not only linear but also nonlinear subproblems for the decomposition of the original problem, which further increases the classification performance with only a little loss in terms of speed. An implementation of our method is available in [13]. Due to page limitations, we had to move some of the theoretical details (e.g. proofs) and extensive experimental results to a technical report [14].
Robust Linear Discriminant Trees
 In AI&Statistics95 [7
Cited by 9 (2 self)
Abstract:
We present a new method for the induction of classification trees with linear discriminants as the partitioning function at each internal node. This paper presents two main contributions: first, a novel objective function called soft entropy, which is used to identify optimal coefficients for the linear discriminants, and second, a novel method for removing outliers, called iterative refiltering, which boosts performance on many datasets. These two ideas are presented in the context of a single learning algorithm called DTSEPIR, which is compared with the CART and OC1 algorithms.

Introduction: Recursive partitioning classifiers, or decision trees, are an important nonparametric function representation in statistics and machine learning (Friedman 1977; Breiman, Friedman, Olshen & Stone 1984; Quinlan 1986; Quinlan 1993). Their wide and successful use in fielded applications and their simple intuitive appeal make decision tree learning algorithms an important area of study. In this p...
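One hedged reading of "soft entropy": replace the hard split indicator with a sigmoid membership, so the impurity becomes differentiable in the linear discriminant's coefficients and can be minimized by gradient descent. The numpy sketch below is a guess at the flavor of such an objective on synthetic data, not the paper's definition:

```python
import numpy as np

def soft_entropy(v, X, y):
    """Illustrative 'soft entropy' of a linear split v (weights + bias):
    sigmoid memberships replace hard left/right assignment, giving a
    class-impurity measure that is smooth in v."""
    p = 1.0 / (1.0 + np.exp(-(X @ v[:-1] + v[-1])))   # soft membership, left side
    H = 0.0
    for side in (p, 1 - p):
        n_c = np.array([side[y == c].sum() for c in (0, 1)]) + 1e-9
        q = n_c / n_c.sum()
        H += n_c.sum() * -(q * np.log(q)).sum()        # weighted side entropy
    return H / len(y)

# two classes separated by an oblique line (synthetic)
rng = np.random.default_rng(7)
X = rng.normal(0.0, 1.0, size=(300, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

v = np.array([1.0, 0.0, 0.0])            # init: axis-aligned split
for _ in range(300):                      # descent via central differences
    g = np.zeros(3)
    for j in range(3):
        e = np.zeros(3); e[j] = 1e-4
        g[j] = (soft_entropy(v + e, X, y) - soft_entropy(v - e, X, y)) / 2e-4
    v -= 0.5 * g
    v[:2] /= np.linalg.norm(v[:2])        # fix scale; only direction/offset adapt

acc = ((X @ v[:2] + v[2] > 0).astype(int) == y).mean()
print(acc, v)
```

Starting from an axis-aligned split (as a CART-style node would use), the smooth objective lets the node rotate toward the oblique discriminant that a hard entropy criterion cannot reach by gradient methods.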