On the Generalisation of Soft Margin Algorithms
 IEEE Transactions on Information Theory
, 2000
Cited by 8 (4 self)
Generalisation bounds depending on the margin of a classifier are a relatively recent development. They provide an explanation of the performance of state-of-the-art learning systems such as Support Vector Machines (SVM) [12] and Adaboost [24]. The difficulty with these bounds has been either their lack of robustness or their looseness. The question of whether the generalisation of a classifier can be more tightly bounded in terms of a robust measure of the distribution of margin values has remained open for some time. The paper answers this open question in the affirmative and furthermore the analysis leads to bounds that motivate the previously heuristic soft margin SVM algorithms as well as justifying the use of the quadratic loss in neural network training algorithms. The results are extended to give bounds for the probability of failing to achieve a target accuracy in regression prediction, with a statistical analysis of Ridge Regression and Gaussian Processes as a special case. The analysis presented in the paper has also led to new boosting algorithms described elsewhere [7].
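For orientation, the soft margin SVM mentioned in the abstract is commonly written in the following textbook form with a quadratic (2-norm) penalty on the slack variables. The symbols w, b, \xi_i, C and the sample size m are conventional notation assumed here; the abstract itself does not state this formulation, so it should be read as a sketch rather than the paper's exact optimisation problem.

```latex
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i^2
\quad\text{subject to}\quad
y_i\bigl(\langle w, x_i\rangle + b\bigr) \ge 1 - \xi_i,
\qquad i = 1,\dots,m.
```

Each slack \xi_i measures how far example i falls short of margin 1, so the penalty term depends on the whole distribution of margin values rather than on the smallest margin alone.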
Combinatorics of random processes and sections of convex bodies, preprint available at ArXiV http://front.math.ucdavis.edu, Banach Space Bulletin http://www.math.okstate.edu/~alspach/banach and our webpages, http://www.math.ucdavis.edu/~vershynin and http
Cited by 7 (2 self)
We find a sharp combinatorial bound for the metric entropy of sets in R^n and general classes of functions. This solves two basic combinatorial conjectures on the empirical processes. 1. A class of functions satisfies the uniform Central Limit Theorem if the square root of its combinatorial dimension is integrable. 2. The uniform entropy is equivalent to the combinatorial dimension under minimal regularity. Our method also constructs a nicely bounded coordinate section of a symmetric convex body in R^n. In the operator theory, this essentially proves for all normed spaces the restricted invertibility principle of Bourgain and Tzafriri.
Mathematical Programming Approaches To Machine Learning And Data Mining
, 1998
Cited by 6 (0 self)
Machine learning problems of supervised classification, unsupervised clustering and parsimonious approximation are formulated as mathematical programs. The feature selection problem arising in the supervised classification task is effectively addressed by calculating a separating plane by minimizing separation error and the number of problem features utilized. The support vector machine approach is formulated using various norms to measure the margin of separation. The clustering problem of assigning m points in n-dimensional real space to k clusters is formulated as minimizing a piecewise-linear concave function over a polyhedral set. This problem is also formulated in a novel fashion by minimizing the sum of squared distances of data points to nearest cluster planes characterizing the k clusters. The problem of obtaining a parsimonious solution to a linear system where the right hand side vector may be corrupted by noise is formulated as minimizing the system residual plus either the number of nonzero elements in the solution vector or the norm of the solution vector. The feature selection problem, the clustering problem and the parsimonious approximation problem can all be stated as the minimization of a concave function over a polyhedral region and are solved by a theoretically justifiable, fast and finite successive linearization algorithm. Numerical tests indicate the utility and efficiency of these formulations on real-world databases. In particular, the feature selection approach via concave minimization computes a separating-plane based classifier that improves upon the generalization ability of a separating plane computed without feature suppression. This approach produces classifiers utilizing fewer original problem features than the support vector machin...
On the Performance of Learning Machines for Bankruptcy Detection
 In Proceedings of IEEE International Conference on Computational Cybernetics (ICCC
, 2004
Cited by 2 (1 self)
Predicting the financial health of companies is a problem of great importance to various stakeholders in the increasingly globalized economy. We apply several learning machine methods to the problem of bankruptcy prediction for private companies. Financial data obtained from Diana, a database containing 780,000 financial statements of French companies, are used to perform experiments. Classification accuracy is evaluated with respect to Artificial Neural Networks, Linear Genetic Programming and Support Vector Machines. We analyze both type I (bankrupted companies misclassified as healthy) and type II (healthy companies misclassified as bankrupted) errors on three datasets containing balanced and unbalanced class distributions. Linear Genetic Programming has the best accuracy on the balanced data while Support Vector Machines are more stable on the unbalanced dataset. Our results, though preliminary in nature, demonstrate the tremendous potential of using learning machines in solving important economics problems such as predicting bankruptcy with accuracy.
DÉPARTEMENT D’INFORMATIQUE ET DE GÉNIE LOGICIEL FACULTÉ DES SCIENCES ET DE GÉNIE
Submitted to the Faculté des études supérieures et postdoctorales of Université Laval as part of the doctoral program in computer science, for the degree of Philosophiæ Doctor (Ph.D.)
Movie reviews: do words add up to a sentiment?
, 2010
Sentiment analysis, the automatic extraction of opinion from text, has been enjoying some attention in the media during the national elections. In this thesis, we will discuss the classification of movie reviews as ’thumbs up’ or ’thumbs down’. Movie reviews are interesting and difficult because of the wide range of topics in movies. The reviews are HTML web pages, which poses an interesting challenge for preprocessing and noise removal. We describe the reviews as ’bags of words’ and use support vector machines (SVMs) for classification, as well as transductive support vector machines, which require less training data. To model topics in the reviews, a latent semantic analysis (LSA) was done on a large set of movie reviews. The results show that it is hard to improve SVM performance with latent semantic analysis. The discussion of the results provides some insights into why no performance increase was achieved.
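The bag-of-words representation and linear SVM classification described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the four toy reviews, the function names, and the training routine (plain subgradient descent on the regularised hinge loss, standing in for a real SVM solver). It is not the thesis's pipeline and does not cover HTML preprocessing, transductive SVMs or LSA.

```python
from collections import Counter


def bag_of_words(text):
    """Map a review to word counts, ignoring case and word order."""
    return Counter(text.lower().split())


def score(weights, bow):
    """Linear decision value: sum of weight * count over the words present."""
    return sum(weights.get(word, 0.0) * count for word, count in bow.items())


def train_linear_svm(docs, labels, epochs=20, lam=0.01, lr=0.1):
    """Subgradient descent on the L2-regularised hinge loss.

    Illustrative stand-in for an SVM solver; labels must be +1 or -1.
    """
    weights = {}
    for _ in range(epochs):
        for bow, y in zip(docs, labels):
            margin = y * score(weights, bow)
            # Regularisation step: shrink every weight slightly.
            for word in weights:
                weights[word] *= 1.0 - lr * lam
            # Hinge-loss step: only examples with margin below 1 push the weights.
            if margin < 1:
                for word, count in bow.items():
                    weights[word] = weights.get(word, 0.0) + lr * y * count
    return weights


def predict(weights, bow):
    """Return +1 ('thumbs up') or -1 ('thumbs down')."""
    return 1 if score(weights, bow) >= 0 else -1


# Hypothetical toy data, not taken from the thesis.
reviews = [
    "a great and moving film",
    "great acting great story",
    "a dull and boring film",
    "boring plot dull acting",
]
labels = [1, 1, -1, -1]

docs = [bag_of_words(r) for r in reviews]
w = train_linear_svm(docs, labels)
print(predict(w, bag_of_words("great story")))
print(predict(w, bag_of_words("dull plot")))
```

Because word order is discarded, each review reduces to a sparse count vector, and the classifier learns one weight per word. Words seen only in positive reviews end up with positive weights and words seen only in negative reviews with negative ones.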
Kernel Methods and Support Vector Machines
, 2009
The tutorial is intended to give a broad introduction to the kernel approach to pattern analysis. This will cover:
• Why linear pattern functions?
• Why kernel approach?
• How to plug and play with the different components of a kernel-based pattern analysis system?
Chicago/TTI Summer School, June 2009
What won’t be included:
• Other approaches to Pattern Analysis
• Complete History
• Bayesian view of kernel methods
• More recent developments