Results 11–20 of 36
Evaluating Machine Learning Models for Engineering Problems
Artificial Intelligence in Engineering, 1999
"... : The use of machine learning (ML), and in particular, artificial neural networks (ANN), in engineering applications has increased dramatically over the last years. However, by and large, the development of such applications or their report lack proper evaluation. Deficient evaluation practice was o ..."
Cited by 15 (6 self)

Abstract
The use of machine learning (ML), and in particular artificial neural networks (ANN), in engineering applications has increased dramatically in recent years. However, by and large, the development of such applications, and the reporting on them, lacks proper evaluation. Deficient evaluation practice was observed in the general neural networks community and again in engineering applications through a survey we conducted of articles published in AI in Engineering and elsewhere. This deficient status hinders understanding and prevents progress. This paper's goal is to remedy this situation. First, several evaluation methods are discussed along with their relative qualities. Second, these qualities are illustrated by using the methods to evaluate ANN performance on two engineering problems. Third, a systematic evaluation procedure for ML is discussed. This procedure will lead to better evaluation of studies, and consequently to improved research and practice in the area of ML in engineering applications...
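The abstract above argues for systematic, statistically sound evaluation of ML models. As an illustrative sketch (not the paper's own procedure), one common building block is k-fold cross-validation that reports both the mean and the spread of the error rather than a single number; the nearest-class-mean classifier and the synthetic data below are assumptions for demonstration only:

```python
import numpy as np

def kfold_error(X, y, k=5, seed=0):
    """k-fold cross-validated error of a nearest-class-mean classifier,
    reported as (mean, std) across folds rather than a single number."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # "Training": one mean vector per class.
        means = {c: X[train][y[train] == c].mean(axis=0)
                 for c in np.unique(y[train])}
        # Prediction: assign each test point to the closest class mean.
        preds = np.array([min(means, key=lambda c: np.linalg.norm(p - means[c]))
                          for p in X[test]])
        errors.append(float(np.mean(preds != y[test])))
    return float(np.mean(errors)), float(np.std(errors))

# Synthetic, well-separated two-class data (an assumption for the demo).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
mean_err, std_err = kfold_error(X, y)
print(f"error = {mean_err:.3f} +/- {std_err:.3f}")
```

Reporting the per-fold spread alongside the mean is one simple way to expose the variability that single train/test splits hide.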
Model selection using Rademacher Penalization
In Proceedings of the Second ICSC Symposium on Neural Computation (NC2000), ICSC Academic, 2000
"... In this paper we describe the use of Rademacher penalization for model selection. As in Vapnik's Guaranteed Risk Minimization (GRM), Rademacher penalization attemps to balance the complexity of the model with its t to the data by minimizing the sum of the training error and a penalty term, which is ..."
Cited by 14 (0 self)

Abstract
In this paper we describe the use of Rademacher penalization for model selection. As in Vapnik's Guaranteed Risk Minimization (GRM), Rademacher penalization attempts to balance the complexity of the model with its fit to the data by minimizing the sum of the training error and a penalty term, which is an upper bound on the absolute difference between the training error and the generalization error. However, while the GRM penalty is universal, the computation of the Rademacher penalty is data-driven, which means that it depends on the distribution of the data; hence one can expect better performance on particular instances of learning problems. We present experimental evidence showing that Rademacher penalization can be used as an effective method of model selection in learning problems. In particular, we have shown that for the intervals model selection problem, Rademacher penalization outperforms GRM and cross-validation (CV) over a wide range of sample sizes. Our experiments also sho...
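A minimal sketch of the data-driven penalty the abstract describes, using the simple class of 1-D threshold classifiers (an assumption chosen for tractability, not the paper's exact intervals setup): draw random ±1 labels and measure, on average, how well the hypothesis class can correlate with pure noise on the given sample.

```python
import numpy as np

def rademacher_penalty(x, n_draws=200, seed=0):
    """Monte-Carlo estimate of the empirical Rademacher complexity of
    1-D threshold classifiers h_t(x) = sign(x - t) on the sample x:
    E_sigma [ sup_h (1/n) sum_i sigma_i h(x_i) ]."""
    rng = np.random.default_rng(seed)
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    # Candidate thresholds: below, between, and above the sorted points.
    ts = np.concatenate([[xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2, [xs[-1] + 1.0]])
    # Each row of H holds one threshold hypothesis's +/-1 labels for the sample.
    H = np.where(xs[None, :] > ts[:, None], 1.0, -1.0)
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)   # random Rademacher labels
        corr = H @ sigma / n
        total += np.abs(corr).max()  # |corr| covers both threshold orientations
    return total / n_draws

x = np.random.default_rng(1).uniform(0.0, 1.0, 100)
pen = rademacher_penalty(x)
print(pen)
```

The penalty depends on the actual sample, which is exactly the data-driven property the abstract contrasts with the universal GRM penalty.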
Asymptotic optimality of likelihood-based cross-validation
Statistical Applications in Genetics and Molecular Biology, 2003
"... Likelihoodbased crossvalidation is a statistical tool for selecting a density estimate based on n i.i.d. observations from the true density among a collection of candidate density estimators. General examples are the selection of a model indexing a maximum likelihood estimator, and the selection o ..."
Cited by 13 (2 self)

Abstract
Likelihood-based cross-validation is a statistical tool for selecting a density estimate, based on n i.i.d. observations from the true density, from among a collection of candidate density estimators. General examples are the selection of a model indexing a maximum likelihood estimator, and the selection of a bandwidth indexing a nonparametric (e.g. kernel) density estimator. In this article, we establish a finite sample result for a general class of likelihood-based cross-validation procedures (as indexed by the type of sample splitting used, e.g. V-fold cross-validation). This result implies that the cross-validation selector performs asymptotically as well (w.r.t. the Kullback-Leibler distance to the true density) as a benchmark model selector which is optimal for each given dataset and depends on the true density. Crucial conditions of our theorem are that the size of the validation sample converges to infinity, which excludes leave-one-out cross-validation, and that the candidate density estimates are bounded away from zero and infinity. We illustrate these asymptotic results and the practical performance of likelihood-based cross-validation for the purpose of bandwidth selection with a simulation study. Moreover, we use likelihood-based cross-validation in the context of regulatory motif detection in DNA sequences.
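As a hedged illustration of the bandwidth-selection use case the abstract mentions, the sketch below runs V-fold likelihood-based cross-validation for a Gaussian-kernel density estimator, picking the bandwidth that maximizes held-out log-likelihood; the candidate bandwidth grid and the data are assumptions for the demo.

```python
import numpy as np

def loglik_cv_bandwidth(x, bandwidths, v=5, seed=0):
    """Select a Gaussian-kernel density bandwidth by V-fold
    likelihood-based cross-validation: maximize the summed
    log-likelihood of each validation fold under the KDE fit
    on the remaining folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), v)
    best_h, best_ll = None, -np.inf
    for h in bandwidths:
        ll = 0.0
        for i in range(v):
            val = x[folds[i]]
            train = x[np.concatenate([folds[j] for j in range(v) if j != i])]
            # Gaussian KDE density of each validation point, built from the
            # training fold only.
            z = (val[:, None] - train[None, :]) / h
            dens = np.exp(-0.5 * z ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))
            ll += np.log(np.clip(dens, 1e-300, None)).sum()  # guard log(0)
        if ll > best_ll:
            best_h, best_ll = h, ll
    return best_h

x = np.random.default_rng(2).normal(0.0, 1.0, 300)
best = loglik_cv_bandwidth(x, [0.05, 0.2, 0.5, 1.0, 2.0])
print(best)
```

Note that each validation fold is a fixed fraction of the sample, so its size grows with n, consistent with the theorem's condition that excludes leave-one-out splitting.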
Towards Perceptual Intelligence: Statistical Modeling of Human Individual and Interactive Behaviors
Prediction of Human Behavior, IEEE Intelligent Vehicles, 1995
"... This thesis presents a computational framework for the automatic recognition and prediction of different kinds of human behaviors from video cameras and other sensors, via perceptually intelligent systems that automatically sense and correctly classify human behaviors, by means of Machine Perception ..."
Cited by 12 (5 self)

Abstract
This thesis presents a computational framework for the automatic recognition and prediction of different kinds of human behaviors from video cameras and other sensors, via perceptually intelligent systems that automatically sense and correctly classify human behaviors by means of Machine Perception and Machine Learning techniques. In the thesis I develop the statistical machine learning algorithms (dynamic graphical models) necessary for detecting and recognizing individual and interactive behaviors. In the case of interactions, two Hidden Markov Models (HMMs) are coupled in a novel architecture called Coupled Hidden Markov Models (CHMMs) that explicitly captures the interactions between them. The algorithms for learning the parameters from data, as well as for doing inference with those models, are developed and described. Four systems that experimentally evaluate the proposed paradigm are presented: (1) LAFTER, an automatic face detection and tracking system with facial expression recognition; (2) a Tai Chi gesture recognition system; (3) a pedestrian surveillance system that recognizes typical human-to-human interactions; and (4) a SmartCar for driver maneuver recognition. These systems capture human behaviors of different natures and increasing complexity: first, isolated, single-user facial expressions; then, two-hand gestures and human-to-human interactions,...
Performance Prediction for Exponential Language Models
"... We investigate the task of performance prediction for language models belonging to the exponential family. First, we attempt to empirically discover a formula for predicting test set crossentropy for ngram language models. We build models over varying domains, data set sizes, and ngram orders, an ..."
Cited by 10 (3 self)

Abstract
We investigate the task of performance prediction for language models belonging to the exponential family. First, we attempt to empirically discover a formula for predicting test set cross-entropy for n-gram language models. We build models over varying domains, data set sizes, and n-gram orders, and perform linear regression to see whether we can model test set performance as a simple function of training set performance and various model statistics. Remarkably, we find a simple relationship that predicts test set performance with a correlation of 0.9997. We analyze why this relationship holds and show that it holds for other exponential language models as well, including class-based models and minimum discrimination information models. Finally, we discuss how this relationship can be applied to improve language model performance.
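The regression the abstract describes can be sketched as follows, under the assumption (ours, for illustration, not the paper's actual formula) that the train-test cross-entropy gap is roughly linear in the ratio of model parameters to training tokens; all numbers below are synthetic, not the paper's data.

```python
import numpy as np

# Hypothetical measurements for five models (synthetic numbers):
# training cross-entropy, (#parameters / #training tokens), test cross-entropy.
H_train = np.array([6.1, 5.8, 5.2, 4.9, 4.5])
ratio   = np.array([0.02, 0.05, 0.10, 0.20, 0.40])
H_test  = np.array([6.12, 5.87, 5.34, 5.17, 5.05])

# Fit H_test ~ H_train + gamma * ratio, i.e. regress the train-test
# cross-entropy gap on the parameters-per-token ratio (no intercept).
gap = H_test - H_train
gamma, = np.linalg.lstsq(ratio[:, None], gap, rcond=None)[0]

predicted = H_train + gamma * ratio
r = np.corrcoef(predicted, H_test)[0, 1]
print(f"gamma = {gamma:.3f}, correlation = {r:.4f}")
```

The point of the sketch is only the methodology: pool measurements across models, posit a simple functional form, and check how tightly the fitted formula tracks held-out performance.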
Selected Training Exemplars for Neural Network Learning
1994
"... The dissertation of Mark Plutowski is approved, and it is acceptable in quality and form for publication on microfilm: CoChair CoChair ..."
Cited by 9 (0 self)

Abstract
The dissertation of Mark Plutowski is approved, and it is acceptable in quality and form for publication on microfilm: Co-Chair Co-Chair
An Empirical Investigation of Learning from the Semantic Web
2002
"... The Semantic Web is a vision of a machine readable Web of resources, interlinked and connected through metadata with common ontologies. In this paper we explore the impact such a Semantic Web would have on Machine Learning algorithms used for user profiling and personalisation. Our hypothesis is th ..."
Cited by 7 (4 self)

Abstract
The Semantic Web is a vision of a machine-readable Web of resources, interlinked and connected through metadata with common ontologies. In this paper we explore the impact such a Semantic Web would have on Machine Learning algorithms used for user profiling and personalisation. Our hypothesis is that learning from the Semantic Web should outperform traditional learning from today's World Wide Web in terms of both performance and accuracy. In this paper we present results obtained with two different datasets marked up with semantic metadata; using these we have investigated different instance representations and various learning techniques. Our initial results with the Naïve Bayes and k-NN algorithms were disappointing, leading us to examine the use of the Progol algorithm.
VC Theory of Large Margin Multi-Category Classifiers
"... In the context of discriminant analysis, Vapnik’s statistical learning theory has mainly been developed in three directions: the computation of dichotomies with binaryvalued functions, the computation of dichotomies with realvalued functions, and the computation of polytomies with functions taking ..."
Cited by 7 (4 self)

Abstract
In the context of discriminant analysis, Vapnik's statistical learning theory has mainly been developed in three directions: the computation of dichotomies with binary-valued functions, the computation of dichotomies with real-valued functions, and the computation of polytomies with functions taking their values in finite sets, typically the set of categories itself. The case of classes of vector-valued functions used to compute polytomies has seldom been considered independently, which is unsatisfactory, for three main reasons. First, this case encompasses the other ones. Second, it cannot be treated appropriately through a naïve extension of the results devoted to the computation of dichotomies. Third, most of the classification problems met in practice involve multiple categories. In this paper, a VC theory of large margin multi-category classifiers is introduced. Central in this theory are generalized VC dimensions called the γ-Ψ-dimensions. First, a uniform convergence bound on the risk of the classifiers of interest is derived. The capacity measure involved in this bound is a covering number. This covering number can be upper bounded in terms of the γ-Ψ-dimensions thanks to generalizations of Sauer's lemma, as is illustrated in the specific case of the scale-sensitive Natarajan dimension. A bound on this latter dimension is then computed for the class of functions on which multi-class SVMs are based. This makes it possible to apply the structural risk minimization inductive principle to those machines.
A Scaling Law for the Validation-Set Training-Set Size Ratio
AT&T Bell Laboratories, 1997
"... We address the problem of determining what fraction of the training set should be reserved as development test set or validation set. We determine that the ratio of the validation set size over the training set size scales like the square root of two complexity parameters: the complexity of the seco ..."
Cited by 4 (0 self)

Abstract
We address the problem of determining what fraction of the training set should be reserved as a development test set or validation set. We determine that the ratio of the validation set size to the training set size scales like the square root of the ratio of two complexity parameters: the complexity of the second level of inference (minimizing the validation error) over the complexity of the first level of inference (minimizing the error rate on the training set). Keywords: Cross-validation; Learning Theory; Statistics; Machine Learning; Pattern Recognition; Training Set; Validation Set; Test Set; Experiment Design.

Introduction

The problem often arises, when organizing benchmarks in pattern recognition, of determining what size test set will give statistically significant results. In a companion paper [1], we tackled the problem from the point of view of the benchmark organizer: from a corpus of available data, how much data should be reserved for the benchmark test set? In this paper, we tackle th...
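Taking the abstract's scaling law at face value, a toy calculation might look like this; the particular complexity measures plugged in (log of the number of candidate models for the second level, an effective parameter count for the first) are our assumptions for illustration, not the paper's definitions:

```python
import math

def validation_fraction(c_second, c_first):
    """n_valid / n_train ~ sqrt(c2 / c1) per the abstract's scaling law,
    where c2 is the complexity of the second level of inference (model
    selection on the validation set) and c1 that of the first level
    (fitting each model on the training set). How to measure c1 and c2
    is an assumption here, not specified by this sketch."""
    return math.sqrt(c_second / c_first)

# E.g. choosing among 10 hyperparameter settings (c2 ~ log 10) for a model
# with ~1000 effective parameters (c1 ~ 1000):
r = validation_fraction(math.log(10), 1000)
print(f"n_valid / n_train ~ {r:.3f}")
```

Under these assumed complexity measures, a modest second-level search justifies only a small validation set relative to the training set; the fraction grows with the square root of the number of alternatives being compared.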
Handling Uncertainty When You're Handling Uncertainty: Model Selection and Error Bars for Belief Networks
2000
"... Belief networks are a common way of handling uncertainty in AI. A belief network represents the joint distribution of a set of random variables. When network parameters are estimated from a sample, the parameter values are also random variables whose distribution is given by the sampling distributio ..."
Cited by 3 (1 self)

Abstract
Belief networks are a common way of handling uncertainty in AI. A belief network represents the joint distribution of a set of random variables. When network parameters are estimated from a sample, the parameter values are also random variables whose distribution is given by the sampling distribution of the true model (Frequentist perspective) or the posterior distribution over the parameter space (Bayesian perspective). The uncertainty in parameter values has implications for both inference and learning. In learning network structure from data, a fundamental issue is how to handle the bias-variance tradeoff: increasing model complexity decreases bias but increases the variance in parameter values. We compare model selection criteria for handling the bias-variance tradeoff in structure learning, on theoretical and empirical grounds. We also look at the issue of uncertainty in belief network inference. Once constructed, belief networks are typically used to answer queries about mar...
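One standard example of the kind of model selection criterion the abstract compares is a BIC-style penalized likelihood score (used here purely as an illustration; the paper's own criteria may differ), sketched below with made-up log-likelihoods for three candidate structures:

```python
import math

def bic_score(loglik, n_params, n_samples):
    """BIC-style criterion: log-likelihood minus a complexity penalty of
    (k/2) log n. Higher is better; richer structures must buy their extra
    parameters with a sufficient likelihood gain."""
    return loglik - 0.5 * n_params * math.log(n_samples)

# Hypothetical candidate network structures fit to n = 1000 samples
# (the log-likelihoods are invented numbers for illustration):
n = 1000
candidates = {
    "sparse (8 params)": (-2140.0, 8),
    "medium (20 params)": (-2095.0, 20),
    "dense (60 params)": (-2090.0, 60),
}
best = max(candidates, key=lambda name: bic_score(*candidates[name], n))
print(best)
```

In this made-up example the dense structure fits best in raw likelihood, but its penalty outweighs the gain, which is exactly the bias-variance tension the abstract describes.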