A tutorial on support vector machines for pattern recognition
- Data Mining and Knowledge Discovery, 1998
Cited by 1656 (11 self)
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
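As a rough illustration of the two kernel families named in the abstract above, the sketch below evaluates a homogeneous polynomial kernel and a Gaussian radial basis function kernel on plain Python lists. The function names and the parameters `p` and `sigma` are assumptions made for this example; they are not taken from the tutorial.

```python
import math

def poly_kernel(x, y, p=2):
    """Homogeneous polynomial kernel: (x . y) ** p (p is an assumed default)."""
    return sum(a * b for a, b in zip(x, y)) ** p

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

Either function can stand in for the inner product in a linear method, which is the sense in which a kernel yields a solution that is nonlinear in the data.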
Kernel partial least squares regression in reproducing kernel Hilbert space
- Journal of Machine Learning Research, 2001
Cited by 75 (5 self)
A family of regularized least squares regression models in a Reproducing Kernel Hilbert Space is extended by the kernel partial least squares (PLS) regression model. Similar to principal components regression (PCR), PLS is a method based on the projection of input (explanatory) variables to the latent variables (components). However, in contrast to PCR, PLS creates the components by modeling the relationship between input and output variables while maintaining most of the information in the input variables. PLS is useful in situations where the number of explanatory variables exceeds the number of observations and/or a high level of multicollinearity among those variables is assumed. Motivated by this fact we will provide a kernel PLS algorithm for construction of nonlinear regression models in possibly high-dimensional feature spaces. We give the theoretical description of the kernel PLS algorithm and we experimentally compare the algorithm with the existing kernel PCR and kernel ridge regression techniques. We will demonstrate that on the data sets employed kernel PLS achieves the same results as kernel PCR but uses significantly fewer, qualitatively different components.
Artificial selection for increased wheel-running behavior in house mice
- BEHAV. GENET, 1998
Cited by 36 (15 self)
Replicated within-family selection for increased voluntary wheel running in outbred house mice (Mus domesticus; Hsd:ICR strain) was applied with four high-selected and four control lines (10 families/line). Mice were housed individually with access to activity wheels for a period of 6 days, and selection was based on the mean number of revolutions run on days 5 and 6. Prior to selection, heritabilities of mean revolutions run per day (rev/day), average running velocity (rpm), and number of minutes during which any activity occurred (min/day) were estimated by midparent-offspring regression. Heritabilities were 0.18, 0.28, and 0.14, respectively; the estimate for min/day did not differ significantly from zero. Ten generations of selection for increased rev/day resulted in an average 75% increase in activity in the four selected lines, as compared with control lines. Realized heritability averaged 0.19 (range, 0.12-0.24 for the high-activity lines), or 0.28 when adjusted for within-family selection. Rev/day increased mainly through changes in rpm rather than min/day. These lines will be studied for correlated responses in exercise physiology capacities and will be made available to other researchers on request.
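The two heritability estimates mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions: heritability as the least-squares slope of offspring values on midparent values, and realized heritability as response divided by selection differential. The function names are hypothetical, and the sketch does not reproduce the paper's adjustment for within-family selection.

```python
def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def midparent_offspring_h2(midparent, offspring):
    """Heritability estimate: slope of offspring means on midparent values."""
    return regression_slope(midparent, offspring)

def realized_h2(response, selection_differential):
    """Realized heritability: response to selection / selection differential."""
    return response / selection_differential
```

Any data passed to these functions would be the reader's own; no values from the study are assumed here.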
An Empirical Comparison of Static Concurrency Analysis Techniques
- 1996
Cited by 30 (6 self)
This paper reports the results of an empirical comparison of several static analysis tools for evaluating properties of concurrent software and also reports the results of our attempts to build predictive models for each of the tools based on program and property characteristics. Although this area seems well suited to empirical investigation, we encountered a number of significant issues that make designing a sound and unbiased study surprisingly difficult. These experiment design issues are also discussed in this paper.
Charting presence in virtual environments and its effects on performance
- DEPARTMENT OF INDUSTRIAL & SYSTEMS ENGINEERING. PH.D. DISSERTATION. VIRGINIA TECH, 1996
Cited by 13 (0 self)
Virtual reality (VR) involves an attempt to create an illusion that the user of the VR system is actually present in a synthetic (usually computer-generated) environment. Little is known about how various system parameters affect the illusion of presence in a virtual environment (VE). In particular, there seems to be very little quantitative data on which to base VR system design decisions. Also, while presence (or immersion) in VEs is a primary goal of VR, not much is known about how this variable affects task performance. The goal of this research was to provide a ratio-scale measure of perceived presence in a VE, to explore the effects of a number of environmental parameters on this measure and construct empirical models of these effects, and to relate perceived presence to user performance. This was done by manipulating eleven independent variables in a series of three experiments. The independent variables manipulated were scene update rate, visual display resolution, field of view, sound, textures, head-tracking, stereopsis, virtual personal risk, number of possible interactions, presence of a second user, and environmental detail. Participants performed a set of five tasks in the VE and rated perceived presence at the end of each set using the technique of free-modulus magnitude estimation. The amount of time spent in the VE was also recorded. The results
A Multi-resolution Manifold Distance for Invariant Image Similarity
Cited by 12 (0 self)
Accounting for spatial image transformations is a requirement for multimedia problems such as video classification and retrieval, face/object recognition or the creation of image mosaics from video sequences. We analyze a transformation invariant metric recently proposed in the machine learning literature to measure the distance between image manifolds, the tangent distance (TD), and show that it is closely related to alignment techniques from the motion analysis literature. Exposing these relationships results in benefits for the two domains. On one hand, it allows leveraging the knowledge acquired in the alignment literature to build better classifiers. On the other, it provides a new interpretation of alignment techniques as one component of a decomposition that has interesting properties for the classification of video. In particular, we embed the TD into a multi-resolution framework that makes it significantly less prone to local minima. The new metric, multi-resolution tangent distance (MRTD), can be easily combined with robust estimation procedures, and exhibits significantly higher invariance to image transformations than the TD and the Euclidean distance (ED). For classification, this translates into significant improvements in face recognition accuracy. For video characterization, it leads to a decomposition of image dissimilarity into “differences due to camera motion” plus “differences due to scene activity” that is useful for classification. Experimental results on a movie database indicate that the distance could be used as a basis for the extraction of semantic primitives such as action and romance.
Segmented regression estimators for massive data sets
- In Second SIAM International Conference on Data Mining, 2002
Cited by 9 (6 self)
We describe two methodologies for obtaining segmented regression estimators from massive training data sets. The first methodology, called Linear Regression Tree (LRT), is used for continuous response variables, and the second and complementary methodology, called Naive Bayes Tree (NBT), is used for categorical response variables. These are implemented in the IBM ProbE™ (Probabilistic Estimation) data mining engine, which is an object-oriented framework for building classes of segmented predictive models from massive training data sets. Based on this methodology, an application called ATM-SE™ for direct-mail targeted marketing has been developed jointly with Fingerhut Business Intelligence [1].
A delay damage model selection algorithm for NARX neural networks
- IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1997
Cited by 8 (1 self)
Recurrent neural networks have become popular models for system identification and time series prediction. Nonlinear autoregressive models with exogenous inputs (NARX) neural network models are a popular subclass of recurrent networks and have been used in many applications. Although embedded memory can be found in all recurrent network models, it is particularly prominent in NARX models. We show that using intelligent memory order selection through pruning and good initial heuristics significantly improves the generalization and predictive performance of these nonlinear systems on problems as diverse as grammatical inference and time series prediction.
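To make the notion of memory order in the abstract above concrete, the sketch below builds the lagged input vectors a NARX model would be trained on. It assumes the common form y(t) = f(y(t-1), ..., y(t-n_y), u(t-1), ..., u(t-n_u)); the function name `narx_regressors` and the parameters `n_u` and `n_y` are hypothetical, and the paper's pruning procedure is not reproduced.

```python
def narx_regressors(u, y, n_u, n_y):
    """Return (features, target) pairs for a NARX model.

    features = [y(t-1), ..., y(t-n_y), u(t-1), ..., u(t-n_u)], target = y(t).
    n_u and n_y are the input and output memory orders (assumed names).
    """
    start = max(n_u, n_y)  # first time step with a full history available
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, n_y + 1)]
        past_u = [u[t - k] for k in range(1, n_u + 1)]
        rows.append((past_y + past_u, y[t]))
    return rows
```

Shrinking `n_u` or `n_y` removes delay taps from every feature vector, which is the kind of model-size choice a memory order selection method has to make.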
Making Correct Statistical Inferences using a Wrong Probability Model
- 1995
Cited by 6 (3 self)
Large sample methods for estimating the variance of parameter estimates for hypothesis-testing purposes (White, 1982) and statistical tests for model selection (Vuong, 1989) when the statistical model is wrong (i.e., misspecified) are reviewed. A parallel distributed processing (PDP) statistical model for analyzing categorical time series data is then proposed, and a theorem establishing when the quasi-maximum likelihood estimates of the model are unique is stated and proved. Analyses of Golden et al.'s (1993) categorical time-series data with respect to the proposed PDP model showed that White's asymptotic statistical theory yielded results which were more consistent with bootstrap estimates than classical methods of statistical inference. Ideally, a statistical analysis should be as "model-independent" as possible, making a minimal number of assumptions about the nature of the data generating process. Such an analysis is exemplary of the class...

