Results 1–10 of 174
Generalized Additive Models
, 1990
"... Likelihood-based regression models, such as the normal linear regression model and the linear logistic model, assume a linear (or some other parametric) form for the covariate effects. We introduce the Local Scoring procedure which replaces the linear form Σ Xjβj by a sum of smooth functions Σ sj(Xj) ..."
Abstract

Cited by 1314 (33 self)
Likelihood-based regression models, such as the normal linear regression model and the linear logistic model, assume a linear (or some other parametric) form for the covariate effects. We introduce the Local Scoring procedure, which replaces the linear form Σ Xjβj by a sum of smooth functions Σ sj(Xj). The sj(.)'s are unspecified functions that are estimated using scatterplot smoothers. The technique is applicable to any likelihood-based regression model: the class of Generalized Linear Models contains many of these. In this class, the Local Scoring procedure replaces the linear predictor η = Σ Xjβj by the additive predictor Σ sj(Xj); hence, the name Generalized Additive Models. Local Scoring can also be applied to non-standard models like Cox's proportional hazards model for survival data. In a number of real data examples, the Local Scoring procedure proves to be useful in uncovering nonlinear covariate effects. It has the advantage of being completely automatic, i.e. no "detective work" is needed on the part of the statistician. In a further generalization, the technique is modified to estimate the form of the link function for generalized linear models. The Local Scoring procedure is shown to be asymptotically equivalent to Local Likelihood estimation, another technique for estimating smooth covariate functions. They are seen to produce very similar results with real data, with Local Scoring being considerably faster. As a theoretical underpinning, we view Local Scoring and Local Likelihood as empirical maximizers of the expected log-likelihood, and this makes clear their connection to standard maximum likelihood estimation. A method for estimating the "degrees of freedom" of the procedures is also given.
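To make the Local Scoring idea concrete: for a Gaussian model with identity link it reduces to backfitting, where each smooth function sj is re-estimated from partial residuals. A minimal sketch in Python, with a crude running-mean smoother standing in for the paper's scatterplot smoothers (the data, window size, and smoother choice are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def running_mean_smoother(x, r, window=21):
    # Crude scatterplot smoother: average the response over a sliding
    # window in the sorted order of x (a stand-in for loess or splines).
    order = np.argsort(x)
    r_sorted = r[order]
    kernel = np.ones(window)
    num = np.convolve(r_sorted, kernel, mode="same")
    den = np.convolve(np.ones_like(r_sorted), kernel, mode="same")
    out = np.empty_like(r)
    out[order] = num / den
    return out

def backfit(X, y, n_iter=20):
    # Backfitting for a Gaussian additive model y = alpha + sum_j s_j(X_j);
    # with the identity link, Local Scoring reduces to this loop.
    n, p = X.shape
    alpha = y.mean()
    s = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            partial_resid = y - alpha - s.sum(axis=1) + s[:, j]
            s[:, j] = running_mean_smoother(X[:, j], partial_resid)
            s[:, j] -= s[:, j].mean()  # center each s_j for identifiability
    return alpha, s

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.1, size=500)
alpha, s = backfit(X, y)
fitted = alpha + s.sum(axis=1)
```

The estimated s[:, 0] and s[:, 1] recover the sine and quadratic shapes without either form being specified in advance, which is the "no detective work" point of the abstract.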
Hierarchical mixtures of experts and the EM algorithm
 Neural Computation
, 1994
"... We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIMs). Learning is treated as a maximum likelihood ..."
Abstract

Cited by 723 (19 self)
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIMs). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
Minimum redundancy feature selection from microarray gene expression data. Journal of Bioinformatics and Computational Biology
, 2005
"... Selecting a small subset of genes out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their differential expressions among phenotypes and pick the top-ranked genes. We observe that feature sets ..."
Abstract

Cited by 118 (7 self)
Selecting a small subset of genes out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their differential expressions among phenotypes and pick the top-ranked genes. We observe that feature sets so obtained have a certain redundancy and study methods to minimize it. Feature sets obtained through the minimum-redundancy maximum-relevance framework represent a broader spectrum of phenotype characteristics than those obtained through standard ranking methods; they are more robust, generalize well to unseen data, and lead to significantly improved classifications in extensive experiments on five gene expression data sets.
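The greedy loop behind minimum-redundancy maximum-relevance selection can be sketched briefly. This version scores candidates as relevance minus mean redundancy (the difference form), with absolute Pearson correlation standing in for the mutual-information scores the paper computes on discretized expression data; the substitution and the toy data are assumptions of this sketch:

```python
import numpy as np

def mrmr(X, y, k):
    # Greedy minimum-redundancy maximum-relevance selection:
    # at each step pick the feature maximizing
    #   relevance(feature, y) - mean redundancy(feature, already selected).
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean(
                [abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected]
            )
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
f0 = rng.normal(size=300)
f2 = rng.normal(size=300)
f1 = f0 + rng.normal(scale=0.01, size=300)  # near-duplicate of f0
X = np.column_stack([f0, f1, f2])
y = 2 * f0 + f2
print(mrmr(X, y, 2))  # one of the duplicates, then the non-redundant f2
```

A pure top-ranking method would pick the two near-duplicate features (both highly relevant); the redundancy penalty instead brings in the weaker but complementary feature, which is the paper's central observation.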
Approximate Bayes Factors and Accounting for Model Uncertainty in Generalized Linear Models
, 1993
"... Ways of obtaining approximate Bayes factors for generalized linear models are described, based on the Laplace method for integrals. I propose a new approximation which uses only the output of standard computer programs such as GLIM; this appears to be quite accurate. A reference set of proper priors ..."
Abstract

Cited by 96 (28 self)
Ways of obtaining approximate Bayes factors for generalized linear models are described, based on the Laplace method for integrals. I propose a new approximation which uses only the output of standard computer programs such as GLIM; this appears to be quite accurate. A reference set of proper priors is suggested, both to represent the situation where there is not much prior information, and to assess the sensitivity of the results to the prior distribution. The methods can be used when the dispersion parameter is unknown, when there is overdispersion, to compare link functions, and to compare error distributions and variance functions. The methods can be used to implement the Bayesian approach to accounting for model uncertainty. I describe an application to inference about relative risks in the presence of control factors where model uncertainty is large and important. Software to implement the ...
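The Laplace method at the heart of these approximations can be checked on a one-parameter toy model where the marginal likelihood is known in closed form; a binomial likelihood with a uniform prior is used here purely for illustration (the sample size and count are made up, and this is not the paper's GLIM-output approximation):

```python
from math import lgamma, log, pi

# Laplace approximation to the marginal likelihood ("evidence") of a
# binomial model with a uniform Beta(1, 1) prior on p -- the kind of
# integral a Bayes factor compares across models. The exact answer is
# available here, so the approximation can be checked.
n, y = 50, 18  # made-up data: 18 successes in 50 trials

def log_post_unnorm(p):
    # log likelihood + log prior (the uniform prior contributes 0).
    return y * log(p) + (n - y) * log(1 - p)

p_hat = y / n  # posterior mode
# Negative second derivative of log_post_unnorm at the mode.
hess = y / p_hat**2 + (n - y) / (1 - p_hat) ** 2
log_evidence_laplace = log_post_unnorm(p_hat) + 0.5 * log(2 * pi / hess)

# Exact: integral of p^y (1-p)^(n-y) dp over [0, 1] is B(y+1, n-y+1).
log_evidence_exact = lgamma(y + 1) + lgamma(n - y + 1) - lgamma(n + 2)
print(log_evidence_laplace, log_evidence_exact)
```

The two log evidences agree to about two decimal places at this sample size; a Bayes factor is just the ratio (difference on the log scale) of two such evidences under competing models.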
A comparison of numerical optimizers for logistic regression
, 2003
"... Logistic regression is a workhorse of statistics and is closely related to methods used in Machine Learning, including the Perceptron and the Support Vector Machine. This note compares eight different algorithms for computing the maximum a posteriori parameter estimate. A full derivation of each alg ..."
Abstract

Cited by 84 (0 self)
Logistic regression is a workhorse of statistics and is closely related to methods used in Machine Learning, including the Perceptron and the Support Vector Machine. This note compares eight different algorithms for computing the maximum a posteriori parameter estimate. A full derivation of each algorithm is given. In particular, a new derivation of Iterative Scaling is given which applies more generally than the conventional one. A new derivation is also given for the Modified Iterative Scaling algorithm of Collins et al. (2002). Most of the algorithms operate in the primal space, but can also work in dual space. All algorithms are compared in terms of computational complexity by experiments on large data sets. The fastest algorithms turn out to be conjugate gradient ascent and quasi-Newton algorithms, which far outstrip Iterative Scaling and its variants.
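One of the second-order methods such a comparison covers is Newton's method (equivalently, iteratively reweighted least squares) for the MAP estimate under a Gaussian prior. A minimal sketch on synthetic data (the data, prior precision, and iteration count are illustrative assumptions):

```python
import numpy as np

# Newton's method for the MAP estimate of L2-regularized logistic
# regression: repeatedly solve H * step = gradient of the log posterior.
rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + 1 feature
true_w = np.array([-0.5, 2.0])                         # made-up ground truth
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-X @ true_w))).astype(float)

lam = 1e-3    # Gaussian prior precision (assumed value)
w = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ w))                 # predicted probabilities
    grad = X.T @ (y - p) - lam * w               # gradient of log posterior
    H = X.T * (p * (1 - p)) @ X + lam * np.eye(2)  # negative Hessian
    w += np.linalg.solve(H, grad)                # Newton step
print(w)
```

Each iteration costs a weighted least-squares solve, which is why the note's faster large-scale alternatives (conjugate gradient, quasi-Newton) avoid forming and inverting the Hessian explicitly.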
Taking Steps: The Influence of a Walking Technique on Presence in Virtual Reality
, 1995
"... This paper presents an interactive technique for moving through an immersive virtual environment (or "virtual reality"). The technique is suitable for applications where locomotion is restricted to ground level. The technique is derived from the idea that presence in virtual environments may be enha ..."
Abstract

Cited by 77 (9 self)
This paper presents an interactive technique for moving through an immersive virtual environment (or "virtual reality"). The technique is suitable for applications where locomotion is restricted to ground level. The technique is derived from the idea that presence in virtual environments may be enhanced the stronger the match between proprioceptive information from human body movements and sensory feedback from the computer-generated displays. The technique is an attempt to simulate body movements associated with walking. The participant "walks in place" to move through the virtual environment across distances greater than the physical limitations imposed by the electromagnetic tracking devices. A neural network is used to analyse the stream of coordinates from the head-mounted display, to determine whether or not the participant is walking on the spot. Whenever it detects this walking behaviour, the participant is moved through virtual space in the direction of gaze. We discuss tw...
Walking > Walking-in-Place > Flying, in Virtual Environments
, 1999
"... A study by Slater, et al. [1995] indicated that naive subjects in an immersive virtual environment experience a higher subjective sense of presence when they locomote by walking-in-place (virtual walking) than when they push-button-fly (along the floor plane). We replicated their study, adding real ..."
Abstract

Cited by 65 (12 self)
A study by Slater, et al. [1995] indicated that naive subjects in an immersive virtual environment experience a higher subjective sense of presence when they locomote by walking-in-place (virtual walking) than when they push-button-fly (along the floor plane). We replicated their study, adding real walking as a third condition. Our study confirmed their findings. We also found that real walking is significantly better than both virtual walking and flying in ease (simplicity, straightforwardness, naturalness) as a mode of locomotion. The greatest difference in subjective presence was between flyers and both kinds of walkers. In addition, subjective presence was higher for real walkers than virtual walkers, but the difference was statistically significant only in some models. Follow-on studies show virtual walking can be substantially improved by detecting footfalls with a head accelerometer. As in the Slater study, subjective presence significantly correlated with subjects' degree of...
"Is This Document Relevant? ...Probably": A Survey of Probabilistic Models in Information Retrieval
, 2001
"... This article surveys probabilistic approaches to modeling information retrieval. The basic concepts of probabilistic approaches to information retrieval are outlined and the principles and assumptions upon which the approaches are based are presented. The various models proposed in the developmen ..."
Abstract

Cited by 63 (14 self)
This article surveys probabilistic approaches to modeling information retrieval. The basic concepts of probabilistic approaches to information retrieval are outlined and the principles and assumptions upon which the approaches are based are presented. The various models proposed in the development of IR are described, classified, and compared using a common formalism. New approaches that constitute the basis of future research are described.
Categorical Data Analysis: Away from ANOVAs (transformation or not) and towards Logit Mixed Models
"... This paper identifies several serious problems with the widespread use of ANOVAs for the analysis of categorical outcome variables such as forced-choice variables, question-answer accuracy, choice in production (e.g. in syntactic priming research), et cetera. I show that even after applying the arc ..."
Abstract

Cited by 62 (6 self)
This paper identifies several serious problems with the widespread use of ANOVAs for the analysis of categorical outcome variables such as forced-choice variables, question-answer accuracy, choice in production (e.g. in syntactic priming research), et cetera. I show that even after applying the arcsine-square-root transformation to proportional data, ANOVA can yield spurious results. I discuss conceptual issues underlying these problems and alternatives provided by modern statistics. Specifically, I introduce ordinary logit models (i.e. logistic regression), which are well-suited to analyze categorical data and offer many advantages over ANOVA. Unfortunately, ordinary logit models do not include random effect modeling. To address this issue, I describe mixed logit models (Generalized Linear Mixed Models for binomially distributed outcomes, Breslow & Clayton, 1993), which combine the advantages of ordinary logit models with the ability to account for random subject and item effects in one step of analysis. Throughout the paper, I use a psycholinguistic data set to compare the different statistical methods.
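A small numeric illustration of why the logit link handles near-ceiling proportions better than the arcsine-square-root transform: the arcsine transform is bounded above by π, so it compresses differences between high accuracy scores, while the logit keeps spreading them out (the example proportions are made up):

```python
import numpy as np

# Compare the two transforms on proportions approaching ceiling.
props = np.array([0.5, 0.9, 0.98, 0.998])
arcsine = 2 * np.arcsin(np.sqrt(props))  # classical ANOVA route, bounded in [0, pi]
logit = np.log(props / (1 - props))      # logit link, unbounded
for p, a, l in zip(props, arcsine, logit):
    print(f"p={p:.3f}  arcsine={a:.3f}  logit={l:.3f}")
```

On the arcsine scale, 0.98 and 0.998 are nearly indistinguishable despite a tenfold difference in error rate, which is one route to the spurious ANOVA results the paper demonstrates; on the logit scale that difference remains visible.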