Results 11–20 of 701
Bivariate Quantile Smoothing Splines
Biometrika, 1998
Abstract

Cited by 128 (21 self)
It has long been recognized that the mean provides an inadequate summary while the set of quantiles can supply a more complete description of a sample. We introduce bivariate quantile smoothing splines, which belong to the space of bilinear tensor-product splines, as nonparametric estimators for the conditional quantile functions in a two-dimensional design space. The estimators can be computed using standard linear programming techniques and can further be used as building blocks for conditional quantile estimations in higher dimensions. For moderately large data sets, we recommend using penalized bivariate B-splines as approximate solutions. We use real and simulated data to illustrate the proposed methodology. KEY WORDS: Conditional quantile; Linear program; Nonparametric regression; Robust regression; Schwarz information criterion; Tensor-product spline. Xuming He is Associate Professor and Stephen Portnoy is Professor, Department of Statistics, University of Illinois, 725 S Wri...
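The abstract notes that the estimators can be computed with standard linear programming techniques. As an illustration of that connection, here is a minimal sketch of one-dimensional linear quantile regression posed as an LP (not the bivariate spline estimator itself); the function name `quantile_fit` and the toy data are hypothetical, and SciPy is assumed available.

```python
import numpy as np
from scipy.optimize import linprog

def quantile_fit(X, y, tau):
    """Fit a linear tau-th conditional quantile by linear programming.

    Decision variables: [b_plus, b_minus, u, v], all >= 0, with
    beta = b_plus - b_minus and residual y - X @ beta = u - v.
    Objective: tau * sum(u) + (1 - tau) * sum(v).
    """
    n, p = X.shape
    c = np.concatenate([np.zeros(2 * p),
                        tau * np.ones(n),
                        (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, method="highs")
    b = res.x
    return b[:p] - b[p:2 * p]

# Tiny check: the 0.5-quantile fit with only an intercept is the sample median.
y = np.array([1.0, 2.0, 10.0])
X = np.ones((3, 1))
beta = quantile_fit(X, y, 0.5)
```

Higher quantiles follow by changing `tau`; the same split-variable trick underlies LP formulations of the penalized spline problem.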
Nonlinear Gated Experts for Time Series: Discovering Regimes and Avoiding Overfitting
1995
Abstract

Cited by 110 (5 self)
This paper: ftp://ftp.cs.colorado.edu/pub/TimeSeries/MyPapers/experts.ps.Z
Convergence results for the EM Approach to Mixtures of Experts Architectures
Neural Networks, 1995
Abstract

Cited by 109 (6 self)
The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs recently proposed an EM algorithm for the mixture of experts architecture of Jacobs, Jordan, Nowlan and Hinton (1991) and the hierarchical mixture of experts architecture of Jordan and Jacobs (1992). They showed empirically that the EM algorithm for these architectures yields significantly faster convergence than gradient ascent. In the current paper we provide a theoretical analysis of this algorithm. We show that the algorithm can be regarded as a variable metric algorithm with its searching direction having a positive projection on the gradient of the log likelihood. We also analyze the convergence of the algorithm and provide an explicit expression for the convergence rate. In addition, we describe an acceleration technique that yields a significant speedup in simulation experiments.
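As a concrete, deliberately simplified illustration of the E- and M-steps discussed here, the sketch below runs EM for a two-component mixture of linear regressions, i.e. a gated-experts model whose gate is input-independent. The variable names and synthetic data are hypothetical, and this is not the Jordan–Jacobs algorithm itself (which uses an input-dependent softmax gate fitted by IRLS).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-regime data (hypothetical): y = +2x or y = -2x plus noise.
n = 400
x = rng.uniform(-1, 1, n)
regime = rng.random(n) < 0.5
y = np.where(regime, 2.0 * x, -2.0 * x) + 0.1 * rng.standard_normal(n)
X = np.column_stack([x, np.ones(n)])          # slope and intercept columns

# Two linear "experts" with an input-independent gate pi -- a mixture of
# linear regressions, a simplification of the gated-experts architecture.
W = np.array([[1.0, 0.0], [-1.0, 0.0]])       # initial expert coefficients
pi = np.array([0.5, 0.5])
sigma2 = 1.0

for _ in range(50):
    # E-step: posterior responsibility h[i, j] of expert j for point i.
    resid = y[:, None] - X @ W.T              # shape (n, 2)
    logp = -0.5 * resid**2 / sigma2 + np.log(pi)
    logp -= logp.max(axis=1, keepdims=True)
    h = np.exp(logp)
    h /= h.sum(axis=1, keepdims=True)
    # M-step: weighted least squares per expert; update gate and noise.
    for j in range(2):
        sw = np.sqrt(h[:, j])
        W[j] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    pi = h.mean(axis=0)
    resid = y[:, None] - X @ W.T
    sigma2 = (h * resid**2).sum() / n

slopes = sorted(W[:, 0])                      # should approach [-2, +2]
```

Each M-step here is a closed-form weighted least-squares solve, which is the structural reason EM for these architectures can outstrip plain gradient ascent.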
Smoothing Spline ANOVA for Exponential Families, with Application to the Wisconsin Epidemiological Study of Diabetic Retinopathy
Ann. Statist., 1995
Abstract

Cited by 99 (45 self)
Let y_i, i = 1, …, n, be independent observations with the density of y_i of the form h(y_i, f_i) = exp[y_i f_i − b(f_i) + c(y_i)], where b and c are given functions and b is twice continuously differentiable and bounded away from 0. Let f_i = f(t(i)), where t = (t_1, …, t_d) ∈ T^(1) ⊗ ⋯ ⊗ T^(d) = T, the T^(α) are measurable spaces of rather general form, and f is an unknown function on T with some assumed 'smoothness' properties. Given {y_i, t(i), i = 1, …, n}, it is desired to estimate f(t) for t in some region of interest contained in T. We develop the fitting of smoothing spline ANOVA models to this data of the form f(t) = C + Σ_α f_α(t_α) + Σ_{α<β} f_{αβ}(t_α, t_β) + ⋯. The components of the decomposition satisfy side conditions which generalize the usual side conditions for parametric ANOVA. The estimate of f is obtained as the minimizer...
Fast High-Dimensional Approximation with Sparse Occupancy Trees
2010
Abstract

Cited by 94 (9 self)
This paper is concerned with scattered data approximation in high dimensions: given a data set X ⊂ R^d of N data points x_i along with values y_i ∈ R^d, i = 1, …, N, and viewing the y_i as values y_i = f(x_i) of some unknown function f, we wish to return for any query point x ∈ R^d an approximation f̂(x) to y = f(x). Here the spatial dimension d should be thought of as large. We wish to emphasize that we do not seek a representation of f̂ in terms of a fixed set of trial functions but define f̂ through recovery schemes which, in the first place, are designed to be fast and to deal efficiently with large data sets. For this purpose we propose new methods based on what we call sparse occupancy trees and piecewise linear schemes based on simplex subdivisions.
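The occupancy idea can be pictured with a piecewise-constant toy variant: hash each data point into a dyadic cell of [0,1)^d and store averages only for the cells that actually contain data. The function names and data below are hypothetical, and the paper's actual schemes use simplex subdivisions and piecewise linear recovery, not this constant-per-cell sketch.

```python
import numpy as np
from collections import defaultdict

def build(X, y, level):
    """Average y over occupied dyadic cells of [0,1)^d at the given level.
    Only cells containing data are stored -- the 'sparse occupancy' idea,
    so storage scales with N, not with the 2**(level*d) total cells."""
    cells = defaultdict(list)
    for xi, yi in zip(X, y):
        key = tuple((xi * 2**level).astype(int))
        cells[key].append(yi)
    return {k: float(np.mean(v)) for k, v in cells.items()}

def query(cells, x, level):
    """Return the cell average for a query point, or None if its cell is empty."""
    return cells.get(tuple((np.asarray(x) * 2**level).astype(int)))

rng = np.random.default_rng(1)
X = rng.random((1000, 3))            # d = 3 here; the scheme targets large d
y = X.sum(axis=1)                    # f(x) = x1 + x2 + x3
tree = build(X, y, level=2)
approx = query(tree, [0.5, 0.5, 0.5], level=2)
```

A practical variant falls back to coarser levels when a query lands in an empty cell, which is how tree-structured recovery stays defined everywhere.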
Learning the Empirical Hardness of Optimization Problems: The case of combinatorial auctions
In CP, 2002
Abstract

Cited by 78 (23 self)
We propose a new approach to understanding the algorithm-specific empirical hardness of optimization problems. In this work we focus on the empirical hardness of the winner determination problem, an optimization problem arising in combinatorial auctions, when solved by ILOG's CPLEX software. We consider nine widely used problem distributions and sample randomly from a continuum of parameter settings for each distribution. First, we contrast the overall empirical hardness of the different distributions. Second, we identify a large number of distribution-nonspecific features of data instances and use statistical regression techniques to learn, evaluate and interpret a function from these features to the predicted hardness of an instance.
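The regression step described at the end can be sketched in a few lines: fit a linear model from instance features to log runtime. The features and runtimes below are synthetic stand-ins, not the paper's CPLEX data or its feature set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance features (stand-ins for e.g. bid-graph statistics)
# and runtimes that depend log-linearly on them, plus noise.
n = 200
feats = rng.random((n, 3))
log_runtime = (1.0 + 4.0 * feats[:, 0] - 2.0 * feats[:, 1]
               + 0.1 * rng.standard_normal(n))

# Learn a linear predictor of log runtime from the features.
A = np.column_stack([np.ones(n), feats])      # intercept + features
coef, *_ = np.linalg.lstsq(A, log_runtime, rcond=None)
```

Working in log runtime is the usual choice for hardness models, since solver runtimes span orders of magnitude across instances.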
Graphical models and automatic speech recognition
Mathematical Foundations of Speech and Language Processing, 2003
Abstract

Cited by 78 (15 self)
Graphical models provide a promising paradigm to study both existing and novel techniques for automatic speech recognition. This paper first provides a brief overview of graphical models and their uses as statistical models. It is then shown that the statistical assumptions behind many pattern recognition techniques commonly used as part of a speech recognition system can be described by a graph – this includes Gaussian distributions, mixture models, decision trees, factor analysis, principal component analysis, linear discriminant analysis, and hidden Markov models. Moreover, this paper shows that many advanced models for speech recognition and language processing can also be simply described by a graph, including many at the acoustic, pronunciation, and language-modeling levels. A number of speech recognition techniques born directly out of the graphical-models paradigm are also surveyed. Additionally, this paper includes a novel graphical analysis regarding why derivative (or delta) features improve hidden Markov model-based speech recognition by improving structural discriminability. It also includes an example where a graph can be used to represent language model smoothing constraints. As will be seen, the space of models describable by a graph is quite large. A thorough exploration of this space should yield techniques that ultimately will supersede the hidden Markov model.
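Among the graph-describable models the survey lists, the hidden Markov model admits the shortest demo: the forward algorithm computes the likelihood of an observation sequence by message passing along the chain. The transition and emission numbers below are made up for illustration.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm: total likelihood of a discrete observation
    sequence under an HMM with initial distribution pi, transition
    matrix A, and emission matrix B (states x symbols)."""
    alpha = pi * B[:, obs[0]]            # initialize with first observation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate, then weight by emission
    return alpha.sum()

# Tiny 2-state, 2-symbol HMM (hypothetical numbers).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p = forward(pi, A, B, [0, 1, 0])
```

The same recursion is an instance of sum-product message passing on the HMM's graph, which is the unifying view the paper develops.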
Dimension-Adaptive Tensor-Product Quadrature
Computing, 2003
Abstract

Cited by 74 (12 self)
We consider the numerical integration of multivariate functions defined over the unit hypercube. Here, we especially address the high-dimensional case, where in general the curse of dimension is encountered. Due to the concentration of measure phenomenon, such functions can often be well approximated by sums of lower-dimensional terms. The problem, however, is to find a good expansion given little knowledge of the integrand itself.
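For contrast with the adaptive construction, a full (non-adaptive) tensor-product rule makes the curse of dimension concrete: n one-dimensional nodes become n^d multivariate nodes. A minimal sketch, with a hypothetical function name:

```python
import numpy as np
from itertools import product

def tensor_quadrature(f, nodes, weights, d):
    """Full tensor-product quadrature on [0,1]^d built from a 1-D rule.
    The node count grows as len(nodes)**d -- exactly the cost blowup
    that dimension-adaptive methods are designed to avoid."""
    total = 0.0
    for idx in product(range(len(nodes)), repeat=d):
        x = np.array([nodes[i] for i in idx])
        w = np.prod([weights[i] for i in idx])
        total += w * f(x)
    return total

# 1-D 3-point Gauss-Legendre rule mapped from [-1, 1] to [0, 1].
g_nodes, g_weights = np.polynomial.legendre.leggauss(3)
nodes = 0.5 * (g_nodes + 1.0)
weights = 0.5 * g_weights

# Integrate f(x) = prod_i x_i over [0,1]^3; exact value is (1/2)**3.
val = tensor_quadrature(lambda x: np.prod(x), nodes, weights, d=3)
```

A dimension-adaptive scheme replaces this full product with a telescoping sum of difference rules, spending nodes only in the dimensions and interactions that matter.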