Results 11–20 of 189
Prediction risk and architecture selection for neural networks
, 1994
"... Abstract. We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimati ..."
Abstract

Cited by 75 (2 self)
 Add to MetaCart
We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimating the quality of model predictions and for model selection. Prediction risk estimation and model selection are especially important for problems with limited data. Techniques for estimating prediction risk include data resampling algorithms such as nonlinear cross-validation (NCV) and algebraic formulae such as the predicted squared error (PSE) and generalized prediction error (GPE). We show that exhaustive search over the space of network architectures is computationally infeasible even for networks of modest size. This motivates the use of heuristic strategies that dramatically reduce the search complexity. These strategies employ directed search algorithms, such as selecting the number of nodes via sequential network construction (SNC) and pruning inputs and weights via sensitivity-based pruning (SBP) and optimal brain damage (OBD), respectively.
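Criteria such as PSE trade training error against model complexity. A minimal sketch in the spirit of such a criterion (the function name and toy numbers are illustrative, and `sigma2`, a noise-variance estimate, is assumed to be supplied externally):

```python
def pse(residuals, n_params, sigma2):
    """Predicted-squared-error-style score: average squared training
    error plus a 2 * sigma2 * p / n complexity penalty."""
    n = len(residuals)
    ase = sum(r * r for r in residuals) / n
    return ase + 2.0 * sigma2 * n_params / n

# A bigger network fits the training data better (smaller residuals)
# but pays a larger penalty; the criterion trades the two off.
small_net = pse([0.5, -0.4, 0.3, -0.2], n_params=3, sigma2=0.1)
large_net = pse([0.1, -0.1, 0.1, -0.1], n_params=40, sigma2=0.1)
```

Architecture selection then amounts to preferring the candidate with the smaller score.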
Fast implementations of nonparametric curve estimators
, 1993
"... Recent proposals for implementation of kernel based nonparametric curve estimators are seen to be faster than naive direct implementations by factors up into the hundreds. The main ideas behind two different approaches of this type are made clear. Careful speed comparisons in a variety of settings, ..."
Abstract

Cited by 68 (11 self)
 Add to MetaCart
Recent proposals for implementing kernel-based nonparametric curve estimators are seen to be faster than naive direct implementations by factors up into the hundreds. The main ideas behind two different approaches of this type are made clear. Careful speed comparisons are made in a variety of settings, using a variety of machines and software. Various issues of computational accuracy and stability are also discussed. The fast methods are seen to be somewhat better than methods traditionally considered very fast, such as LOWESS and smoothing splines.
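Such speedups typically come from binning: the data are mapped onto an equally spaced grid once, after which the smoothing pass touches only grid counts, giving roughly O(n + g·k) work instead of the naive O(n·g). A hypothetical sketch with a Gaussian kernel truncated at four bandwidths (names and constants are illustrative, not the papers' implementations):

```python
import math

def binned_kernel_density(data, grid, h):
    """Fast kernel density estimate: bin data onto the grid in one
    pass, then convolve the bin counts with precomputed weights."""
    g = len(grid)
    lo, step = grid[0], grid[1] - grid[0]
    counts = [0.0] * g
    for x in data:                       # single pass over the data
        i = int(round((x - lo) / step))
        if 0 <= i < g:
            counts[i] += 1.0
    k = int(4 * h / step) + 1            # truncate beyond ~4 bandwidths
    w = [math.exp(-0.5 * (j * step / h) ** 2) / (h * math.sqrt(2 * math.pi))
         for j in range(-k, k + 1)]
    n = len(data)
    return [sum(counts[i + j] * w[j + k]
                for j in range(-k, k + 1) if 0 <= i + j < g) / n
            for i in range(g)]

data = [0.1 * (i % 11) for i in range(110)]   # toy sample on [0, 1]
grid = [-3.0 + 0.1 * j for j in range(61)]    # equally spaced grid
dens = binned_kernel_density(data, grid, h=0.4)
```

The accuracy cost of binning is small when the grid spacing is fine relative to the bandwidth.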
Local polynomial kernel regression for generalized linear models and quasi-likelihood functions
 Journal of the American Statistical Association, 90
, 1995
"... were introduced as a means of extending the techniques of ordinary parametric regression to several commonlyused regression models arising from nonnormal likelihoods. Typically these models have a variance that depends on the mean function. However, in many cases the likelihood is unknown, but the ..."
Abstract

Cited by 57 (7 self)
 Add to MetaCart
Generalized linear models were introduced as a means of extending the techniques of ordinary parametric regression to several commonly used regression models arising from non-normal likelihoods. Typically these models have a variance that depends on the mean function. However, in many cases the likelihood is unknown, but the relationship between mean and variance can be specified. This has led to the consideration of quasi-likelihood methods, where the conditional log-likelihood is replaced by a quasi-likelihood function. In this article we investigate the extension of the nonparametric regression technique of local polynomial fitting with a kernel weight to these more general contexts. In the ordinary regression case, local polynomial fitting has been seen to possess several appealing features in terms of intuitive and mathematical simplicity. One noteworthy feature is its better performance near the boundaries compared to traditional kernel regression estimators. These properties are shown to carry over to the generalized linear model and quasi-likelihood settings. The end result is a class of kernel-type estimators for smoothing in quasi-likelihood models. These estimators can be viewed as a straightforward generalization of the usual parametric estimators. In addition, their simple asymptotic distributions allow for simple interpretation.
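In the ordinary regression case, local polynomial fitting at a point reduces to a kernel-weighted least-squares fit. A minimal local linear sketch (Gaussian weights assumed; function name is illustrative). Because the fit is a genuine weighted line, it reproduces exactly linear data everywhere, including at the boundary, which is one way to see the boundary advantage mentioned above:

```python
import math

def local_linear(x, y, x0, h):
    """Fit y ~ b0 + b1*(x - x0) by weighted least squares with
    Gaussian kernel weights; return b0, the estimate at x0."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        u = xi - x0
        w = math.exp(-0.5 * (u / h) ** 2)
        s0 += w; s1 += w * u; s2 += w * u * u
        t0 += w * yi; t1 += w * u * yi
    # closed-form solution of the 2x2 weighted normal equations
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

xs = [i / 10 for i in range(-20, 21)]
ys = [2.0 * v + 1.0 for v in xs]          # exactly linear data
est = local_linear(xs, ys, 0.3, h=0.5)    # recovers 2*0.3 + 1 = 1.6
```

In the quasi-likelihood extension the squared-error objective is replaced by a locally weighted quasi-likelihood, but the structure of the fit is analogous.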
Analysis and Decomposition of Spatial Variation in Integrated Circuit Processes and Devices
 IEEE Transactions on Semiconductor Manufacturing
, 1997
"... Variation is a key concern in semiconductor manufacturing and is manifest in several forms. Spatial variation across each wafer results from equipment or process limitations, and variation within each die may be exacerbated further by complex pattern dependencies. Spatial variation information is im ..."
Abstract

Cited by 54 (5 self)
 Add to MetaCart
Variation is a key concern in semiconductor manufacturing and is manifest in several forms. Spatial variation across each wafer results from equipment or process limitations, and variation within each die may be exacerbated further by complex pattern dependencies. Spatial variation information is important not only for process optimization and control, but also for design of circuits that are robust to such variation. Systematic and random components of the variation must be identified, and models relating the spatial variation to specific process and pattern causes are needed. In this work, extraction and modeling methods are described for wafer-level, die-level, and wafer-die interaction contributions to spatial variation. Wafer-level estimation methods include filtering, spline, and regression-based approaches. Die-level (or intra-die) variation can be extracted using spatial Fourier transform methods; important issues include spectral interpolation and sampling requirements. Finally, the interaction between wafer- and die-level effects is important to fully capture and separate systematic versus random variation; spline and frequency-based methods are proposed for this modeling. Together, these provide an effective collection of methods to identify and model spatial variation for future use in process control to reduce systematic variation, and in process/device design to produce more robust circuits.
Statistical Prediction of Task Execution Times Through Analytic Benchmarking for Scheduling in a Heterogeneous Environment
 IEEE Transactions on Computers
, 1999
"... In this paper, a method for estimating task execution times is presented, in order to facilitate dynamic scheduling in a heterogeneous metacomputing environment. Execution time is treated as a random variable and is statistically estimated from past observations. This method predicts the execution t ..."
Abstract

Cited by 44 (2 self)
 Add to MetaCart
In this paper, a method for estimating task execution times is presented, in order to facilitate dynamic scheduling in a heterogeneous metacomputing environment. Execution time is treated as a random variable and is statistically estimated from past observations. This method predicts the execution time as a function of several parameters of the input data, and does not require any direct information about the algorithms used by the tasks or the architecture of the machines. Techniques based upon the concept of analytic benchmarking/code profiling [7] are used to accurately determine the performance differences between machines, allowing observations to be shared between machines. Experimental results using real data are presented.
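The core statistical step can be illustrated as a least-squares fit of execution time against an input-data parameter, with a relative machine-speed factor standing in for the analytic benchmarking/code profiling step (a toy stand-in, not the paper's full method; all names are invented):

```python
def fit_time_model(sizes, times):
    """Ordinary least squares t ~ a + b*n over past
    (input size, observed time) pairs."""
    m = len(sizes)
    mx = sum(sizes) / m
    my = sum(times) / m
    b = (sum((s - mx) * (t - my) for s, t in zip(sizes, times))
         / sum((s - mx) ** 2 for s in sizes))
    a = my - b * mx
    return a, b

def predict_time(a, b, n, speed_factor=1.0):
    """Share observations across machines by rescaling with a
    relative speed factor (assumed known from benchmarking)."""
    return (a + b * n) / speed_factor

a, b = fit_time_model([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
pred = predict_time(a, b, 5.0, speed_factor=2.0)  # twice-as-fast machine
```

Treating the prediction as a random variable, one would also carry an error estimate alongside the point prediction; that part is omitted here.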
The Relationship between PAC, the Statistical Physics framework, the Bayesian framework, and the VC framework
"... This paper discusses the intimate relationships between the supervised learning frameworks mentioned in the title. In particular, it shows how all those frameworks can be viewed as particular instances of a single overarching formalism. In doing this many commonly misunderstood aspects of those fram ..."
Abstract

Cited by 40 (7 self)
 Add to MetaCart
This paper discusses the intimate relationships between the supervised learning frameworks mentioned in the title. In particular, it shows how all of those frameworks can be viewed as particular instances of a single overarching formalism. In doing this, many commonly misunderstood aspects of those frameworks are explored. In addition, the strengths and weaknesses of those frameworks are compared, and some novel frameworks are suggested (resulting, for example, in a "correction" to the familiar bias-plus-variance formula).
Selecting the Number of Knots For Penalized Splines
, 2000
"... Penalized splines, or Psplines, are regression splines fit by leastsquares with a roughness penaly. Psplines have much in common with smoothing splines, but the type of penalty used with a Pspline is somewhat more general than for a smoothing spline. Also, the number and location of the knots ..."
Abstract

Cited by 40 (7 self)
 Add to MetaCart
Penalized splines, or P-splines, are regression splines fit by least-squares with a roughness penalty. P-splines have much in common with smoothing splines, but the type of penalty used with a P-spline is somewhat more general than for a smoothing spline. Also, the number and location of the knots of a P-spline are not fixed as they are with a smoothing spline. Generally, the knots of a P-spline are at fixed quantiles of the independent variable and the only tuning parameter to choose is the number of knots. In this article, the effects of the number of knots on the performance of P-splines are studied. Two algorithms are proposed for the automatic selection of the number of knots. The myopic algorithm stops when no improvement in the generalized cross-validation statistic (GCV) is noticed with the last increase in the number of knots. The full search examines all candidates in a fixed sequence of possible numbers of knots and chooses the candidate that minimizes GCV. The myopic algo...
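The myopic stopping rule can be sketched as follows, using a truncated-power basis with a ridge penalty on the knot coefficients and GCV = n·RSS / (n − tr(H))². The function names, default penalty, and candidate sequence are illustrative, not the article's implementation:

```python
import numpy as np

def pspline_gcv(x, y, n_knots, degree=2, lam=1.0):
    """GCV score of a penalized spline: truncated-power basis with
    knots at quantiles of x, ridge penalty on knot coefficients."""
    n = len(x)
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    X = np.column_stack([x ** d for d in range(degree + 1)] +
                        [np.clip(x - k, 0, None) ** degree for k in knots])
    D = np.diag([0.0] * (degree + 1) + [1.0] * n_knots)
    H = X @ np.linalg.solve(X.T @ X + lam * D, X.T)   # hat matrix
    rss = float(np.sum((y - H @ y) ** 2))
    return n * rss / (n - float(np.trace(H))) ** 2

def myopic_knots(x, y, candidates):
    """Stop at the first candidate that fails to lower GCV."""
    best_k, best_g = candidates[0], pspline_gcv(x, y, candidates[0])
    for k in candidates[1:]:
        g = pspline_gcv(x, y, k)
        if g >= best_g:
            break
        best_k, best_g = k, g
    return best_k

x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + 0.05 * np.sin(53.0 * x)   # smooth + wiggle
best = myopic_knots(x, y, [2, 5, 10, 20])
```

The full-search variant would simply evaluate GCV at every candidate and take the argmin instead of stopping early.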
Wavelet Methods For Curve Estimation
, 1994
"... The theory of wavelets is a developing branch of mathematics with a wide range of potential applications. Compactly supported wavelets are particularly interesting because of their natural ability to represent data with intrinsically local properties. They are useful for the detection of edges and ..."
Abstract

Cited by 37 (7 self)
 Add to MetaCart
The theory of wavelets is a developing branch of mathematics with a wide range of potential applications. Compactly supported wavelets are particularly interesting because of their natural ability to represent data with intrinsically local properties. They are useful for the detection of edges and singularities in image and sound analysis, and for data compression. However, most of the wavelet-based procedures currently available do not explicitly account for the presence of noise in the data. A discussion of how this can be done in the setting of some simple nonparametric curve estimation problems is given. Wavelet analogues of some familiar kernel and orthogonal series estimators are introduced and their finite sample and asymptotic properties are studied. We discover that there is a fundamental instability in the asymptotic variance of wavelet estimators caused by the lack of translation invariance of the wavelet transform. This is related to the properties of certain lacunary seq...
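Accounting for noise in wavelet curve estimation typically means thresholding the detail coefficients before inverting the transform. A self-contained sketch with the Haar wavelet, the simplest compactly supported case (hard thresholding assumed, signal length a power of two; not any specific estimator from the paper):

```python
import math

def haar_step(c):
    """One analysis level: pairwise averages a and differences d."""
    s = math.sqrt(2.0)
    a = [(c[2 * i] + c[2 * i + 1]) / s for i in range(len(c) // 2)]
    d = [(c[2 * i] - c[2 * i + 1]) / s for i in range(len(c) // 2)]
    return a, d

def haar_denoise(y, thresh):
    """Transform, hard-threshold the detail coefficients, invert."""
    s = math.sqrt(2.0)
    a, details = list(y), []
    while len(a) > 1:
        a, d = haar_step(a)
        details.append([v if abs(v) > thresh else 0.0 for v in d])
    for d in reversed(details):          # synthesis (inverse transform)
        a = [v for ai, di in zip(a, d)
             for v in ((ai + di) / s, (ai - di) / s)]
    return a

noisy = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 0.9]
smooth = haar_denoise(noisy, thresh=0.5)
```

With the threshold set to zero the transform is orthonormal and inverts exactly; the translation-dependence noted in the abstract shows up here too, since shifting the input by one sample pairs the data differently.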
Using Mutation Analysis for Assessing and Comparing Testing Coverage Criteria
 IEEE Transactions on Software Engineering
, 2006
"... The empirical assessment of test techniques plays an important role in software testing research. One common practice is to seed faults in subject software, either manually or by using a program that generates all possible mutants based on a set of mutation operators. The latter allows the systemati ..."
Abstract

Cited by 35 (5 self)
 Add to MetaCart
The empirical assessment of test techniques plays an important role in software testing research. One common practice is to seed faults in subject software, either manually or by using a program that generates all possible mutants based on a set of mutation operators. The latter allows the systematic, repeatable seeding of large numbers of faults, thus facilitating the statistical analysis of fault detection effectiveness of test suites; however, we do not know whether empirical results obtained this way lead to valid, representative conclusions. Focusing on four common control and data flow criteria (Block, Decision, C-Use, and P-Use), this paper investigates this important issue based on a middle-size industrial program with a comprehensive pool of test cases and known faults. Based on the data available thus far, the results are very consistent across the investigated criteria as they show that the use of mutation operators is yielding trustworthy results: generated mutants can be used to predict the detection effectiveness of real faults. Applying such a mutation analysis, we then investigate the relative cost and effectiveness of the above-mentioned criteria by revisiting fundamental questions regarding the relationships between fault detection, test suite size, and control/data flow coverage. Although such questions have been partially investigated in previous studies, we can use a large number of mutants, which helps decrease the impact of random variation in our analysis and allows us to use a different analysis approach. Our results are then compared with published studies, plausible reasons for the differences are provided, and the research leads us to suggest a way to tune the mutation analysis process to possible differences in fault detection probabilities in a specific environment.
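The score underlying such analyses is the fraction of generated mutants a test suite kills. A toy illustration (the functions, operator mutations, and test inputs are invented for the example, not the paper's subject program or tool):

```python
def original(a, b):
    return a + b

def mutant_sub(a, b):    # "+" mutated to "-"
    return a - b

def mutant_mul(a, b):    # "+" mutated to "*"
    return a * b

def mutation_score(mutants, reference, tests):
    """Fraction of mutants killed: a mutant is killed when some test
    input makes its output differ from the reference's output."""
    killed = sum(any(m(*t) != reference(*t) for t in tests)
                 for m in mutants)
    return killed / len(mutants)

mutants = [mutant_sub, mutant_mul]
weak = mutation_score(mutants, original, tests=[(2, 2)])    # 2*2 == 2+2
strong = mutation_score(mutants, original, tests=[(2, 3)])
```

The weak suite misses the multiplication mutant because 2 + 2 and 2 * 2 coincide, which is exactly the kind of coincidental correctness that larger mutant pools help average out.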
Spatially-adaptive penalties for spline fitting
 Australian and New Zealand Journal of Statistics
, 2000
"... We study spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. Our estimates are pth degree piecewise polynomials with p − 1 continuous derivatives. A large and fixed number of knots is used and smoothing is achieved by putting a quadratic penalty ..."
Abstract

Cited by 34 (6 self)
 Add to MetaCart
We study spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. Our estimates are pth degree piecewise polynomials with p − 1 continuous derivatives. A large and fixed number of knots is used and smoothing is achieved by putting a quadratic penalty on the jumps of the pth derivative at the knots. To be spatially adaptive, the logarithm of the penalty is itself a linear spline but with relatively few knots and with values at the knots chosen to minimize GCV. This locally adaptive spline estimator is compared with other spline estimators in the literature such as cubic smoothing splines and knot-selection techniques for least-squares regression. Our estimator can be interpreted as an empirical Bayes estimate for a prior allowing spatial heterogeneity. In cases of spatially heterogeneous regression functions, ...