Results 11–20 of 206
Making Inferences with Small Numbers of Training Sets
2002
Cited by 25 (1 self)
Abstract: This paper discusses a potential methodological problem with empirical studies assessing project effort prediction systems. Frequently a holdout strategy is deployed so that the data set is split into a training and a validation set. Inferences are then made concerning the relative accuracy of the different prediction techniques under examination. Typically this is done on very small numbers of sampled training sets.
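The instability described in this abstract is easy to demonstrate. The sketch below repeats a holdout split over a small, hypothetical set of project-effort values; the trivial mean predictor and the MAE metric are illustrative stand-ins, not the paper's prediction systems. Accuracy estimates swing widely from split to split, which is why rankings drawn from a handful of sampled training sets are fragile.

```python
import random
import statistics

def holdout_error(data, train_frac=0.7, seed=0):
    """One holdout evaluation: split the data, fit a trivial mean
    predictor on the training part, score MAE on the validation part."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    train, valid = shuffled[:cut], shuffled[cut:]
    prediction = statistics.mean(train)  # placeholder "model"
    return statistics.mean(abs(v - prediction) for v in valid)

# Hypothetical project effort values (person-months).
efforts = [12.0, 8.5, 30.0, 22.0, 5.0, 17.5, 40.0, 9.0, 26.0, 14.0]

# Twenty different sampled training sets give twenty different
# accuracy estimates; a comparison based on one or two is unreliable.
errors = [holdout_error(efforts, seed=s) for s in range(20)]
print(min(errors), max(errors))
```

The spread between the smallest and largest error illustrates the between-sample variation that the paper argues must be accounted for before inferring that one technique is more accurate than another.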
Arbitrating Among Competing Classifiers Using Learned Referees
Knowledge and Information Systems, 1998
Cited by 23 (0 self)
Abstract: The situation in which the results of several different classifiers and learning algorithms are obtainable for a single classification problem is common. In this paper, we propose a method that takes a collection of existing classifiers and learning algorithms, together with a set of available data, and creates a combined classifier that takes advantage of all of these sources of knowledge. The basic idea is that each classifier has a particular subdomain for which it is most reliable. Therefore, we induce a referee for each classifier, which describes its area of expertise. Given such a description, we arbitrate between the component classifiers by using the most reliable classifier for the examples in each subdomain. In experiments in several domains, we found such arbitration to be significantly more effective than various voting techniques which do not seek out subdomains of expertise. Our results further suggest that the more fine-grained the analysis of the areas of expertise of the competing classifiers, the more effectively they can be combined. In particular, we find that classification accuracy increases greatly when using intermediate subconcepts from the classifiers themselves as features for the induction of referees.
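The referee idea can be sketched in a toy domain. This is not the paper's induction procedure (which learns referees with a full learning algorithm); here a "referee" is simply each classifier's empirical accuracy per region of the input space, and arbitration routes each example to the classifier whose referee reports the highest reliability there. The regions, classifiers, and labels are all hypothetical.

```python
def true_label(x):
    # Toy ground truth: class 1 on the low half, class 0 on the high half.
    return 1 if x < 0.5 else 0

# Two deliberately complementary base classifiers.
classifiers = {"A": lambda x: 1,   # reliable only on the low region
               "B": lambda x: 0}   # reliable only on the high region

def region(x):
    return "low" if x < 0.5 else "high"

train = [i / 10 for i in range(10)]

# Induce a referee per classifier: its observed accuracy in each region.
referees = {}
for name, clf in classifiers.items():
    referees[name] = {}
    for r in ("low", "high"):
        xs = [x for x in train if region(x) == r]
        referees[name][r] = sum(clf(x) == true_label(x) for x in xs) / len(xs)

def arbitrate(x):
    # Use the classifier whose referee is most confident in x's region.
    best = max(classifiers, key=lambda n: referees[n][region(x)])
    return classifiers[best](x)

test_points = [0.05, 0.25, 0.55, 0.95]
print([arbitrate(x) for x in test_points])  # → [1, 1, 0, 0]
```

Each base classifier alone is right only half the time, while arbitration is right everywhere, mirroring the abstract's claim that exploiting subdomains of expertise beats global voting.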
A generalized approach to portfolio optimization: Improving performance by constraining portfolio norms
Management Science
doi: 10.1287/mnsc.1080.0986
Statistical strategies for avoiding false discoveries in metabolomics and related experiments
2006
Cited by 20 (5 self)
Abstract: Many metabolomics, and other high-content or high-throughput, experiments are set up such that the primary aim is the discovery of biomarker metabolites that can discriminate, with a certain level of certainty, between nominally matched ‘case’ and ‘control’ samples. However, it is unfortunately very easy to find markers that are apparently persuasive but that are in fact entirely spurious, and there are well-known examples in the proteomics literature. The main types of danger are not entirely independent of each other, but include bias, inadequate sample size (especially relative to the number of metabolite variables and to the required statistical power to prove that a biomarker is discriminant), excessive false discovery rate due to multiple hypothesis testing, inappropriate choice of particular numerical methods, and overfitting (generally caused by the failure to perform adequate validation and cross-validation). Many studies fail to take these into account, and thereby fail to discover anything of true significance (despite their claims). We summarise these problems, and provide pointers to a substantial existing literature that should assist in the improved design and evaluation of metabolomics experiments, thereby allowing robust scientific conclusions to be drawn from the available data. We provide a list of some of the simpler checks that might improve one’s confidence that a candidate biomarker is not simply a statistical artefact, and suggest a series of preferred tests and visualisation tools that can assist readers and authors in assessing papers. These tools can be applied to individual metabolites by using multiple univariate tests performed in parallel across all metabolite peaks. They may also be applied to the validation of multivariate models. We stress in …
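One standard control for the multiple-testing danger this abstract names is the Benjamini–Hochberg step-up procedure, which caps the expected false discovery rate across many parallel univariate tests. The sketch below (a generic FDR check, not a procedure quoted from the paper; the p-values are hypothetical metabolite-peak results) returns the indices declared discoveries at level q.

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up: sort the p-values, find the largest
    rank k with p_(k) <= q*k/m, and declare the k smallest discoveries."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank
    return sorted(order[:k])

# Hypothetical p-values for eight metabolite peaks: two genuine
# effects buried in noise.
pvals = [0.001, 0.41, 0.76, 0.004, 0.22, 0.88, 0.63, 0.12]
print(benjamini_hochberg(pvals))  # → [0, 3]
```

Note that a naive per-test threshold of 0.05 would be tempted by borderline peaks as the number of peaks grows, which is exactly the "apparently persuasive but entirely spurious" marker problem the authors warn about.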
Performance characterisation in computer vision: The role of statistics in testing and design
Imaging and Vision Systems: Theory, Assessment and Applications, NOVA Science Books, 1993
Cited by 19 (7 self)
Abstract: We consider the relationship between the performance characteristics of vision algorithms and algorithm design. In the first part we discuss the issues involved in testing. A description of good practice is given covering test objectives, test data, test metrics and the test protocol. In the second part we discuss aspects of good algorithmic design including understanding of the statistical properties of data and common algorithmic operations, and suggest how some common problems may be overcome.
Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis
1999
Cited by 17 (1 self)
Abstract: In this paper, we present a general framework for understanding the role of artificial neural networks (ANNs) in bankruptcy prediction. We give a comprehensive review of neural network applications in this area and illustrate the link between neural networks and traditional Bayesian classification theory. The method of cross-validation is used to examine the between-sample variation of neural networks for bankruptcy prediction. Based on a matched sample of 220 firms, our findings indicate that neural networks are significantly better than logistic regression models in prediction as well as classification rate estimation. In addition, neural networks are robust to sampling variations in overall classification …
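The cross-validation machinery the abstract relies on is simple to state: partition the sample into k folds, train on k−1, test on the held-out fold, and rotate. The sketch below shows only the index bookkeeping (a generic k-fold splitter, not the paper's matched-sample design of 220 firms), which is the part most often implemented incorrectly.

```python
def kfold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross-validation.
    Fold sizes differ by at most one when k does not divide n."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

folds = list(kfold_indices(10, 5))
print(len(folds), folds[0][1])  # → 5 [0, 1]
```

Every observation appears in exactly one test fold and the folds are disjoint from their training sets, which is what lets the k held-out error estimates be used to examine between-sample variation.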
Parametric and Nonparametric Methods for the Statistical Evaluation of Human ID Algorithms
Workshop on Empirical Evaluation Methods in Computer Vision, 2001
Cited by 15 (1 self)
Abstract: This paper reviews some of the major issues associated with the statistical evaluation of Human Identification algorithms, emphasizing comparisons between algorithms on the same set of sample images. A general notation is developed and common performance metrics are defined. A simple success/failure evaluation methodology where recognition rate depends upon a binomially distributed random variable, recognition count, is developed and the conditions under which this model is appropriate are discussed. Some nonparametric techniques are also introduced, including bootstrapping. When applied to estimating the distribution of recognition count for a single set of i.i.d. sampled probe images, bootstrapping is noted as equivalent to the parametric binomial model. Bootstrapping applied to recognition rate over resampled sets of images can be problematic. Specifically, sampling with replacement to form image probe sets is shown to introduce a conflict between assumptions required by bootstrapping and the way recognition rate is computed. In part to overcome this difficulty with bootstrapping, a different nonparametric Monte Carlo method is introduced, and its utility illustrated with an extended example. This method permutes the choice of gallery and probe images. It is used to answer two questions. Question 1: How much does recognition rate vary when comparing images of individuals taken on different days using the same camera? Question 2: When is the observed difference in recognition rates for two distinct algorithms significant relative to this variation? Two important general features of nonparametric methods are illustrated by the Monte Carlo study. First, within some broad limits, resampling generates sample distributions for any statistic of interest. Second, through c…
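The equivalence noted in the abstract, between bootstrapping a single i.i.d. probe set and the parametric binomial model, can be checked numerically. The sketch below resamples hypothetical per-probe success/failure outcomes with replacement and compares the bootstrap distribution of the recognition count against the Binomial(n, p) moments; the outcome vector is illustrative, not from the paper's experiments.

```python
import random
import statistics

def bootstrap_recognition_counts(outcomes, n_boot=2000, seed=0):
    """Resample the per-probe outcomes i.i.d. with replacement and
    record the recognition count of each bootstrap replicate."""
    rng = random.Random(seed)
    n = len(outcomes)
    return [sum(rng.choice(outcomes) for _ in range(n))
            for _ in range(n_boot)]

# Hypothetical probe set: 80 of 100 probes recognised (p = 0.8).
outcomes = [1] * 80 + [0] * 20
counts = bootstrap_recognition_counts(outcomes)

# Binomial(100, 0.8) predicts mean 80 and sd sqrt(100*0.8*0.2) = 4.
print(round(statistics.mean(counts), 1), round(statistics.stdev(counts), 1))
```

The bootstrap moments land on the binomial ones because each resampled probe is an independent Bernoulli(p) draw. The paper's warning applies one level up: resampling whole probe *sets* to study recognition *rate* breaks this correspondence.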
Predictive Approaches For Choosing Hyperparameters in Gaussian Processes
Neural Computation, 1999
Cited by 12 (1 self)
Abstract: Gaussian Processes are powerful regression models specified by parametrized mean and covariance functions. Standard approaches to estimate these parameters (known as hyperparameters) are Maximum Likelihood (ML) and Maximum A Posteriori (MAP) approaches. In this paper, we propose and investigate predictive approaches, namely, maximization of Geisser's Surrogate Predictive Probability (GPP) and minimization of mean square error with respect to GPP (referred to as Geisser's Predictive mean square Error (GPE)) to estimate the hyperparameters. We also derive results for the standard Cross-Validation (CV) error and make a comparison. These approaches are tested on a number of problems and experimental results show that these approaches are strongly competitive with existing approaches.
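Geisser's surrogate predictive probability is, in essence, the average leave-one-out log predictive density, and for a GP it has a closed form via the precision matrix: μᵢ = yᵢ − (K⁻¹y)ᵢ / (K⁻¹)ᵢᵢ and σᵢ² = 1 / (K⁻¹)ᵢᵢ. The sketch below scores two candidate RBF length-scales on hypothetical smooth data; the kernel, noise level, and data are illustrative assumptions, and the matrix inverse is a small Gauss–Jordan helper rather than a production solver.

```python
import math

def rbf_kernel(xs, length, noise=0.1):
    """RBF Gram matrix with observation-noise variance on the diagonal."""
    n = len(xs)
    return [[math.exp(-((xs[i] - xs[j]) ** 2) / (2 * length ** 2))
             + (noise ** 2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def mat_inv(a):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def log_gpp(xs, ys, length):
    """Average leave-one-out log predictive probability using the
    closed-form LOO identities for Gaussian process regression."""
    kinv = mat_inv(rbf_kernel(xs, length))
    n = len(xs)
    alpha = [sum(kinv[i][j] * ys[j] for j in range(n)) for i in range(n)]
    total = 0.0
    for i in range(n):
        var = 1.0 / kinv[i][i]
        total += -0.5 * math.log(2 * math.pi * var) \
                 - alpha[i] ** 2 / (2 * kinv[i][i])
    return total / n

# Hypothetical smooth data: a sensible length-scale should score
# higher than a degenerate one.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [math.sin(x) for x in xs]
print(log_gpp(xs, ys, length=1.0) > log_gpp(xs, ys, length=0.01))
```

Maximizing this score over hyperparameters is the GPP route the abstract proposes as an alternative to ML/MAP; note it never refits the model n times, since the LOO quantities fall out of one factorization of K.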
Higher Prices from Entry: Pricing of Brand-Name Drugs, Mimeo
1996
Cited by 11 (1 self)
Abstract: When a new firm enters a market and starts selling a spatially differentiated product, the prices of existing products may rise due to a better match between consumers and products. Entry may have three unusual effects. First, the new price is above the monopoly price if the two firms collude and may be above the monopoly price even if the firms play Bertrand. Second, the Bertrand and collusive price may be identical. Third, prices, combined profits, and consumer surplus may all rise with entry. Consistent with our theory, the real prices of some anti-ulcer drugs rose as new products entered the market.
When a new firm starts marketing a product that is spatially differentiated from existing products, the price of existing products may rise whether or not the firms collude. We assume that a brand’s location in product space is exogenously determined, and the firm’s only choice variable is price. Using a spatial model, we show that the effect of entry on price depends on how close together products are located in characteristic space. To illustrate this logic, we suppose that a firm enters a market that previously had one firm. If the new product is located at the same point in characteristic space as the original one, the two goods are perfect substitutes, so that price must fall if the firms act noncooperatively.
Locality of sampling and diversity in parallel system workloads
In 21st Intl. Conf. Supercomputing, 2007
Cited by 11 (5 self)
Abstract: Observing the workload on a computer system during a short (but not too short) time interval may lead to distributions that are significantly different from those that would be observed over much longer intervals. Rather than describing such phenomena using involved non-stationary models, we propose a simple global distribution coupled with a localized sampling process. We quantify the effect by the maximal deviation of the distribution as observed over a limited slice of time from the global distribution, and find that in real workload data from parallel supercomputers this deviation is significantly larger than would be observed at random. Likewise, we find that the workloads at different sites also differ from each other. These findings motivate the development of adaptive systems, which adjust their parameters as they learn about their workloads, and also the development of parameterized workload models that exhibit such locality of sampling, which are required in order to evaluate adaptive systems.
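The "maximal deviation over a limited slice of time" measure can be sketched as a sliding-window sup-distance between the local and global empirical CDFs. This is a simplified reading of the paper's metric, applied to a hypothetical job-size trace rather than real supercomputer logs; a trace with temporal locality scores far higher than a shuffled trace with the identical global distribution.

```python
def empirical_cdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    n = len(sample)
    return [sum(1 for v in sample if v <= g) / n for g in grid]

def max_local_deviation(values, window):
    """Maximal sup-distance between the distribution observed in any
    length-`window` slice of the trace and the global distribution."""
    grid = sorted(set(values))
    global_cdf = empirical_cdf(values, grid)
    worst = 0.0
    for start in range(len(values) - window + 1):
        local = empirical_cdf(values[start:start + window], grid)
        worst = max(worst,
                    max(abs(a - b) for a, b in zip(local, global_cdf)))
    return worst

# Hypothetical traces with the same global mix of job sizes:
localized = [1] * 50 + [100] * 50   # small jobs early, large jobs late
shuffled = [1, 100] * 50            # same mix, no temporal locality
print(max_local_deviation(localized, 25),
      max_local_deviation(shuffled, 25))  # → 0.5 0.02
```

The gap between the two scores is the signal the paper reports for real workloads: short windows see a distribution far from the long-run one, which a stationary sampling model cannot reproduce.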