Results 1–10 of 289
Active Learning with Statistical Models
, 1995
Abstract

Cited by 561 (10 self)
For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
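The "statistically optimal" selection the abstract alludes to can be sketched for the simplest case, a linear learner: pick the candidate query with the largest predictive variance under the current design, a D-optimal-style heuristic. This is an illustrative sketch, not the paper's own experiments; the function and data are invented:

```python
import numpy as np

def select_query(X_train, candidates):
    """Pick the candidate whose label would be most informative for a
    linear model: the row with the largest leverage x^T (X^T X)^-1 x,
    i.e. the largest current predictive variance."""
    A = X_train.T @ X_train + 1e-8 * np.eye(X_train.shape[1])  # ridge for stability
    A_inv = np.linalg.inv(A)
    # Quadratic form c_i^T A_inv c_i for every candidate row c_i:
    scores = np.einsum('ij,jk,ik->i', candidates, A_inv, candidates)
    return int(np.argmax(scores))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
cand = np.vstack([np.zeros(3), 5 * np.ones(3)])
print(select_query(X, cand))  # the far-out candidate is most informative
```

The same idea underlies the paper's variance-minimisation criteria; mixtures of Gaussians and locally weighted regression simply make the required variance computations cheap and exact.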
Locally weighted learning
 ARTIFICIAL INTELLIGENCE REVIEW
, 1997
Abstract

Cited by 499 (53 self)
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
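A minimal sketch of the locally weighted linear regression the survey focuses on, assuming a Gaussian weighting function with a single bandwidth parameter tau (names and defaults here are illustrative):

```python
import numpy as np

def lwr_predict(xq, X, y, tau=0.3):
    """Locally weighted linear regression at query point xq:
    weight each training point by a Gaussian kernel of bandwidth tau,
    then solve the weighted least-squares normal equations."""
    Xb = np.column_stack([np.ones(len(X)), X])       # add intercept column
    w = np.exp(-(X - xq) ** 2 / (2 * tau ** 2))      # smoothing weights
    beta = np.linalg.solve(Xb.T @ (w[:, None] * Xb), Xb.T @ (w * y))
    return beta[0] + beta[1] * xq

X = np.linspace(0, 2 * np.pi, 50)
y = np.sin(X)
print(lwr_predict(np.pi / 2, X, y))  # close to sin(pi/2) = 1
```

Because a fresh local model is fit per query, the method is "lazy": all the work happens at prediction time, and the bandwidth tau plays the role of the smoothing parameter the survey discusses.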
Power-law distributions in empirical data
 ISSN 0036-1445. doi:10.1137/070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
Abstract

Cited by 234 (3 self)
Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the tail of the distribution. In particular, standard methods such as least-squares fitting are known to produce systematically biased estimates of parameters for power-law distributions and should not be used in most circumstances. Here we describe statistical techniques for making accurate parameter estimates for power-law data, based on maximum likelihood methods and the Kolmogorov-Smirnov statistic. We also show how to tell whether the data follow a power-law distribution at all, defining quantitative measures that indicate when the power law is a reasonable fit to the data and when it is not. We demonstrate these methods by applying them to twenty-four real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law is ruled out.
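The maximum likelihood estimate the abstract refers to has a simple closed form in the continuous case, α̂ = 1 + n / Σ ln(x_i / x_min). A brief sketch (the synthetic data and the choice x_min = 1 are illustrative; the paper additionally estimates x_min itself via the Kolmogorov-Smirnov statistic):

```python
import numpy as np

def powerlaw_alpha(x, xmin):
    """Continuous MLE of the power-law exponent over the tail x >= xmin:
    alpha = 1 + n / sum(ln(x_i / xmin))."""
    tail = np.asarray(x, dtype=float)
    tail = tail[tail >= xmin]
    return 1.0 + len(tail) / np.sum(np.log(tail / xmin))

# Draw from a known power law (alpha = 2.5) by inverse-CDF sampling:
rng = np.random.default_rng(1)
u = rng.random(100_000)
samples = (1 - u) ** (-1 / (2.5 - 1))  # xmin = 1
print(powerlaw_alpha(samples, xmin=1.0))  # recovers ~2.5
```

By contrast, a least-squares fit to the log-log histogram of the same data would typically be biased, which is exactly the practice the paper argues against.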
Sparse Reconstruction by Separable Approximation
, 2008
Abstract

Cited by 185 (27 self)
Finding sparse approximate solutions to large underdetermined linear systems of equations is a common problem in signal/image processing and statistics. Basis pursuit, the least absolute shrinkage and selection operator (LASSO), wavelet-based deconvolution and reconstruction, and compressed sensing (CS) are a few well-known areas in which problems of this type appear. One standard approach is to minimize an objective function that includes a quadratic (ℓ2) error term added to a sparsity-inducing (usually ℓ1) regularization term. We present an algorithmic framework for the more general problem of minimizing the sum of a smooth convex function and a nonsmooth, possibly nonconvex regularizer. We propose iterative methods in which each step is obtained by solving an optimization subproblem involving a quadratic term with diagonal Hessian (which is therefore separable in the unknowns) plus the original sparsity-inducing regularizer. Our approach is suitable for cases in which this subproblem can be solved much more rapidly than the original problem. In addition to solving the standard ℓ2 − ℓ1 case, our framework yields an efficient solution technique for other regularizers, such as an ℓ∞-norm regularizer and group-separable (GS) regularizers. It also generalizes immediately to the case in which the data is complex rather than real. Experiments with CS problems show that our approach is competitive with the fastest known methods for the standard ℓ2 − ℓ1 problem, as well as being efficient on problems with other separable regularization terms.
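A stripped-down instance of the subproblem structure described above: for the standard ℓ2 − ℓ1 case with a constant diagonal Hessian approximation, each step reduces to soft thresholding, i.e. the classic ISTA iteration. This is a simplified relative of the paper's method, not the SpaRSA algorithm itself (which adapts the step size); all names and data are illustrative:

```python
import numpy as np

def soft_threshold(v, t):
    """Closed-form solution of the separable l1 subproblem."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, iters=1000):
    """Minimise 0.5*||Ax - b||^2 + lam*||x||_1 with fixed step 1/L,
    L = ||A||_2^2; each step solves a quadratic-plus-l1 subproblem
    whose separability makes it a componentwise soft threshold."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100))          # underdetermined system
x_true = np.zeros(100)
x_true[[3, 50]] = [2.0, -1.5]           # 2-sparse ground truth
b = A @ x_true
x_hat = ista(A, b, lam=0.1)
print(np.flatnonzero(np.abs(x_hat) > 0.5))  # support of the recovery
```

Swapping the `soft_threshold` call for the proximal map of another separable regularizer (e.g. groupwise shrinkage) gives the generalization the abstract describes.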
Gibbs motif sampling: detection of bacterial outer membrane protein repeats
 Protein Science
, 1995
Abstract

Cited by 125 (13 self)
The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric β-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning β-strands. These β-strands occur on the membrane interface (as opposed to the trimeric interface) of the β-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles.
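The core resampling move of a Gibbs site sampler can be sketched on a toy DNA example. This is a deliberately simplified, single-motif illustration with pseudocounts of 1, not the paper's multi-model algorithm; the sequences and planted motif are invented:

```python
import random

def gibbs_motif(seqs, w, iters=2000, seed=0):
    """Toy Gibbs site sampler: hold out one sequence, build a count
    model from the motif windows of the rest, then resample the
    held-out start position in proportion to the model likelihood."""
    rng = random.Random(seed)
    pos = [rng.randrange(len(s) - w + 1) for s in seqs]
    for _ in range(iters):
        i = rng.randrange(len(seqs))
        # Count model (pseudocount 1) from all windows except sequence i:
        counts = [{a: 1 for a in 'ACGT'} for _ in range(w)]
        for j, s in enumerate(seqs):
            if j != i:
                for k in range(w):
                    counts[k][s[pos[j] + k]] += 1
        # Score every candidate start in sequence i and resample:
        weights = []
        for start in range(len(seqs[i]) - w + 1):
            p = 1.0
            for k in range(w):
                p *= counts[k][seqs[i][start + k]]
            weights.append(p)
        pos[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return pos

seqs = ['AAAAATTACGTAAAA', 'CCTTACGTCCCCCCC', 'GGGGGGGTTACGTGG']
print(gibbs_motif(seqs, w=6))  # start positions of the shared motif
```

The paper's contribution layers onto this move the ability to partition sites among several motif models and a significance test for the motifs found.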
Neural Networks and Statistical Models
, 1994
Abstract

Cited by 114 (1 self)
There has been much publicity about the ability of artificial neural networks to learn and generalize. In fact, the most commonly used artificial neural networks, called multilayer perceptrons, are nothing more than nonlinear regression and discriminant models that can be implemented with standard statistical software. This paper explains what neural networks are, translates neural network jargon into statistical jargon, and shows the relationships between neural networks and statistical models such as generalized linear models, maximum redundancy analysis, projection pursuit, and cluster analysis.
Identifying Mislabeled Training Data
 JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 1999
Abstract

Cited by 86 (1 self)
This paper presents a new approach to identifying and eliminating mislabeled training instances for supervised learning. The goal of this approach is to improve classification accuracies produced by learning algorithms by improving the quality of the training data. Our approach
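The truncated abstract does not spell out the filtering mechanism (the paper itself evaluates cross-validated classifier filters), so the sketch below uses one simple stand-in: flag instances whose label disagrees with the majority of their k nearest neighbours. All names and data here are illustrative:

```python
import numpy as np

def flag_mislabeled(X, y, k=5):
    """Flag training points whose label disagrees with the majority of
    their k nearest neighbours -- a simple filtering heuristic in the
    spirit of the paper (not its actual ensemble filters)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self from neighbours
    nn = np.argsort(d, axis=1)[:, :k]
    agree = (y[nn] == y[:, None]).mean(axis=1)
    return np.flatnonzero(agree < 0.5)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
y[7] = 1                                    # inject one label error
print(flag_mislabeled(X, y))                # the corrupted index is flagged
```

Removing the flagged instances before training is the "improving the quality of the training data" step the abstract describes.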
Software development cost estimation approaches – A survey
 Annals of Software Engineering
, 2000
Abstract

Cited by 82 (7 self)
This paper summarizes several classes of software cost estimation models and techniques: parametric models, expertise-based techniques, learning-oriented techniques, dynamics-based models, regression-based models, and composite Bayesian techniques for integrating expertise-based and regression-based models. Experience to date indicates that neural-net and dynamics-based techniques are less mature than the other classes of techniques, but that all classes of techniques are challenged by the rapid pace of change in software technology. The primary conclusion is that no single technique is best for all situations, and that a careful comparison of the results of several approaches is most likely to produce realistic estimates.
Modelling Forest Growth and Yield: Applications to Mixed Tropical Forests
, 1994
Abstract

Cited by 77 (43 self)
Growth models assist forest researchers and managers in many ways. Some important uses include the ability to predict future yields and to explore silvicultural options. Models provide an efficient way to prepare resource forecasts, but a more important role may be their ability to explore management options and silvicultural alternatives. For example, foresters may wish to know the long-term effect, on both the forest and on future harvests, of a particular silvicultural decision, such as changing the cutting limits for harvesting. With a growth model, they can examine the likely outcomes under both the intended and alternative cutting limits, and can make their decision objectively. The process of developing a growth model may also offer interesting new insights into stand dynamics.
Using Heteroscedasticity Consistent Standard Errors in the Linear Regression Model
 The American Statistician
, 2000
Abstract

Cited by 61 (0 self)
In the presence of heteroscedasticity, OLS estimates are unbiased, but the usual tests of significance are generally inappropriate and their use can lead to incorrect inferences. Tests based on a heteroscedasticity consistent covariance matrix (HCCM), however, are consistent even in the presence of heteroscedasticity of an unknown form. Most applications that use a HCCM appear to rely on the asymptotic version known as HC0. Our Monte Carlo simulations show that HC0 often results in incorrect inferences when N ≤ 250, while three relatively unknown, small-sample versions of the HCCM, and especially a version known as HC3, work well even for N's as small as 25. We recommend that: 1) data analysts should correct for heteroscedasticity using a HCCM whenever there is reason to suspect heteroscedasticity; 2) the decision to use a HCCM-based test should not be determined by a screening test for heteroscedasticity; and 3) when N ≤ 250, the HCCM known as HC3 should be used. Since HC3 is simple to compute, we encourage authors of statistical software to add this estimator to their programs.
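The HC3 estimator the authors recommend is indeed simple to compute: inflate each squared OLS residual by (1 − h_ii)^−2, where h_ii is the leverage, before forming the sandwich covariance. A sketch (the simulated heteroscedastic data are illustrative):

```python
import numpy as np

def hc3_se(X, y):
    """OLS coefficients with HC3 heteroscedasticity-consistent standard
    errors: sandwich covariance with weights e_i^2 / (1 - h_ii)^2."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)   # leverages h_ii
    omega = (resid / (1 - h)) ** 2
    cov = XtX_inv @ ((X.T * omega) @ X) @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + np.abs(x) * rng.normal(size=n)  # heteroscedastic noise
beta, se = hc3_se(X, y)
print(beta, se)
```

Setting omega to the raw squared residuals instead gives HC0, the asymptotic version the paper warns against in small samples.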