Results 1 - 10 of 43
Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching
, 2000
Abstract

Cited by 256 (12 self)
Many important problems involve clustering large datasets. Although naive implementations of clustering are computationally expensive, there are established efficient techniques for clustering when the dataset has either (1) a limited number of clusters, (2) a low feature dimensionality, or (3) a small number of data points. However, there has been much less work on methods of efficiently clustering datasets that are large in all three ways at once, for example, having millions of data points that exist in many thousands of dimensions representing many thousands of clusters. We present a new technique for clustering these large, high-dimensional datasets. The key idea involves using a cheap, approximate distance measure to efficiently divide the data into overlapping subsets we call canopies. Then clustering is performed by measuring exact distances only between points that occur in a common canopy. Using canopies, large clustering problems that were formerly impossible become practical. Under reasonable assumptions about the cheap distance metric, this reduction in computational cost comes without any loss in clustering accuracy. Canopies can be applied to many domains and used with a variety of clustering approaches, including Greedy Agglomerative Clustering, K-means and Expectation-Maximization. We present experimental results on grouping bibliographic citations from the reference sections of research papers. Here the canopy approach reduces computation time over a traditional clustering approach by more than an order of magnitude and decreases error in comparison to a previously used algorithm by 25%.
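As an illustration of the canopy idea described in the abstract, here is a minimal Python sketch; the function name, the two thresholds, and the one-dimensional cheap distance used in the example are all illustrative assumptions, not details from the paper:

```python
import random

def canopy_clustering(points, cheap_dist, t1, t2):
    """Divide points into overlapping 'canopies' using a cheap distance.

    t1 (loose) > t2 (tight): points within t1 of a randomly chosen
    center join its canopy; points within t2 are also removed from the
    candidate pool, so a point may belong to several canopies.
    An exact, expensive metric would then be applied only within canopies.
    """
    assert t1 > t2
    candidates = list(points)
    canopies = []
    while candidates:
        center = candidates.pop(random.randrange(len(candidates)))
        canopy = [center]
        remaining = []
        for p in candidates:
            d = cheap_dist(center, p)
            if d < t1:
                canopy.append(p)
            if d >= t2:
                remaining.append(p)  # still available as a future center
        candidates = remaining
        canopies.append(canopy)
    return canopies
```

Because points between the tight and loose thresholds stay in the pool, the resulting subsets overlap, which is what lets exact-distance clustering within canopies avoid missing cross-boundary pairs.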
Minimax Entropy Principle and Its Application to Texture Modeling
, 1997
Abstract

Cited by 193 (39 self)
This article proposes a general theory and methodology, called the minimax entropy principle, for building statistical models for images (or signals) in a variety of applications. This principle consists of two parts. The first is the maximum entropy principle for feature binding (or fusion): for a certain set of feature statistics, a distribution can be built to bind these feature statistics together by maximizing the entropy over all distributions that reproduce these feature statistics. The second part is the minimum entropy principle for feature selection: among all plausible sets of feature statistics, we choose the set whose maximum entropy distribution has the minimum entropy. Computational and inferential issues in both parts are addressed; in particular, a feature pursuit procedure is proposed for approximately selecting the optimal set of features. The model complexity is restricted because of the sample variation in the observed feature statistics. The minimax entropy principle is applied to texture modeling, where a novel Markov random field (MRF) model, called FRAME (Filter, Random field, And Minimax Entropy), is derived, and encouraging results are obtained in experiments on a variety of texture images. The relationship between our theory and the mechanisms of neural computation is also discussed.
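The two parts of the principle can be paraphrased in symbols; the notation (features phi_i, observed statistics mu_i, feature set S) is assumed here for illustration, not taken from the paper:

```latex
% Feature binding: maximum entropy subject to reproducing the
% observed feature statistics \mu_i for the chosen feature set S
\hat p_S = \arg\max_{p} H(p)
  \quad \text{s.t.} \quad E_p[\phi_i(I)] = \mu_i, \; i \in S
% Feature selection: among candidate feature sets S, choose the one
% whose maximum-entropy model has the smallest entropy
S^\ast = \arg\min_{S} H(\hat p_S)
```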
Prediction risk and architecture selection for neural networks
, 1994
Abstract

Cited by 75 (2 self)
Abstract. We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimating the quality of model predictions and for model selection. Prediction risk estimation and model selection are especially important for problems with limited data. Techniques for estimating prediction risk include data resampling algorithms such as nonlinear cross-validation (NCV) and algebraic formulae such as the predicted squared error (PSE) and generalized prediction error (GPE). We show that exhaustive search over the space of network architectures is computationally infeasible even for networks of modest size. This motivates the use of heuristic strategies that dramatically reduce the search complexity. These strategies employ directed search algorithms, such as selecting the number of nodes via sequential network construction (SNC) and pruning inputs and weights via sensitivity-based pruning (SBP) and optimal brain damage (OBD), respectively.
Key Concepts in Model Selection: Performance and Generalizability
 Journal of Mathematical Psychology
, 2000
Abstract

Cited by 39 (12 self)
What are the methods of model selection, and how do they work? Which methods perform better than others, and in what circumstances? These questions rest on a number of key concepts in a relatively underdeveloped field. The aim of this essay is to explain some background concepts, highlight some of the results in this special issue, and add some of my own. The standard methods of model selection include classical hypothesis testing, maximum likelihood, the Bayes method, minimum description length, cross-validation and Akaike's information criterion. They all provide an implementation of Occam's razor, in which parsimony or simplicity is balanced against goodness-of-fit. These methods primarily take account of the sampling errors in parameter estimation, although their relative success at this task depends on the circumstances. However, the aim of model selection should also include the ability of a model to generalize to predictions in a different domain. Errors of extrapolation, or generalization, are different from errors of parameter estimation. So, it seems that simplicity and parsimony may be an additional factor in managing these errors, in which case the standard methods of model selection are incomplete implementations of Occam's razor.
1. WHAT IS MODEL SELECTION? William of Ockham (1285-1347/49) will always be remembered for his famous postulation of Ockham's razor (also spelled 'Occam'), which states that entities are not to be multiplied beyond necessity. In a similar vein, Sir Isaac Newton's first rule of hypothesizing instructs us that we are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances. (This paper is derived from a presentation at the Methods of Model Selection symposium at Indiana University.)
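As a toy illustration of one standard method listed in the abstract, the following sketch scores competing polynomial models with Akaike's information criterion under a Gaussian-noise assumption; the data and candidate models are invented for illustration:

```python
import numpy as np

def aic(rss, n, k):
    """AIC for a Gaussian-noise model: n*ln(RSS/n) + 2k, where k is
    the number of fitted parameters. Lower is better, so goodness-of-fit
    (the RSS term) is traded off against parsimony (the 2k penalty)."""
    return n * np.log(rss / n) + 2 * k

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)  # truly linear data

scores = {}
for deg in range(1, 6):  # candidate polynomial models of degree 1..5
    coeffs = np.polyfit(x, y, deg)
    rss = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
    scores[deg] = aic(rss, x.size, deg + 1)
best_degree = min(scores, key=scores.get)  # the Occam-style compromise
```

Higher-degree fits always reduce the residual sum of squares, but the 2k penalty makes them win only when the improvement is large enough, which is exactly the parsimony-versus-fit balance the essay describes.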
Exploring Texture Ensembles by Efficient Markov Chain Monte Carlo  Towards a "Trichromacy" Theory of Texture
, 1999
Abstract

Cited by 32 (13 self)
This article presents a mathematical definition of texture, the Julesz ensemble Ω(h), which is the set of all images (defined on Z²) that share identical statistics h. Then texture modeling is posed as an inverse problem: given a set of images sampled from an unknown Julesz ensemble Ω(h), we search for the statistics h which define the ensemble. A Julesz ensemble Ω(h) has an associated probability distribution q(I; h), which is uniform over the images in the ensemble and has zero probability outside. In a companion paper [32], q(I; h) is shown to be the limit distribution of the FRAME (Filter, Random Field, And Minimax Entropy) model [35] as the image lattice grows to Z². This conclusion establishes the intrinsic link between the scientific definition of texture on Z² and the mathematical models of texture on finite lattices. It brings two advantages to computer vision. 1). The engineering practice of synthesizing texture images by matching statistics has been put on a mathematical fou...
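In symbols, the ensemble and its associated uniform distribution can be paraphrased as follows; the Ω notation and the indicator form are assumptions based on the abstract's wording:

```latex
% Julesz ensemble: all images on the lattice sharing statistics h,
% with q(I; h) uniform inside the ensemble and zero outside
\Omega(h) = \{\, I : h(I) = h \,\}, \qquad
q(I; h) \propto \mathbf{1}\!\left[\, I \in \Omega(h) \,\right]
```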
Consistent model and moment selection procedures for GMM estimation with application to dynamic panel data models
 Journal of Econometrics
, 2001
Abstract

Cited by 28 (0 self)
Consistent model and moment selection procedures for GMM estimation with application to dynamic panel data models
Estimation for nonlinear stochastic differential equations by a local linearization method
 Stochastic Analysis and Applications 16
, 1998
Abstract

Cited by 18 (1 self)
This paper proposes a new local linearization method which approximates a nonlinear stochastic differential equation by a linear stochastic differential equation. Using this method, we can estimate parameters of the nonlinear stochastic differential equation from discrete observations by the maximum likelihood technique. We conduct numerical experiments to evaluate the finite-sample performance of the new method in identification, and compare it with two known methods: the original local linearization method and the Euler method. The experiments show that the new method performs much better than the other two, particularly when the sampling interval is large.
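The Euler scheme that the abstract uses as a baseline comparison can be sketched as follows; the Ornstein-Uhlenbeck drift and diffusion in the example are illustrative choices, not taken from the paper:

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, t_max, n_steps, seed=0):
    """Simulate dX = drift(X) dt + diffusion(X) dW with the Euler
    scheme: advance by drift*dt plus diffusion times a N(0, dt) shock."""
    rng = np.random.default_rng(seed)
    dt = t_max / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt))
        x[i + 1] = x[i] + drift(x[i]) * dt + diffusion(x[i]) * dw
    return x

# Ornstein-Uhlenbeck process: dX = -0.5*X dt + 0.2 dW (illustrative)
path = euler_maruyama(lambda x: -0.5 * x, lambda x: 0.2,
                      x0=1.0, t_max=10.0, n_steps=1000)
```

The Euler scheme's error grows with the step size, which is consistent with the abstract's observation that its performance degrades when the sampling interval is large; local linearization instead solves a linearized SDE exactly over each interval.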
Diagnostic Checking Periodic Autoregression Models with Application
, 1994
Abstract

Cited by 18 (1 self)
An overview of model building with periodic autoregression (PAR) models is given emphasizing the three stages of model development: identification, estimation and diagnostic checking. New results on the distribution of residual autocorrelations and suitable diagnostic checks are derived. The validity of these checks is demonstrated by simulation. The methodology discussed is illustrated with an application. It is pointed out that the PAR approach to model development offers some important advantages over the more general approach using periodic autoregressive moving-average (PARMA) models. I have written S functions for the periodic autoregressive modelling methods discussed in my paper. Complete S style documentation for each function is provided. To obtain, email the message "send pear from S" to statlib@temper.stat.cmu.edu, or use anonymous ftp to connect to fisher.stats.uwo.ca and download the shar archive file, pear.sh, located in the directory pub/pear.
Key words. Periodically correlated time series; periodic autoregressive moving-average models; portmanteau test; residual autocorrelation.
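A minimal sketch of the periodic-autoregression idea: fitting a PAR(1) model by per-season least squares. The function name and this simple estimator are illustrative assumptions; the paper's S functions implement a fuller methodology:

```python
import numpy as np

def fit_par1(x, period):
    """Fit a periodic AR(1), x_t = phi[s] * x_{t-1} + e_t, where the
    coefficient depends on the season s = t mod period; ordinary least
    squares is applied separately to the observations of each season."""
    phi = np.zeros(period)
    t = np.arange(1, len(x))
    for s in range(period):
        idx = t[(t % period) == s]
        prev, cur = x[idx - 1], x[idx]
        phi[s] = np.dot(prev, cur) / np.dot(prev, prev)
    return phi
```

Letting the autoregressive coefficient vary with the season is what distinguishes a PAR model from an ordinary AR model with a single fixed coefficient.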
Pixon-based multiresolution image reconstruction and the Quantification of Picture Information Content
, 1995
Abstract

Cited by 11 (5 self)
This paper reviews pixon-based image reconstruction, which in its current formulation uses a multiresolution language to quantify an image's Algorithmic Information Content (AIC) using Bayesian techniques. Each pixon (or its generalization, the informaton) represents a fundamental quantum of an image's AIC, and an image's pixon basis represents the minimum degrees of freedom necessary to describe the image within the accuracy of the noise. We demonstrate with a number of examples that pixon-based image reconstruction yields results consistently superior to popular competing methods, including Maximum Likelihood and Maximum Entropy methods. Typical improvements ...