Results 1-10 of 113
Data Clustering: A Review
ACM Computing Surveys, 1999
Cited by 1308 (13 self)
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify crosscutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling
Statistical Science, 13, 163–185, 1998
Cited by 146 (4 self)
A Rigorous Framework for Optimization of Expensive Functions by Surrogates
1998
Cited by 133 (17 self)
The goal of the research reported here is to develop rigorous optimization algorithms to apply to some engineering design problems for which direct application of traditional optimization approaches is not practical. This paper presents and analyzes a framework for generating a sequence of approximations to the objective function and managing the use of these approximations as surrogates for optimization. The result is to obtain convergence to a minimizer of an expensive objective function subject to simple constraints. The approach is widely applicable because it does not require, or even explicitly approximate, derivatives of the objective. Numerical results are presented for a 31-variable helicopter rotor blade design example and for a standard optimization test example. Key Words: Approximation concepts, surrogate optimization, response surfaces, pattern search methods, derivative-free optimization, design and analysis of computer experiments (DACE), computational engineering.
Detecting Features in Spatial Point Processes with . . .
1995
Cited by 82 (31 self)
We consider the problem of detecting features in spatial point processes in the presence of substantial clutter. One example is the detection of minefields using reconnaissance aircraft images that erroneously identify many objects that are not mines. Another is the detection of seismic faults on the basis of earthquake catalogs: earthquakes tend to be clustered close to the faults, but there are many that are farther away. Our solution uses model-based clustering based on a mixture model for the process, in which features are assumed to generate points according to highly linear multivariate normal densities, and the clutter arises according to a spatial Poisson process. Very nonlinear features are represented by several highly linear multivariate normal densities, giving a piecewise linear representation. The model is estimated in two stages. In the first stage, hierarchical model-based clustering is used to provide a first estimate of the features. In the second stage, this clustering is refined using the EM algorithm. The number of features is found using an approximation to the posterior probability of each number of features. For the minefield
On Conditional and Intrinsic Autoregressions
1995
Cited by 76 (6 self)
This paper discusses standard and intrinsic autoregressions and describes how the problems that arise can be alleviated using Dempster's (1972) algorithm or an appropriate modification. The approach partly represents a synthesis of standard geostatistical and Gaussian Markov random field formulations. Some nonspatial applications are also mentioned. Some key words: Agricultural experiments; Bayesian image analysis; Conditional autoregressions; Dempster's algorithm; Geographical epidemiology; Geostatistics; Intrinsic autoregressions; Multiway tables; Prior distributions; Spatial statistics; Surface reconstruction; Texture analysis.
Spatial Econometrics
Palgrave Handbook of Econometrics: Volume 1, Econometric Theory, 2001
Cited by 64 (5 self)
Spatial econometric methods deal with the incorporation of spatial interaction and spatial structure into regression analysis. The field has seen recent and rapid growth, spurred both by theoretical concerns and by the need to apply econometric models to emerging large geocoded databases. The review presented in this chapter outlines the basic terminology and discusses in some detail the specification of spatial effects, estimation of spatial regression models, and specification tests for spatial effects.
Neighborhood-Based Models for Social Networks
Sociological Methodology, 2002
Cited by 55 (4 self)
We argue that social networks can be modeled as the outcome of processes that occur in overlapping local regions of the network, termed local social neighborhoods. Each neighborhood is conceived as a possible site of interaction and corresponds to a subset of possible network ties. In this paper, we discuss hypotheses about the form of these neighborhoods, and we present two new and theoretically plausible ways in which neighborhood-based models for networks can be constructed. In the first, we introduce the notion of a setting structure, a directly hypothesized (or observed) set of exogenous constraints on possible neighborhood forms. In the second, we propose higher-order neighborhoods that are generated, in part, by the outcome of interactive network processes themselves. Applications of both approaches to model construction are presented, and the developments are considered within a general conceptual framework of locale for social networks. We show how assumptions about neighborhoods can be cast within a hierarchy of increasingly complex models; these models represent a progressively greater capacity for network processes to “reach” across a network through long cycles or semipaths. We argue that this class of models holds new promise for the development of empirically plausible models for networks and network-based processes.
Objective Bayesian Analysis of Spatially Correlated Data
Journal of the American Statistical Association, 2000
Cited by 52 (7 self)
Spatially varying phenomena are often modeled using Gaussian random fields, specified by their mean function and covariance function. The spatial correlation structure of these models is commonly specified to be of a certain form (e.g., spherical, power exponential, rational quadratic, or Matérn) with a small number of unknown parameters. We consider objective Bayesian analysis of such spatial models, when the mean function of the Gaussian random field is specified as in a linear model. It is thus necessary to determine an objective (or default) prior distribution for the unknown mean and covariance parameters of the random field. We first
Nearest Neighbor Clutter Removal for Estimating Features in Spatial Point Processes
Journal of the American Statistical Association, 1996
Cited by 51 (15 self)
We consider the problem of detecting features in spatial point processes in the presence of substantial clutter. One example is the detection of minefields using reconnaissance aircraft images that identify many objects that are not mines. Our solution uses Kth nearest neighbor distances of points in the process to classify them as clutter or otherwise. The observed Kth nearest neighbor distances are modeled as a mixture distribution, the parameters of which are estimated by a simple EM algorithm. This method allows for detection of generally shaped features that need not be path connected. In the minefield example this method yields high detection and low false positive rates. Another application, to outlining seismic faults, is considered, with some success. The method works well in high dimensions. The method can also be used to produce very high breakdown-point robust estimators of a covariance matrix. KEY WORDS: Breakdown point; Edge effects; EM algorithm; Image ana...
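The idea in this abstract, classifying points by fitting a two-component mixture to their Kth nearest neighbor distances with EM, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes planar data and that each component behaves locally like a homogeneous Poisson process, so that the squared Kth nearest neighbor distance follows a Gamma distribution with shape K and rate equal to the intensity times pi. The function names, the median-based initialisation, and the fixed iteration count are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def kth_nn_distances(points, k):
    """Distance from each point to its k-th nearest neighbor (brute force)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    dists.sort(axis=1)  # column 0 is the point itself (distance 0)
    return dists[:, k]


def log_density(d, k, lam):
    """Log density of the k-th NN distance under an assumed planar Poisson
    process with intensity lam, i.e. d**2 ~ Gamma(shape=k, rate=lam * pi)."""
    return (math.log(2.0) + k * np.log(lam * math.pi)
            + (2 * k - 1) * np.log(d) - lam * math.pi * d ** 2
            - math.lgamma(k))


def em_clutter_removal(points, k=5, n_iter=100):
    """Label each point as feature (True) or clutter (False)."""
    d = kth_nn_distances(points, k)
    # Crude initialisation (an assumption): short distances start as features.
    resp = (d < np.median(d)).astype(float)
    for _ in range(n_iter):
        # M-step: weighted maximum likelihood for the mixing weight
        # and for the intensity of each component.
        p = resp.mean()
        lam_feat = k * resp.sum() / (math.pi * (resp * d ** 2).sum())
        lam_clut = k * (1 - resp).sum() / (math.pi * ((1 - resp) * d ** 2).sum())
        # E-step: posterior probability that each point is a feature point.
        log_f = np.log(p) + log_density(d, k, lam_feat)
        log_c = np.log(1 - p) + log_density(d, k, lam_clut)
        resp = 1.0 / (1.0 + np.exp(np.clip(log_c - log_f, -700, 700)))
    return resp > 0.5, (lam_feat, lam_clut, p)
```

Calling `em_clutter_removal(points, k=5)` on an `(n, 2)` array returns a boolean mask together with the two estimated intensities and the mixing weight. The brute-force distance matrix keeps the sketch short but needs memory quadratic in the number of points; a spatial index would be the usual replacement for larger data sets.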
Asymptotic distributions of quasi-maximum likelihood estimates for spatial autoregressive models
Econometrica, 2004
Cited by 48 (8 self)
This paper investigates asymptotic properties of the maximum likelihood estimator and the quasi-maximum likelihood estimator for the spatial autoregressive model. The rates of convergence of those estimators may depend on some general features of the spatial weights matrix of the model. It is important to distinguish between different spatial scenarios. Under the scenario that each unit will be influenced by only a few neighboring units, the estimators may have a √n-rate of convergence and be asymptotically normal. When each unit can be influenced by many neighbors, irregularity of the information matrix may occur and various components of the estimators may have different rates of convergence.