## Gaussian Processes for Active Data Mining of Spatial Aggregates (2005)

Venue: | Proceedings of the SIAM International Conference on Data Mining |

Citations: | 18 - 1 self |

### BibTeX

@INPROCEEDINGS{Ramakrishnan05gaussianprocesses,

author = {Naren Ramakrishnan and Chris Bailey-Kellogg and Satish Tadepalli and Varun N. Pandey},

title = {Gaussian Processes for Active Data Mining of Spatial Aggregates},

booktitle = {Proceedings of the SIAM International Conference on Data Mining},

year = {2005}

}

### Abstract

We present an active data mining mechanism for qualitative analysis of spatial datasets, integrating identification and analysis of structures in spatial data with targeted collection of additional samples. The mechanism is designed around the spatial aggregation language (SAL) for qualitative spatial reasoning, and seeks to uncover high-level spatial structures from only a sparse set of samples. This approach is important for applications in domains such as aircraft design, wireless system simulation, fluid dynamics, and sensor networks. The mechanism employs Gaussian processes, a formal mathematical model for reasoning about spatial data, to build surrogate models from sparse data, reason about the uncertainty of estimation at unsampled points, and formulate objective criteria for closing the loop between data collection and data analysis. It optimizes sample selection using entropy-based functionals defined over spatial aggregates instead of the traditional approach of sampling to minimize estimated variance. We apply this mechanism to a global optimization benchmark comprising a testbank of 2D functions, as well as to data from wireless system simulations. The results reveal that the proposed sampling strategy makes more judicious use of data points by selecting locations that clarify high-level structures in data, rather than choosing points that merely improve quality of function approximation.
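
The "traditional approach" that the abstract contrasts against, sampling where the Gaussian process surrogate is most uncertain, can be sketched in a few lines. This is only an illustrative sketch and not the paper's mechanism: it implements the variance-based baseline, not the entropy-based functionals over spatial aggregates. The squared-exponential kernel, the 1D input space, the noise level, and all function names (`sq_exp_kernel`, `gp_posterior`, `next_sample_max_variance`) are assumptions made for the example.

```python
import math


def sq_exp_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar locations (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def gp_posterior(xs, ys, x_star, noise=1e-6, length_scale=1.0):
    """Posterior mean and variance of a zero-mean GP at x_star given samples (xs, ys)."""
    K = [[sq_exp_kernel(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [sq_exp_kernel(a, x_star, length_scale) for a in xs]
    alpha = solve(K, ys)      # K^{-1} y
    v = solve(K, k_star)      # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = sq_exp_kernel(x_star, x_star, length_scale) - sum(
        k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)  # clip tiny negative values from round-off


def next_sample_max_variance(xs, ys, candidates, **kw):
    """Variance-based baseline: pick the candidate with the largest posterior variance."""
    return max(candidates, key=lambda c: gp_posterior(xs, ys, c, **kw)[1])


# Usage: two samples in the interior; the baseline picks the far-away candidate.
samples_x = [0.0, 1.0]
samples_y = [0.0, 1.0]
chosen = next_sample_max_variance(samples_x, samples_y, [0.0, 0.5, 1.0, 3.0])
```

With samples clustered in the interior, this baseline tends to pick candidates far from existing data, which is the kind of behavior the abstract argues improves function approximation without necessarily clarifying high-level structure.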

### Citations

1566 | An Introduction to Support Vector Machines (and Other Kernel-Based Learning Methods)
- Cristianini, Shawe-Taylor
- 2000
Citation Context ...Figure 3: de Boor’s ‘pocket’ function in 2D, depicting contours around basins of local minima. • Since Gaussian processes are re-statements of kernel-based learning methods [7] this work helps bridge the qualitative nature of SAL algorithms with rigorous quantitative methodologies necessary to evaluate and assess active mining strategies. This work assumes a moderate backgr... |

529 | Active learning with statistical models
- Cohn, Ghahramani, et al.
- 1996
Citation Context ...f contexts [4, 25]. A sampling strategy typically embodies a human assessment of where might be a good location to collect data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many of these strategies, however, are either based on utility of data for function approximation purposes [24], or are meant to be used with specific data mining algorithms and tasks (e.g., classif... |

509 | Support vector machine active learning with applications to text classification
- Tong, D
Citation Context ...tly, we will not need to sample the entire configuration space, only enough so as to identify a region with acceptable confidence. Active data selection has been investigated in a variety of contexts [4, 25]. A sampling strategy typically embodies a human assessment of where might be a good location to collect data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many ... |

336 | Design and Analysis of Computer Experiments
- Sacks, Welch, et al.
- 1989
Citation Context ...ct data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many of these strategies, however, are either based on utility of data for function approximation purposes [24], or are meant to be used with specific data mining algorithms and tasks (e.g., classification [10]). In this paper, we present a formal framework that casts spatial data mining as uncovering successi... |

330 | Information-based Objective Functions for Active Data
- MacKay
- 1992
Citation Context ...f contexts [4, 25]. A sampling strategy typically embodies a human assessment of where might be a good location to collect data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many of these strategies, however, are either based on utility of data for function approximation purposes [24], or are meant to be used with specific data mining algorithms and tasks (e.g., classif... |

262 | Mining Geostatistics
- Journel, Huijbregts
- 1978
Citation Context ...use of Gaussian processes in machine learning and data mining is a relatively new development, although their origins can be traced to spatial statistics and the practice of modeling known as kriging [14]. In contrast to global approximation techniques such as least-squares fitting, GPs are local approximation techniques, akin to nearest-neighbor procedures. In contrast to function approximation techn... |

196 | Prediction with Gaussian processes: From linear regression to linear prediction and beyond
- Williams
- 1997
Citation Context ...less simulation example above). Our active mining mechanism is based on the spatial aggregation language (SAL; [3]), a generic data mining framework for spatial datasets, and Gaussian processes (GPs; [27]), a powerful unifying theory for approximating and reasoning about datasets. Gaussian processes provide the ‘glue’ that enables us to perform active mining on spatial aggregates. In particular, they ... |

147 | NETLAB: Algorithms for Pattern Recognition
- Nabney
- 2003
Citation Context ... characteristics. We have chosen a stationary structure above under the assumption that the covariance is translation invariant. Various other functions have been studied in the literature (e.g., see [18, 19, 24]), all of which satisfy the required property of positive definiteness. The simplest covariance function yields a diagonal matrix, but this means that no data sample can have an influence on other loc... |

141 | Chameleon: Hierarchical clustering using dynamic modeling
- Karypis, Han, et al.
- 1999
Citation Context ...inism in aggregation procedures) and integrate this model with the GP model for the data fields. Instantiating SAL to popular spatial mining algorithms investigated in the data mining community (e.g. [15, 20]) and applying them in an active mining context is a final direction we are pursuing. These and similar ideas will help establish the many ways in which mathematical models of data approximation can b... |

124 | Monte Carlo implementation of Gaussian process models for Bayesian regression and classification
- Neal
- 1997
Citation Context ... characteristics. We have chosen a stationary structure above under the assumption that the covariance is translation invariant. Various other functions have been studied in the literature (e.g., see [18, 19, 24]), all of which satisfy the required property of positive definiteness. The simplest covariance function yields a diagonal matrix, but this means that no data sample can have an influence on other loc... |

98 | Bayesian Prediction of Deterministic Functions, with Applications to Design and Analysis of Computer Experiments
- Currin, Mitchell, et al.
- 1991
Citation Context ...ciable effect on future samplings, and the variance-based metric favors the outer envelope of the design space. 4.2 Entropy-Based Functionals: It is a classical result in experiment design (e.g., see [8]) that, for Gaussian priors, the variance-reducing design is actually equivalent to the design minimizing the (expected) posterior entropy of the distribution t D|D , where D denotes the unsampled loc... |

86 | Clarans: A method for clustering objects for spatial data mining, in
- Han
Citation Context ...inism in aggregation procedures) and integrate this model with the GP model for the data fields. Instantiating SAL to popular spatial mining algorithms investigated in the data mining community (e.g. [15, 20]) and applying them in an active mining context is a final direction we are pursuing. These and similar ideas will help establish the many ways in which mathematical models of data approximation can b... |

70 | Incorporating diversity in active learning with support vector machines
- Brinker
- 2003
Citation Context ...tly, we will not need to sample the entire configuration space, only enough so as to identify a region with acceptable confidence. Active data selection has been investigated in a variety of contexts [4, 25]. A sampling strategy typically embodies a human assessment of where might be a good location to collect data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many ... |

69 | Computer Experiments
- Koehler, Owen
- 1996
Citation Context ...he variance-reducing design is actually equivalent to the design minimizing the (expected) posterior entropy of the distribution t D|D , where D denotes the unsampled locations in D. For a proof, see [16]. This criterion is also equivalent to the D-optimality design criterion in spatial statistics, under the assumption that the noise factor on all measurements is the same. MacKay generalizes this idea... |

37 | Query-based learning applied to partially trained multi-layer perceptrons
- Hwang, Choi, et al.
- 1991
Citation Context ...e confidence. Active data selection has been investigated in a variety of contexts [4, 25]. A sampling strategy typically embodies a human assessment of where might be a good location to collect data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many of these strategies, however, are either based on utility of data for function approximation purposes [24], or are me... |

35 | Spatial Aggregation: Theory and Applications
- Yip, Zhao
- 1996
Citation Context ...c. 5 evaluates the mechanism using both synthetic and real-world datasets. Sec. 6 provides a discussion and reviews related work. 2 Spatial Aggregation Language The Spatial Aggregation Language (SAL) [3, 28, 30] is a generic framework to study the design and implementation of spatial data mining algorithms. SAL is centered on a field ontology, in which the spatial data input is a field mapping from one conti... |

32 | Spatial Aggregation: Language and Applications
- Bailey-Kellogg, Zhao, et al.
- 1996
Citation Context ...y well to a wide range of data sets with more abstract notions of space (such as the wireless simulation example above). Our active mining mechanism is based on the spatial aggregation language (SAL; [3]), a generic data mining framework for spatial datasets, and Gaussian processes (GPs; [27]), a powerful unifying theory for approximating and reasoning about datasets. Gaussian processes provide the ‘... |

14 | Ambiguity-Directed Sampling for Qualitative Analysis of Sparse Data from Spatially Distributed Physical Systems
- Bailey-Kellogg, Ramakrishnan
Citation Context ...e confidence. Active data selection has been investigated in a variety of contexts [4, 25]. A sampling strategy typically embodies a human assessment of where might be a good location to collect data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many of these strategies, however, are either based on utility of data for function approximation purposes [24], or are me... |

14 | Relation-Based Aggregation: Finding Objects in Large Spatial Datasets
- Huang, Zhao
- 1999
Citation Context ...ng the mechanisms required to uncover multi-level structures in spatial datasets in applications ranging from decentralized control design [2] and object manipulation [30] to analysis of weather data [12], diffusion-reaction morphogenesis [21], and matrix perturbation analysis [22]. The identification of structures in a field is a form of data reduction: a relatively information-rich field representat... |

13 | Influence-Based Model Decomposition for Reasoning about Spatially Distributed Physical Systems
- Bailey-Kellogg, Zhao
Citation Context ...gher level. This vocabulary has proved effective for expressing the mechanisms required to uncover multi-level structures in spatial datasets in applications ranging from decentralized control design [2] and object manipulation [30] to analysis of weather data [12], diffusion-reaction morphogenesis [21], and matrix perturbation analysis [22]. The identification of structures in a field is a form of d... |

13 | STA: Spatio-Temporal Aggregation with Applications to Analysis of Diffusion-Reaction Phenomena
- Ordóñez, Zhao
- 2000
Citation Context ...ulti-level structures in spatial datasets in applications ranging from decentralized control design [2] and object manipulation [30] to analysis of weather data [12], diffusion-reaction morphogenesis [21], and matrix perturbation analysis [22]. The identification of structures in a field is a form of data reduction: a relatively information-rich field representation is abstracted into a more concise s... |

10 | Algorithm 829: Software for generation of classes of test functions with known local and global minima for global optimization
- Gaviano, Kvasov, et al.
- 2003
Citation Context ...d (bottom) entropy-based sampling. (left) number of pockets found and (right) negative log-likelihood. 5.1 Synthetic Datasets: For the synthetic benchmark, we adopted the suite of test functions from [11], an ACM TOMS algorithm to readily generate classes of functions with known local and global minima. The algorithm systematically distorts a convex quadratic function with cubic or quintic polynomials... |

8 | Sampling Strategies for Mining in Data-Scarce Domains
- Ramakrishnan, Bailey-Kellogg
- 2002
Citation Context ...f contexts [4, 25]. A sampling strategy typically embodies a human assessment of where might be a good location to collect data [1, 13] or is derived from the optimization of specific design criteria [5, 17, 22]. Many of these strategies, however, are either based on utility of data for function approximation purposes [24], or are meant to be used with specific data mining algorithms and tasks (e.g., classif... |

8 | Imagistic Reasoning
- Yip, Zhao, et al.
- 1995
Citation Context ... is a field mapping from one continuum to another (e.g. 2-D temperature field: $\mathbb{R}^2 \to \mathbb{R}^1$; 3-D fluid flow field: $\mathbb{R}^3 \to \mathbb{R}^3$). SAL programs employ vision-like routines, in an imagistic reasoning style [29], to uncover and manipulate multi-layer geometric and topological structures in fields. Due to continuity, fields exhibit regions of uniformity, and these regions can be abstracted as higher-level str... |

7 | Adding Constrained Discontinuities to Gaussian Process Models of Wind Fields
- Cornford, Nabney, et al.
- 1998
Citation Context ... issue of how to model data fields given only (or also) derivative information or when the underlying function is not smooth or differentiable. Other investigators have done related work in this area [6]. Third, we assume here that the model (of flow classes) posited by SAL is correct, and use this information to drive the sampling. To overcome this assumption, we must create a probabilistic model of... |

5 | Comment on ‘Design and Analysis of Computer Experiments’
- Easterling
- 1989
Citation Context ...ta at the edges of the input space.’ Continuing the sampling, we see that the 13-point design actually has the samples organized in a diagonal design (a layout that has been referred to as ‘whimsical’ [9]). The emphasis on overall quality of function approximation more than data mining is evident from the fact that it takes over 30 points before the SAL-based pocket finder can infer that there are fou... |

4 | Using Hierarchical Data Mining to Characterize Performance of Wireless System Configurations - Verstak, Ramakrishnan, et al. - 2002 |

4 | Data Mining with Sparse Grids using Simplicial Basis Functions
- Garcke, Griebel
- 2001
Citation Context ...f these strategies, however, are either based on utility of data for function approximation purposes [24], or are meant to be used with specific data mining algorithms and tasks (e.g., classification [10]). In this paper, we present a formal framework that casts spatial data mining as uncovering successive multi-level aggregates of data, and uses properties of higher-level structures to help close the... |

3 | Gaussian Process Models of Spatial Aggregation Algorithms
- Ramakrishnan, Bailey-Kellogg
- 2003
Citation Context ... correspondence abstraction, clustering, and uncovering regions of uniformity can be expressed as multi-level spatial aggregate computations. 1.1 Contributions: This paper builds on our prior work in [1, 23] by presenting a novel integration of Gaussian processes with SAL: • While classical active mining work in spatial modeling focuses on quality of function approximation, the mechanism presented here f... |

3 | Physics-Based Encapsulation in Embedded Software for Distributed Sensing and Control Applications
- Zhao, Bailey-Kellogg, et al.
Citation Context ...c. 5 evaluates the mechanism using both synthetic and real-world datasets. Sec. 6 provides a discussion and reviews related work. 2 Spatial Aggregation Language The Spatial Aggregation Language (SAL) [3, 28, 30] is a generic framework to study the design and implementation of spatial data mining algorithms. SAL is centered on a field ontology, in which the spatial data input is a field mapping from one conti... |

2 | Spatial Aggregation for Qualitative Assessment of Scientific Computations - Bailey-Kellogg, Ramakrishnan - 2004 |

1 | Using Hierarchical Data Mining to Characterize Performance of Wireless System Configurations
- Verstak, et al.
- 2002
Citation Context ... them will actually contain suboptimal configurations. We adopt an experimental methodology similar to that in the previous case studies, and created an ‘oracle’ from the simulation data described in [26]. [Figure 10: Estimates of BER performance in a space of wireless system configurations; axes: SNR1 (dB), SNR2 (dB), log(BER).] Fig. 10 demonstrates that the dataset is quit... |