
Ambiguity-Directed Sampling for Qualitative Analysis of Sparse Data from Spatially Distributed Physical Systems (2001)

by C Bailey-Kellogg, N Ramakrishnan
Venue: In Proc. IJCAI
Results 1 - 8 of 8

Qualitative spatial reasoning: extracting and reasoning with spatial aggregates

by Chris Bailey-Kellogg - AI Magazine, 2003
Cited by 10 (0 self)
Reasoning about spatial data is a key task in many applications, including geographic information systems, meteorological and fluid flow analysis, computer-aided design, and protein structure databases. Such applications often require the identification and manipulation of qualitative spatial representations, for example, to detect whether one “object” will soon occlude another in a digital image, or to efficiently determine relationships between a proposed road and wetland regions in a geographic data set. Qualitative spatial reasoning (QSR) provides representational primitives (a spatial “vocabulary”) and inference mechanisms for these tasks. This paper first reviews representative work on QSR for data-poor scenarios, where the goal is to design representations that can answer qualitative queries without much numerical information. It then turns to the data-rich case, where the goal is to derive and manipulate qualitative spatial representations that efficiently and correctly abstract important spatial aspects of the underlying data, for use in subsequent tasks. This paper focuses on how a particular QSR system, Spatial Aggregation (SA), can help answer spatial queries for scientific and engineering data sets. A case study application of weather analysis illustrates the effective representation and reasoning supported by both data-poor and data-rich forms of QSR.

Gaussian Processes for Active Data Mining of Spatial Aggregates

by Naren Ramakrishnan, Chris Bailey-Kellogg, Satish Tadepalli, Varun N. Pandey - In Proceedings of the SIAM International Conference on Data Mining, 2005
Cited by 9 (1 self)
We present an active data mining mechanism for qualitative analysis of spatial datasets, integrating identification and analysis of structures in spatial data with targeted collection of additional samples. The mechanism is designed around the spatial aggregation language (SAL) for qualitative spatial reasoning, and seeks to uncover high-level spatial structures from only a sparse set of samples. This approach is important for applications in domains such as aircraft design, wireless system simulation, fluid dynamics, and sensor networks. The mechanism employs Gaussian processes, a formal mathematical model for reasoning about spatial data, in order to build surrogate models from sparse data, reason about the uncertainty of estimation at unsampled points, and formulate objective criteria for closing-the-loop between data collection and data analysis. It optimizes sample selection using entropy-based functionals defined over spatial aggregates instead of the traditional approach of sampling to minimize estimated variance. We apply this mechanism on a global optimization benchmark comprising a testbank of 2D functions, as well as on data from wireless system simulations. The results reveal that the proposed sampling strategy makes more judicious use of data points by selecting locations that clarify high-level structures in data, rather than choosing points that merely improve quality of function approximation.
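The abstract above contrasts its entropy-based criterion with "the traditional approach of sampling to minimize estimated variance". As a rough illustration of that baseline only (not the authors' mechanism), the sketch below fits a one-dimensional Gaussian process with a squared-exponential kernel and picks the candidate location with the largest posterior variance. The kernel choice, the `length` and `noise` parameters, and every function name are assumptions made for this example.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential covariance between two 1D locations (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def posterior_variance(x_star, xs, length=1.0, noise=1e-6):
    """GP predictive variance at x_star given sampled locations xs.

    The variance depends only on where samples were taken, not on their values.
    """
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(x_star, a, length) for a in xs]
    alpha = solve(K, k_star)  # alpha = K^{-1} k_star
    return rbf(x_star, x_star, length) - sum(k * a for k, a in zip(k_star, alpha))

def next_sample(candidates, xs, **kw):
    """Variance-driven baseline: sample where the model is least certain."""
    return max(candidates, key=lambda x: posterior_variance(x, xs, **kw))

if __name__ == "__main__":
    sampled = [0.0, 1.0, 4.0]
    print(next_sample([0.5, 2.5, 3.5], sampled))
```

This criterion only improves the function approximation; a structure-aware strategy of the kind the abstract describes would instead score candidates by their effect on the spatial aggregates.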

Sampling Strategies for Mining in Data-Scarce Domains

by Naren Ramakrishnan, Chris Bailey-Kellogg, 2002
Cited by 8 (6 self)
Figure 2. The Spatial Aggregation Language's multilayer spatial aggregates, uncovered by a uniform vocabulary of operators using domain knowledge. We can express several scientific data mining tasks (such as vector field bundling, contour aggregation, correspondence abstraction, clustering, and uncovering regions of uniformity) as multilevel computations with SAL aggregates.
... of the hierarchy (streamlines bundled with convergent flows). The SAL program computes different streamline aggregations from a neighborhood graph and chooses one on the basis of how well its curvature matches the direction of the vectors it aggregates. If data is scarce, some of these classification decisions could be ambiguous: multiple streamline aggregations might exist. In such a case, we would want to choose a new data sample that reduces the ambiguity and clarifies what the correct classification should be. This is the essence of our sampling methodology: using SAL aggregates, we identify an information-theoretic measure (here, ambiguity) that can drive stages of future data collection. For instance, we can summarize the ambiguous streamline classifications as a 2D ambiguity distribution that has a spike for every location where we detected an ambiguity. Ambiguity reduction is a problem of minimizing (or maximizing, as the case may be) a functional involving the computed ambiguity. The functional could be the underlying data field's entropy, as the ambiguity distribution reveals. Such a minimization will lead us to select a data point that clarifies the distribution of streamlines, and hence that more effectively uses data for data mining purposes. This methodology's net effect is that we can ... design's desi...
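To make the idea of an entropy functional over ambiguous classifications concrete, here is a minimal sketch. It assumes each location carries a list of non-negative scores for its competing streamline classifications, and that sampling the highest-entropy location is a reasonable proxy for ambiguity reduction. The data layout and the names `entropy` and `most_ambiguous` are hypothetical and are not taken from SAL.

```python
import math

def entropy(weights):
    """Shannon entropy (in bits) of non-negative weights, normalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)

def most_ambiguous(class_scores):
    """Return the location whose competing classifications are hardest to tell apart.

    class_scores maps a location to the scores of its candidate classifications
    (an assumed representation of the ambiguity distribution).
    """
    return max(class_scores, key=lambda loc: entropy(class_scores[loc]))

if __name__ == "__main__":
    scores = {
        (0, 0): [1.0],        # one classification: no ambiguity
        (2, 3): [3.0, 1.0],   # one aggregation clearly preferred
        (5, 1): [1.0, 1.0],   # two equally plausible aggregations
    }
    print(most_ambiguous(scores))
```

A location with a single candidate classification has zero entropy, so it never attracts a new sample under this criterion.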

Gaussian Process Models of Spatial Aggregation Algorithms

by Naren Ramakrishnan - In Proc. IJCAI, 2003
Cited by 2 (2 self)
Multi-level spatial aggregates are important for data mining in a variety of scientific and engineering applications, from analysis of weather data (aggregating temperature and pressure data into ridges and fronts) to performance analysis of wireless systems (aggregating simulation results into configuration space regions exhibiting particular performance characteristics). In many of these applications, data collection is expensive and time consuming, so effort must be focused on gathering samples at locations that will be most important for the analysis. This requires that we be able to functionally model a data mining algorithm in order to assess the impact of potential samples on the mining of suitable spatial aggregates. This paper describes a novel Gaussian process approach to modeling multi-layer spatial aggregation algorithms, and demonstrates the ability of the resulting models to capture the essential underlying qualitative behaviors of the algorithms. By helping cast classical spatial aggregation algorithms in a rigorous quantitative framework, the Gaussian process models support diverse uses such as directed sampling, characterizing the sensitivity of a mining algorithm to particular parameters, and understanding how variations in input data fields percolate up through a spatial aggregation hierarchy.

Active Data Mining of Correspondence for Qualitative Assessment of Scientific Computations

by Chris Bailey-Kellogg, Naren Ramakrishnan - In Proc. QR
Cited by 1 (1 self)
Active data mining constructs and evaluates possible models explaining a dataset, and reasons about the cost and impact of additional samples on refining and selecting among the models. It is particularly appropriate for applications characterized by expensive data collection, from either experiment or simulation. This paper develops an active mining mechanism based on a multi-level, qualitative analysis of correspondence.

Spatial Aggregation for Qualitative Assessment of Scientific Computations

by Chris Bailey-Kellogg, Naren Ramakrishnan - In Proc. AAAI, 2004
Qualitative assessment of scientific computations is an emerging application area that applies a data-driven approach to characterize, at a high level, phenomena including conditioning of matrices, sensitivity to various types of error propagation, and algorithmic convergence behavior. This paper develops a spatial aggregation approach that formalizes such analysis in terms of model selection utilizing spatial structures extracted from matrix perturbation datasets. We focus in particular on the characterization of matrix eigenstructure, both analyzing sensitivity of computations with spectral portraits and determining eigenvalue multiplicity with Jordan portraits. Our approach employs spatial reasoning to overcome noise and sparsity by detecting mutually reinforcing interpretations, and to guide subsequent data sampling. It enables quantitative evaluation of properties of a scientific computation in terms of confidence in a model, explainable in terms of the sampled data and domain knowledge about the underlying mathematical structure. Not only is our methodology more rigorous than the common approach of visual inspection, but it also is often substantially more efficient, due to well-defined stopping criteria. Results show that the mechanism efficiently samples perturbation space and successfully uncovers high-level properties of matrices.
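For readers unfamiliar with spectral portraits, the sketch below evaluates one common form of the quantity, log10 of the resolvent norm ||(A - zI)^-1||_2, which equals -log10 of the smallest singular value of A - zI. Some definitions also multiply by ||A||; that factor is omitted here. To stay dependency-free the example is restricted to 2x2 matrices, where the singular values follow in closed form from the Frobenius norm and the determinant. The function names are hypothetical, and nothing here reproduces the paper's sampling or model-selection mechanism.

```python
import math

def sigma_min_2x2(m):
    """Smallest singular value of a 2x2 (real or complex) matrix.

    Uses s1^2 + s2^2 = ||m||_F^2 and s1 * s2 = |det m|.
    """
    (a, b), (c, d) = m
    frob2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det = abs(a * d - b * c)
    disc = max(frob2 * frob2 - 4.0 * det * det, 0.0)  # clamp rounding noise
    sigma_max = math.sqrt((frob2 + math.sqrt(disc)) / 2.0)
    return det / sigma_max if sigma_max > 0 else 0.0

def portrait_value(A, z):
    """log10 of the resolvent norm of A at z; infinite when z is an eigenvalue."""
    (a, b), (c, d) = A
    s = sigma_min_2x2(((a - z, b), (c, d - z)))
    return math.inf if s == 0 else -math.log10(s)

def spectral_portrait(A, re_values, im_values):
    """Evaluate the portrait on a rectangular grid of the complex plane."""
    return [[portrait_value(A, complex(x, y)) for x in re_values]
            for y in im_values]

if __name__ == "__main__":
    jordan = ((2, 1), (0, 2))    # defective double eigenvalue at 2
    diagonal = ((2, 0), (0, 2))  # same eigenvalue, non-defective
    print(portrait_value(jordan, 2.1), portrait_value(diagonal, 2.1))
```

Near the shared eigenvalue the Jordan block yields a larger portrait value than the diagonal matrix, which is the kind of sensitivity difference such portraits are meant to expose.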

Gaussian Processes for Active Data Mining of Spatial Aggregates

by Naren Ramakrishnan, Chris Bailey-Kellogg, Satish Tadepalli, Varun N. Pandey - In Proc. 18th Int’l Workshop on Qualitative Reasoning, 2004
We present an active data mining mechanism for qualitative analysis of spatial datasets, integrating identification and analysis of structures in spatial data with targeted collection of additional samples. The mechanism is designed around the spatial aggregation language (SAL) for qualitative spatial reasoning, and seeks to uncover high-level spatial structures from only a sparse set of samples. This approach is important for applications in domains such as aircraft design, wireless system simulation, fluid dynamics, and sensor networks. The mechanism employs Gaussian processes, a formal mathematical model for reasoning about spatial data, in order to build surrogate models from sparse data, reason about the uncertainty of estimation at unsampled points, and formulate objective criteria for closing-the-loop between data collection and data analysis. It optimizes sample selection using entropy-based functionals defined over spatial aggregates instead of the traditional approach of sampling to minimize estimated variance. We apply this mechanism on a global optimization benchmark comprising a testbank of 2D functions, as well as on data from wireless system simulations. The results reveal that the proposed sampling strategy makes more judicious use of data points by selecting locations that clarify high-level structures in data, rather than choosing points that merely improve quality of function approximation.

Data Mining: Sampling Strategies for Mining in Data-Scarce Domains

by Chris Bailey-Kellogg
A novel framework leverages physical properties for mining in data-scarce domains. It interleaves bottom-up data mining with top-down data collection, leading to effective and explainable sampling strategies.