Digital Dynamic Telepathology -- the Virtual Microscope
, 1998
Cited by 58 (31 self)
this paper, we concentrate on how the system manipulates and displays high power, high resolution histopathology datasets. The Virtual Microscope employs a client/server architecture. The client software runs on an end user's PC or workstation, while the database software for storing, retrieving and processing the microscope image data runs on a variety of platforms. The database software can run on a PC at the end user's site, or on a potentially remote high performance parallel or distributed computer. The database software is further decomposed into two parts -- a frontend process that accepts queries from clients and one or more backend processes that store and retrieve the
Querying Very Large Multi-dimensional Datasets in ADR
, 1999
Cited by 25 (9 self)
Applications that make use of very large scientific datasets have become an increasingly important subset of scientific applications. In these applications, datasets are often multi-dimensional, i.e., data items are associated with points in a multi-dimensional attribute space, and access to data items is described by range queries. The basic processing involves mapping input data items to output data items, and some form of aggregation of all the input data items that project to each output data item. We have developed an infrastructure, called the Active Data Repository (ADR), that integrates storage, retrieval and processing of multi-dimensional datasets on distributed-memory parallel architectures with multiple disks attached to each node. In this paper we address efficient execution of range queries on distributed-memory parallel machines within the ADR framework. We present three potential strategies, and evaluate them under different application scenarios and machine co...
Object-relational Queries into Multidimensional Databases with the Active Data Repository
, 1999
Cited by 22 (7 self)
As computational power and storage capacity increase, processing and analyzing large volumes of multi-dimensional datasets play an increasingly important role in many domains of scientific research. Scientific applications that make use of very large scientific datasets have several important characteristics: datasets consist of complex data and are usually multi-dimensional; applications usually retrieve a subset of all the data available in the dataset; various application-specific operations are performed on the data items retrieved. Such applications can be supported by object-relational database management systems (OR-DBMSs). In addition to providing functionality to define new complex datatypes and user-defined functions, an OR-DBMS for scientific datasets should contain runtime support that will provide optimized storage for very large datasets and an execution environment for user-defined functions involving expensive operations. In this paper we describe an infrastructure, the ...
A middleware for developing parallel data mining implementations
- In Proceedings of the first SIAM conference on Data Mining
, 2001
Cited by 17 (10 self)
Data mining is an interdisciplinary field, having applications in diverse areas like bioinformatics, medical informatics, scientific data analysis, financial analysis, consumer profiling, etc. In each of these application domains, the amount of data available for analysis has exploded in recent years, making the scalability of data
Query Planning for Range Queries with User-defined Aggregation on Multi-dimensional Scientific Datasets
, 1999
Cited by 8 (6 self)
Applications that make use of very large scientific datasets have become an increasingly important subset of scientific applications. In these applications, the datasets are often multi-dimensional, i.e., data items are associated with points in a multi-dimensional attribute space. The processing is usually highly stylized, with the basic processing steps consisting of (1) retrieval of a subset of all available data in the input dataset via a range query, (2) projection of each input data item to one or more output data items, and (3) some form of aggregation of all the input data items that project to each output data item. We have developed an infrastructure, called the Active Data Repository (ADR), that integrates storage, retrieval and processing of multi-dimensional datasets on shared-nothing architectures. In this paper we address query planning and execution strategies for range queries with user-defined processing. We evaluate three potential query planning strategies withi...
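The three processing steps named in this abstract (range query, projection, aggregation) can be illustrated with a small single-process sketch. This is not ADR's API or its query planner: the function name `run_range_query`, its parameters, and the tile-based projection in the example are all hypothetical, chosen only to make the pattern concrete.

```python
from collections import defaultdict


def run_range_query(items, lo, hi, project, aggregate, init):
    """Hypothetical sketch of the retrieve / project / aggregate pattern.

    items     -- iterable of (point, value); point is a tuple of coordinates
    lo, hi    -- corners of the axis-aligned query box (inclusive)
    project   -- maps an input point to one or more output cells
    aggregate -- user-defined function combining (accumulator, value)
    init      -- initial accumulator value for every output cell
    """
    out = defaultdict(lambda: init)
    for point, value in items:
        # (1) Range query: keep only items whose point lies inside the box.
        if all(l <= c <= h for l, c, h in zip(lo, point, hi)):
            # (2) Projection: one input item may map to several output cells.
            for cell in project(point):
                # (3) Aggregation over all inputs that project to this cell.
                out[cell] = aggregate(out[cell], value)
    return dict(out)


# Example (assumed data): 2-D points, outputs are 2x2 tiles, aggregation is max.
items = [((0, 0), 3), ((1, 1), 7), ((2, 3), 5), ((9, 9), 100)]
result = run_range_query(
    items,
    lo=(0, 0),
    hi=(3, 3),
    project=lambda p: [(p[0] // 2, p[1] // 2)],
    aggregate=max,
    init=float("-inf"),
)
```

In this example the point (9, 9) falls outside the query box and is skipped, while (0, 0) and (1, 1) project to the same output tile and are combined by the aggregation function. A parallel system would additionally have to decide how input and output items are partitioned across nodes, which is the planning question the abstract refers to.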
Compiler Supported High-level Abstractions for Sparse Disk-Resident Datasets
, 2001
Cited by 5 (3 self)
Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. The complexity and irregularity of datasets in many domains make the task of developing such processing applications tedious and error-prone.
Compiling Data Intensive Applications with Spatial Coordinates
- In Proceedings of Languages and Compilers for Parallel Computing
, 2000
Cited by 4 (2 self)
Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. We are developing a compiler which processes data intensive applications written in a dialect of Java and compiles them for efficient execution on a cluster of workstations or distributed memory machines.
Compiler and Runtime Analysis for Efficient Communication in Data Intensive Applications
Cited by 4 (1 self)
Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. We are developing a compiler that processes data intensive applications written in a dialect of Java and compiles them for efficient execution on distributed memory parallel machines.
ArrayStore: A storage manager for complex parallel array processing
, 2011
Cited by 4 (1 self)
We present the design, implementation, and evaluation of ArrayStore, a new storage manager for complex, parallel array processing. ArrayStore builds on prior work in the area of multidimensional data storage, but considers the new problem of supporting a parallel and more varied workload comprising not only range-queries, but also binary operations such as joins and complex user-defined functions. This paper makes two key contributions. First, it examines several existing single-site storage management strategies and array partitioning strategies to identify which combination is best suited for the array-processing workload above. Second, it develops a new and efficient storage-management mechanism that enables parallel processing of operations that must access data from adjacent partitions. We evaluate ArrayStore on over 80GB of real data from two scientific domains and real operators used in these domains. We show that ArrayStore outperforms previously proposed storage management strategies in the context of its diverse target workload.
Language Extensions and Compilation Techniques for Data Intensive Computations
- In Proceedings of Workshop on Compilers for Parallel Computing
, 2000
Cited by 3 (2 self)
Processing and analyzing large volumes of data plays an increasingly important role in many

