bdbms: A database management system for biological data
In CIDR, 2007
Cited by 29 (5 self)
Biologists are increasingly using databases for storing and managing their data. Biological databases typically consist of a mixture of raw data, metadata, sequences, annotations, and related data obtained from various sources. Current database technology lacks several functionalities that are needed by biological databases. In this paper, we introduce bdbms, an extensible prototype database management system for supporting biological data. bdbms extends the functionalities of current DBMSs with: (1) Annotation and provenance management including storage, indexing, manipulation, and querying of annotation and provenance as first class objects in bdbms, (2) Local dependency tracking to track the dependencies and derivations among data items, (3) Update authorization to support data curation via content-based authorization, in contrast to identity-based authorization, and (4) New access methods and their supporting operators that support pattern matching on various types of compressed biological data types. This paper presents the design of bdbms along with the techniques proposed to support these functionalities including an extension to SQL. We also outline some open issues in building bdbms.
SaIL: A spatial index library for efficient application integration
GeoInformatica, 2005
Cited by 14 (0 self)
With the proliferation of spatial and spatio-temporal data that are produced every day by a wide range of applications, Geographic Information Systems (GIS) have to cope with millions of objects with diverse spatial characteristics. Clearly, under these circumstances, substantial performance speedup can be achieved with the use of spatial, spatio-temporal and other multidimensional indexing techniques. Due to the increasing research effort on developing new indexing methods, the number of available alternatives is becoming overwhelming, making the task of selecting the most appropriate method for indexing the data according to application needs rather challenging. Therefore, developing a library that can combine a variety of indexing techniques under a common application programming interface can prove to be a valuable tool. In this paper we present SaIL (SpAtial Index Library), an extensible framework that enables easy integration of spatial and spatio-temporal index structures into existing applications. We focus on design issues and elaborate on techniques for making the framework generic enough, so that it can support user defined data types, customizable spatial queries, and a broad range of spatial (and spatio-temporal) index structures, in a way that does not compromise functionality, extensibility and, primarily, ease of use. SaIL is publicly available and has already been successfully utilized for research and commercial applications.
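The idea of putting several index structures behind one programming interface can be illustrated with a minimal Python sketch. Every name here (`SpatialIndex`, `LinearScanIndex`, `GridIndex`, `count_hits`) is hypothetical and chosen for illustration only; this is not SaIL's actual API, just one way such a common interface might look.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Set, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class SpatialIndex(ABC):
    """Hypothetical common interface shared by all index structures."""

    @abstractmethod
    def insert(self, oid: int, rect: Rect) -> None: ...

    @abstractmethod
    def intersects(self, query: Rect) -> List[int]: ...


class LinearScanIndex(SpatialIndex):
    """Baseline: checks every stored rectangle against the query."""

    def __init__(self) -> None:
        self.items: Dict[int, Rect] = {}

    def insert(self, oid: int, rect: Rect) -> None:
        self.items[oid] = rect

    def intersects(self, query: Rect) -> List[int]:
        return sorted(o for o, r in self.items.items() if overlaps(r, query))


class GridIndex(SpatialIndex):
    """Uniform grid: each rectangle is registered in every cell it touches."""

    def __init__(self, cell: float) -> None:
        self.cell = cell
        self.items: Dict[int, Rect] = {}
        self.cells: Dict[Tuple[int, int], List[int]] = {}

    def _cells(self, r: Rect) -> List[Tuple[int, int]]:
        x0, y0 = int(r[0] // self.cell), int(r[1] // self.cell)
        x1, y1 = int(r[2] // self.cell), int(r[3] // self.cell)
        return [(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]

    def insert(self, oid: int, rect: Rect) -> None:
        self.items[oid] = rect
        for c in self._cells(rect):
            self.cells.setdefault(c, []).append(oid)

    def intersects(self, query: Rect) -> List[int]:
        hits: Set[int] = set()
        for c in self._cells(query):
            for oid in self.cells.get(c, []):
                if overlaps(self.items[oid], query):
                    hits.add(oid)
        return sorted(hits)


def count_hits(index: SpatialIndex, query: Rect) -> int:
    """Application code depends only on the interface, not on the index type."""
    return len(index.intersects(query))
```

Because `count_hits` is written against the abstract interface, an application could swap one index implementation for another without changing its own code, which is the kind of flexibility the abstract describes.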
Space-Partitioning Trees in PostgreSQL: Realization and Performance
In Proc. of the 22nd International Conference on Data Engineering (ICDE’06), 2006
Cited by 5 (1 self)
Many evolving database applications warrant the use of non-traditional indexing mechanisms beyond B+-trees and hash tables. SP-GiST is an extensible indexing framework that broadens the class of supported indexes to include disk-based versions of a wide variety of space-partitioning trees, e.g., disk-based trie variants, quadtree variants, and kd-trees. This paper presents a serious attempt at implementing and realizing SP-GiST-based indexes inside PostgreSQL. Several index types are realized inside PostgreSQL facilitated by rapid SP-GiST instantiations. Challenges, experiences, and performance issues are addressed in the paper. Performance comparisons are conducted from within PostgreSQL to compare the update and search performance of SP-GiST-based indexes against the B+-tree and the R-tree for string, point, and line segment data sets. Interesting results that highlight the potential performance gains of SP-GiST-based indexes are presented in the paper.
Video query processing in the vdbms testbed for video database research
In MMDB ’03: Proceedings of the 1st ACM International Workshop on Multimedia Databases, 2003
Cited by 4 (0 self)
The increased use of video data sets for multimedia-based applications has created a demand for strong video database support, including efficient methods for handling the content-based query and retrieval of video data. Video query processing presents significant research challenges, mainly associated with the size, complexity and unstructured nature of video data. A video query processor must support video operations for search by content and streaming, new query types, and the incorporation of video methods and operators in generating, optimizing and executing query plans. In this paper, we address these query processing issues in two contexts, first as applied to the video data type and then as applied to the stream data type. We first present the query processing functionality of the VDBMS video database management system as a framework designed to support the full range of functionality for
A Video Database Management System for Advancing Video Database Research
In Proc. of the Int. Workshop on Management Information Systems, Nov. 2002
Cited by 3 (2 self)
The most useful environments for advancing research and development in video databases are those that provide complete video database management, including (1) video preprocessing for content representation and indexing, (2) storage management for video, metadata and indices, (3) image and semantic-based query processing, (4) real-time buffer management, and (5) continuous media streaming. Such environments support the entire process of investigating, implementing, analyzing and evaluating new techniques, thus identifying in a concrete way which techniques are truly practical and robust. In this paper we present a video database research initiative that culminated in the successful development of VDBMS, a video database research platform that supports comprehensive and efficient database management for digital video. We describe key video processing components of the system and illustrate the value of VDBMS as a research platform by describing several research projects carried out within the VDBMS environment. These include MPEG-7 document support for video feature import and export, a new query operator for optimal multi-feature image similarity matching, secure access control for streaming video, and the mining of medical video data using hierarchical content organization.
Main-Memory Query Processing Utilizing External Indexes
Many applications require storage and indexing of new kinds of data in main memory, e.g., color histograms, textures, shape features, gene sequences, sensor readings, or financial time series. Even though many domain index structures have been developed, very few of them are implemented in any database management system (DBMS), which usually offers only B-trees and hash indexes. A major reason is that the manual effort to include a new index implementation in a regular DBMS is very costly and time-consuming because it requires integration with all components of the DBMS kernel. To alleviate this, there are some extensible indexing frameworks. However, they all require re-engineering the index implementations, which is a problem when the index has third-party ownership, when only binary code is available, or simply when the index implementation is complex to re-engineer. Therefore, the DBMS should allow including new index implementations without code changes and performance degradation. Furthermore, for high performance the query processor needs knowledge of how to process queries to utilize a plugged-in index. Moreover, it is important that all functionalities of a plugged-in index implementation are correct. The extensible main-memory database system (MMDB) Mexima (Main-memory External Index Manager) addresses these challenges. It enables transparent plugging-in of main-memory index implementations without code changes. Index-specific rewrite rules transform complex queries to utilize the indexes. Automatic test procedures validate the correctness of these rules based on user-provided index meta-data. Moreover, the same optimization framework can also optimize complex queries sent to a back-end DBMS by exposing hidden indexes for its query optimizer. Altogether, Mexima is a complete and extensible platform for transparent index integration, utilization, and evaluation.
Duplicate Elimination in Space-partitioning Tree Indexes
Space-partitioning trees, like the disk-based trie, quadtree, kd-tree and their variants, are a family of access methods that index multi-dimensional objects. In the case of indexing non-zero extent objects, e.g., line segments and rectangles, space-partitioning trees may replicate objects over multiple space partitions, e.g., PMR quadtree, expanded MX-CIF quadtree, and extended kd-tree. As a result, the answer to a query over these indexes may include duplicates that need to be eliminated, i.e., the same object may be reported more than once. In this paper, we propose generic duplicate elimination techniques for the class of space-partitioning trees in the context of SP-GiST, an extensible indexing framework for realizing space-partitioning trees. The proposed techniques are embedded inside the INDEX-SCAN operator. Therefore, duplicate copies of the same object do not propagate in the query plan, and the elimination process is transparent to the end-users. Two cases for the index structures are considered based on whether or not the objects' coordinates are stored inside the index tree. The theoretical and experimental analyses illustrate that the proposed techniques achieve savings in the storage requirements, I/O operations, and processing time when compared to adding a separate duplicate elimination operator in the query plan.
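To make the duplicate problem concrete, here is a minimal Python sketch of a toy partitioned index that replicates rectangles across cells and removes duplicates inside the scan itself. It uses a plain seen-set, which is a simple stand-in and not necessarily one of the techniques proposed in the paper; the class `ReplicatingGrid` and its methods are hypothetical names, and a uniform grid is assumed in place of a real space-partitioning tree.

```python
from typing import Dict, Iterator, List, Set, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class ReplicatingGrid:
    """Toy partitioned index: an object is stored in every cell it overlaps."""

    def __init__(self, cell: float) -> None:
        self.cell = cell
        self.cells: Dict[Tuple[int, int], List[int]] = {}
        self.rects: Dict[int, Rect] = {}

    def _cells_for(self, r: Rect) -> Iterator[Tuple[int, int]]:
        x0, y0 = int(r[0] // self.cell), int(r[1] // self.cell)
        x1, y1 = int(r[2] // self.cell), int(r[3] // self.cell)
        for cx in range(x0, x1 + 1):
            for cy in range(y0, y1 + 1):
                yield (cx, cy)

    def insert(self, oid: int, r: Rect) -> None:
        self.rects[oid] = r
        for c in self._cells_for(r):
            self.cells.setdefault(c, []).append(oid)

    def scan(self, query: Rect, dedup: bool = True) -> Iterator[int]:
        """Index scan; with dedup=True each object is reported at most once."""
        seen: Set[int] = set()
        for c in self._cells_for(query):
            for oid in self.cells.get(c, []):
                if not overlaps(self.rects[oid], query):
                    continue
                if dedup:
                    if oid in seen:
                        continue  # already reported from another partition
                    seen.add(oid)
                yield oid


grid = ReplicatingGrid(cell=1.0)
grid.insert(1, (0.5, 0.5, 1.5, 1.5))  # spans four cells, so it is replicated
grid.insert(2, (0.1, 0.1, 0.2, 0.2))  # fits in one cell
query = (0.0, 0.0, 2.0, 2.0)
print(list(grid.scan(query, dedup=False)))  # object 1 appears once per cell
print(list(grid.scan(query)))               # each object appears once
```

Filtering inside `scan` means the operators that consume its output never see the replicated copies, which mirrors the abstract's point about keeping duplicates out of the rest of the query plan.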
ML-Quadtree: The Design of an Efficient Access Method for Spatial Database Systems
2003
The aim of this paper is to present a new indexing technique that provides efficient support for retrieving and handling spatial data. Traditionally, the mapping between layers (from a thematic point of view) and index structures is one to one: each layer is associated with an index structure. In previous work, we presented a data structure, the FI-Quadtree, that handles a set of images using only one index structure. This handling uses a raster-oriented format. In this paper, we focus on the processing of these objects from the vector-oriented format point of view. The Multi-Layer Quadtree (ML-Quadtree) is a new data structure that allows the storage and processing of several layers at the same time. This structure is based on the PM-Quadtree, which allows the storage of only a single-layer map. The aim of the ML-Quadtree is to be able to manage, store and perform queries among multiple layers simultaneously. The design and the manipulation of the proposed structure are presented in this paper, whereas the implementation and the experimental results will be treated in a subsequent paper.
SaIL: A Library for Efficient Application
Many scientific applications deal with spatial, spatiotemporal and other multidimensional indexing structures, typically managing millions of objects with arbitrary and complex features. Choosing the appropriate method to index such data becomes rather difficult. Having an index library that can combine different indices under the same programming interface is thus very valuable. In this paper we present SaIL (SpAtial Index Library), a robust and extensible library that enables simple integration of spatial index structures in existing applications. We mainly focus on design issues and elaborate on techniques for making the framework generic enough, so that it can support user defined data types, customizable spatial queries, and a broad range of spatial (and spatio-temporal) index structures. The library is publicly available and has already been successfully utilized for research and commercial applications.