XGRIND: A Query-friendly XML Compressor
- IN ICDE
, 2002
Cited by 71 (0 self)
XML documents are extremely verbose since the "schema" is repeated for every "record" in the document. While a variety of compressors are available to address this problem, they are not designed to support direct querying of the compressed document, a useful feature from a database perspective. In this paper, we propose a new compression tool called XGrind that directly supports queries in the compressed domain. A special feature of XGrind is that the compressed document retains the structure of the original document, permitting reuse of the standard XML techniques for processing the compressed document. Performance evaluation over a variety of XML documents and user queries indicates that XGrind simultaneously delivers improved query processing times and reasonable compression ratios.
Client-server paradise
- In Proceedings of the 20th VLDB Conference
, 1994
Cited by 67 (7 self)
This paper describes the design and implementation of Paradise, a database system designed for handling GIS-type applications. The current version of Paradise uses a client-server architecture and provides an extended-relational data model for modeling GIS applications. Paradise supports an extended version of SQL and provides a graphical user interface for querying and browsing the database. We also describe the results of benchmarking Paradise using the Sequoia 2000 storage benchmark.
Super-Scalar RAM-CPU Cache Compression
- In Proceedings of the International Conference of Data Engineering (IEEE ICDE)
, 2006
Cited by 49 (12 self)
CWI is a founding member of ERCIM, the European Research Consortium for Informatics and Mathematics. CWI's research has a theme-oriented structure and is grouped into four clusters. Listed below are the names of the clusters and in parentheses their acronyms.
The Implementation and Performance of Compressed Databases
, 1998
Cited by 44 (5 self)
In this paper, we show how compression can be integrated into a relational database system. Specifically, we describe how the storage manager, the query execution engine, and the query optimizer of a database system can be extended to deal with compressed data. Our main result is that compression can significantly improve the response time of queries if very light-weight compression techniques are used. We will present such light-weight compression techniques and give the results of running the TPC-D benchmark on a so compressed database and a non-compressed database using the AODB database system, an experimental database system that was developed at the Universities of Mannheim and Passau. Our benchmark results demonstrate that compression indeed offers high performance gains (up to 55%) for IO-intensive queries and moderate gains for CPU-intensive queries. Compression can, however, also increase the running time of certain update operations. In all, we recommend to extend today's da...
Data Compression Support in Databases
- In Proceedings of the 20th International Conference on Very Large Data Bases
, 1994
Cited by 35 (1 self)
Computers running database management applications often manage large amounts of data. Typically, the price of the I/O sub-system is a considerable portion of the computing hardware. Fierce price competition demands every possible savings. Lossless data compression methods, when appropriately integrated with the dbms, yield significant savings. Roughly speaking, a slight increase in cpu cycles is more than offset by savings in I/O subsystem. Various design issues arise in the use of data compression in the dbms, from the choice of algorithm, statistics collection, hardware versus software based compression, location of the compression function in the overall computer system architecture, unit of compression, update in place, and the application of log to compressed data. These are methodically examined and trade-offs discussed in the context of choices made for IBM's DB2 dbms product.
Performance tradeoffs in read-optimized databases
- In VLDB 2006: Proceedings of the 32nd international conference on Very large data bases
, 2006
Cited by 31 (11 self)
Database systems have traditionally optimized performance for write-intensive workloads. Recently, there has been renewed interest in architectures that optimize read performance by using column-oriented data representation and light-weight compression. This previous work has shown that under certain broad classes of workloads, column-based systems can outperform row-based systems. Previous work, however, has not characterized the precise conditions under which a particular query workload can be expected to perform better on a column-oriented database. In this paper we first identify the distinctive components of a read-optimized DBMS and describe our implementation of a high-performance query engine that can operate on both row and column-oriented data. We then use our prototype to perform an in-depth analysis of the tradeoffs between column and row-oriented architectures. We explore these tradeoffs in terms of disk bandwidth, CPU cache latency, and CPU cycles. We show that for most database workloads, a carefully designed column system can outperform a carefully designed row system, sometimes by an order of magnitude. We also present an analytical model to predict whether a given workload on a particular hardware configuration is likely to perform better on a row or column-based system.
Query Optimization In Compressed Database Systems
- In ACM SIGMOD
, 2001
Cited by 28 (0 self)
Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techniques to improve the performance of database systems. Previous work describes the benefits of compression for numerical attributes, where data is stored in compressed format on disk. Despite the abundance of string-valued attributes in relational schemas there is little work on compression for string attributes in a database context. Moreover, none of the previous work suitably addresses the role of the query optimizer: During query execution, data is either eagerly decompressed when it is read into main memory, or data lazily stays compressed in main memory and is decompressed on demand only. In this paper, we present an effective approach for database compression based on lightweight, attribute-level compression techniques. We propose a Hierarchical Dictionary Encoding strategy that intelligently selects the most effective compression method for string-valued attributes. We show that eager and lazy decompression strategies produce suboptimal plans for queries involving compressed string attributes. We then formalize the problem of compression-aware query optimization and propose one provably optimal and two fast heuristic algorithms for selecting a query plan for relational schemas with compressed attributes; our algorithms can easily be integrated into existing cost-based query optimizers. Experiments using TPC-H data demonstrate the impact of our string compression methods and show the importance of compression-aware query optimization. Our approach results in up to an order speed up over existing approaches.
Database Compression: A Performance Enhancement Tool
- Proc. of 7th Intl. Conf. on Management of Data (COMAD)
, 1995
Cited by 18 (1 self)
Compression is typically used for databases that have grown large enough to create a strain on system storage capacity. We argue here that database compression is attractive from a query processing viewpoint also and should therefore be implemented even when disk storage is plentiful. We study the compression ratio and query processing performance of a variety of compression algorithms, for different compression granularities, on a set of relations drawn from real world databases. Our study shows that attribute level compression is the best from a query processing perspective but has poor compression ratio. We then present a modified attribute level compression algorithm, based on non-adaptive arithmetic compression, called COLA, which simultaneously provides good query processing and reasonable compression ratios. We also analyze, for a range of relational queries, the performance benefits that COLA could be expected to provide.
An algebraic compression framework for query results
- In ICDE
, 2000
Cited by 13 (2 self)
Decision-support applications in emerging environments require that SQL query results or intermediate results be shipped to clients for further analysis and presentation. These clients may use low bandwidth connections or have severe storage restrictions. Consequently, there is a need to compress the results of a query for efficient transfer and client-side access. This paper explores a variety of techniques that address this issue. Instead of using a fixed method, we choose a combination of compression methods that use statistical and semantic information of the query results to enhance the effect of compression. To represent such a combination, we present a framework of "compression plans" formed by composing primitive compression operators. We also present optimization algorithms that enumerate valid compression plans and choose an optimal plan. Our experiments show that our techniques achieve significant performance improvement over standard compression tools like WinZip.
Query execution in column-oriented database systems
, 2008
Cited by 9 (2 self)
There are two obvious ways to map a two-dimension relational database table onto a one-dimensional storage interface: store the table row-by-row, or store the table column-by-column. Historically, database system implementations and research have focused on the row-by-row data layout, since it performs best on the most common application for database systems: business transactional data processing. However, there is a set of emerging applications for database systems for which the row-by-row layout performs poorly. These applications are more analytical in nature, whose goal is to read through the data to gain new insight and use it to drive decision making and planning. In this dissertation, we study the problem of poor performance of row-by-row data layout for these emerging

