Results 1 - 10
of
112
Oceanstore: An architecture for global-scale persistent storage
, 2000
"... OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cac ..."
Abstract
-
Cited by 847 (27 self)
- Add to MetaCart
OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cached anywhere, anytime. Additionally, monitoring of usage patterns allows adaptation to regional outages and denial of service attacks; monitoring also enhances performance through pro-active movement of data. A prototype implementation is currently under development. 1
Query evaluation techniques for large databases
- ACM COMPUTING SURVEYS
, 1993
"... Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On ..."
Abstract
-
Cited by 592 (7 self)
- Add to MetaCart
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate it: In order to manipulate large sets of complex objects as efficiently as today’s database systems manipulate simple records, query processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and post-relational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
The Implementation Of Postgres
- IEEE Transactions on Knowledge and Data Engineering
, 1990
"... Currently, POSTGRES is about 90,000 lines of code in C and is being used by assorted "bold and brave" early users. The system has been constructed by a team of 5 part time students led by a full time chief programmer over the last three years. During this period, we have made a large number of desig ..."
Abstract
-
Cited by 371 (22 self)
- Add to MetaCart
Currently, POSTGRES is about 90,000 lines of code in C and is being used by assorted "bold and brave" early users. The system has been constructed by a team of 5 part time students led by a full time chief programmer over the last three years. During this period, we have made a large number of design and implementation choices. Moreover, in some areas we would do things quite differently if we were to start from scratch again. The purpose of this paper is to reflect on the design and implementation decisions we made and to offer advice to implementors who might follow some of our paths. In this paper we restrict our attention to the DBMS "backend" functions. In another paper some of us treat PICASSO, the application development environment that is being built on top of POSTGRES. 1. INTRODUCTION Current relational DBMSs are oriented toward efficient support for business data processing applications where large numbers of instances of fixed format records must be stored and accessed. Tr...
Aries: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging
- ACM Transactions on Database Systems
, 1992
"... In this paper we present a simple and efficient method, called ARIES ( Algorithm for Recouery and Isolation Exploiting Semantics), which supports partial rollbacks of transactions, finegranularity (e.g., record) locking and recovery using write-ahead logging (WAL). We introduce the paradigm of repea ..."
Abstract
-
Cited by 277 (8 self)
- Add to MetaCart
In this paper we present a simple and efficient method, called ARIES ( Algorithm for Recouery and Isolation Exploiting Semantics), which supports partial rollbacks of transactions, finegranularity (e.g., record) locking and recovery using write-ahead logging (WAL). We introduce the paradigm of repeating history to redo all missing updates before performing the rollbacks of the loser transactions during restart after a system failure. ARIES uses a log sequence number in each page to correlate the state of a page with respect to logged updates of that page. All updates of a transaction are logged, including those performed during rollbacks. By appropriate chaining of the log records written during rollbacks to those written during forward progress, a bounded amount of logging is ensured during rollbacks even in the face of repeated failures during restart or of nested rollbacks We deal with a variety of features that are very Important in building and operating an industrial-strength transaction processing system ARIES supports fuzzy checkpoints, selective and deferred restart, fuzzy image copies, media recovery, and high concurrency lock modes (e. g., increment /decrement) which exploit the semantics of the operations and require the ability to perform operation logging. ARIES is flexible with respect to the kinds of buffer management policies that can be implemented. It supports objects of
A critique of ANSI SQL isolation levels
, 1995
"... Reads, and Phantoms. This paper shows that these phenomena and the ANSI SQL definitions fail to properly characterize several popular isolation levels, including the standard Ioeking implementations of the levels covered. Ambiguity in the statement of the phenomena is investigated and a more formal ..."
Abstract
-
Cited by 178 (4 self)
- Add to MetaCart
Reads, and Phantoms. This paper shows that these phenomena and the ANSI SQL definitions fail to properly characterize several popular isolation levels, including the standard Ioeking implementations of the levels covered. Ambiguity in the statement of the phenomena is investigated and a more formal statement is arrived at; in addition new phenomena that better characterize isolation types are introduced. Finally, an important multiversion isolation type, called Snapshot Isolation, is defined. 1.
Trio: a system for integrated management of data, accuracy, and lineage
- PRESENTED AT CIDR 2005
, 2005
"... Trio is a new database system that manages not only data, butalsotheaccuracy and lineage of the data. Inexact (uncertain, probabilistic, fuzzy, approximate, incomplete, and imprecise!) databases have been proposed in the past, and the lineage problem also has been studied. The goals of the Trio proj ..."
Abstract
-
Cited by 174 (11 self)
- Add to MetaCart
Trio is a new database system that manages not only data, butalsotheaccuracy and lineage of the data. Inexact (uncertain, probabilistic, fuzzy, approximate, incomplete, and imprecise!) databases have been proposed in the past, and the lineage problem also has been studied. The goals of the Trio project are to combine and distill previous work into a simple and usable model, design a query language as an understandable extension to SQL, and most importantly build a working system—a system that augments conventional data management with both accuracy and lineage as an integral part of the data. This paper provides numerous motivating applications for Trio and lays out preliminary plans for the data model, query language, and prototype system.
Deciding when to forget in the Elephant file system
- 17TH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES (SOSP ’99), PUBLISHED AS OPERATING SYSTEMS REVIEW, 34(5):110–123, DEC. 1999
, 1999
"... Modern file systems associate the deletion of a file with the immediate release of storage, and file writes with the irrevocable change of file contents. We argue that this behavior is a relic of the past, when disk storage was a scarce resource. Today, large cheap disks make it possible for the fil ..."
Abstract
-
Cited by 160 (5 self)
- Add to MetaCart
Modern file systems associate the deletion of a file with the immediate release of storage, and file writes with the irrevocable change of file contents. We argue that this behavior is a relic of the past, when disk storage was a scarce resource. Today, large cheap disks make it possible for the file system to protect valuable data from accidental delete or overwrite. This paper describes the design, implementation, and performance of the Elephant file system, which automatically retains all important versions of user files. Users name previous file versions by combining a traditional pathname with a time when the desired version of a file or directory existed. Storage in Elephant is managed by the system using filegrain user-specified retention policies. This approach contrasts with checkpointing file systems such as Plan-9, AFS, and WAFL that periodically generate efficient checkpoints of entire file systems and thus restrict retention to be guided by a single policy for all files within that file system. Elephant is implemented as a new Virtual File System in the FreeBSD kernel.
Pond: the OceanStore Prototype
, 2003
"... OceanStore is an Internet-scale, persistent data store designed for incremental scalability, secure sharing, and long-term durability. Pond is the OceanStore prototype; it contains many of the features of a complete system including location-independent routing, Byzantine update commitment, push-bas ..."
Abstract
-
Cited by 158 (14 self)
- Add to MetaCart
OceanStore is an Internet-scale, persistent data store designed for incremental scalability, secure sharing, and long-term durability. Pond is the OceanStore prototype; it contains many of the features of a complete system including location-independent routing, Byzantine update commitment, push-based update of cached copies through an overlay multicast network, and continuous archiving to erasure-coded form. In the wide area, Pond outperforms NFS by up to a factor of 4.6 on readintensive phases of the Andrew benchmark, but underperforms NFS by as much as a factor of 7.3 on writeintensive phases. Microbenchmarks show that write performance is limited by the speed of erasure coding and threshold signature generation, two important areas of future research. Further microbenchmarks show that Pond manages replica consistency in a bandwidthefficient manner and quantify the latency cost imposed by this bandwidth savings.
Temporal and Real-Time Databases: A Survey
- IEEE Transactions on Knowledge and Data Engineering
, 1995
"... A temporal database contains time-varying data. In a real-time database transactions have deadlines or timing constraints. In this paper we review the substantial research in these two heretofore separate research areas. We first characterize the time domain, then investigate temporal and real-time ..."
Abstract
-
Cited by 154 (9 self)
- Add to MetaCart
A temporal database contains time-varying data. In a real-time database transactions have deadlines or timing constraints. In this paper we review the substantial research in these two heretofore separate research areas. We first characterize the time domain, then investigate temporal and real-time data models. We evaluate temporal and real-time query languages along several dimensions. Temporal and real-time DBMS implementation is examined. We conclude with a summary of the major accomplishments of the research to date, and list several research questions that should be addressed next. Keywords: object-oriented database, relational databases, query language, temporal data model, time-constrained database, transaction time, user-defined time, valid time 1 Introduction Time is an important aspect of all real-world phenomena. Events occur at specific points in time; objects and the relationships among objects exist over time. The ability to model this temporal dimension of the real worl...

