Results 1 - 10 of 21
Logic and databases: a deductive approach
ACM Computing Surveys, 1984
Cited by 130 (2 self)
The purpose of this paper is to show that logic provides a convenient formalism for studying classical database problems. There are two main parts to the paper, devoted respectively to conventional databases and deductive databases. In the first part, we focus on query languages, integrity modeling and maintenance, query optimization, and data ...
Embedded Inodes and Explicit Grouping: Exploiting Disk Bandwidth for Small Files
In Proceedings of the 1997 USENIX Technical Conference, 1997
Cited by 92 (14 self)
Small file performance in most file systems is limited by slowly improving disk access times, even though current file systems improve on-disk locality by allocating related data objects in the same general region. The key insight for why current file systems perform poorly is that locality is insufficient --- exploiting disk bandwidth for small data objects requires that they be placed adjacently. We describe C-FFS (Co-locating Fast File System), which introduces two techniques, embedded inodes and explicit grouping, for exploiting what disks do well (bulk data movement) to avoid what they do poorly (reposition to new locations). With embedded inodes, the inodes for most files are stored in the directory with the corresponding name, removing a physical level of indirection without sacrificing the logical level of indirection. With explicit grouping, the data blocks of multiple small files named by a given directory are allocated adjacently and moved to and from the disk as a unit in ...
Metadata Update Performance in File Systems
In Proceedings of the 1st Symposium on Operating Systems Design and Implementation (OSDI ’94), 1994
Cited by 89 (12 self)
Structural changes, such as file creation and block allocation, have consistently been identified as file system performance problems in many user environments. We compare several implementations that maintain metadata integrity in the event of a system failure but do not require changes to the on-disk structures. In one set of schemes, the file system uses asynchronous writes and passes ordering requirements to the disk scheduler. These scheduler-enforced ordering schemes outperform the conventional approach (synchronous writes) by more than 30 percent for metadata update intensive benchmarks, but are suboptimal mainly due to their inability to safely use delayed writes when ordering is required. We therefore introduce soft updates, an implementation that asymptotically approaches memory-based file system performance (within 5 percent) while providing stronger integrity and security guarantees than most UNIX file systems. For metadata update intensive benchmarks, this improves performance by more than a factor of two when compared to the conventional approach.
Physical database design for relational databases
ACM Transactions on Database Systems, 1988
Cited by 71 (0 self)
This paper describes the concepts used in the implementation of DBDSGN, an experimental physical design tool for relational databases developed at the IBM San Jose Research Laboratory. Given a workload for System R (consisting of a set of SQL statements and their execution frequencies), DBDSGN suggests physical configurations for efficient performance. Each configuration consists of a set of indices and an ordering for each table. Workload statements are evaluated only for atomic configurations of indices, which have only one index per table. Costs for any configuration can be obtained from those of the atomic configurations. DBDSGN uses information supplied by the System R optimizer both to determine which columns might be worth indexing and to obtain estimates of the cost of executing statements in different configurations. The tool finds efficient solutions to the index-selection problem; if we assume the cost estimates supplied by the optimizer are the actual execution costs, it finds the optimal solution. Optionally, heuristics can be used to reduce execution time. The approach taken by DBDSGN in solving the index-selection problem for multiple-table statements significantly reduces the complexity of the problem. DBDSGN’s principles were used in the Relational Design Tool (RDT), an IBM product based on DBDSGN, which performs design for SQL/DS, a relational system based on System R. System R actually uses DBDSGN’s suggested solutions as the tool expects because cost estimates and other necessary information can be obtained from System R using a new SQL statement, the EXPLAIN statement. This illustrates how a system can export a model of its internal assumptions and behavior so that other systems (such as tools) can share this model.
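The abstract above says that the cost of any configuration can be obtained from the costs of atomic configurations, but it does not give the rule. The sketch below illustrates one plausible reading and is not DBDSGN's actual procedure: it assumes a statement's cost under a configuration is the minimum cost over the atomic configurations drawn from it, and it allows "no index" as a per-table choice. The function names and data layout are hypothetical, and index maintenance costs for updates are ignored.

```python
from itertools import product


def atomic_configurations(config):
    """Yield every atomic configuration contained in `config`.

    `config` maps a table name to a set of candidate index names.
    An atomic configuration picks at most one index per table
    (None means "no index on this table"; that choice is an assumption).
    """
    tables = sorted(config)
    choices = [[None] + sorted(config[t]) for t in tables]
    for combo in product(*choices):
        yield frozenset((t, i) for t, i in zip(tables, combo) if i is not None)


def statement_cost(config, atomic_costs):
    """Assumed rule: take the cheapest atomic configuration within `config`."""
    return min(atomic_costs[a] for a in atomic_configurations(config))


def workload_cost(config, workload):
    """`workload` is a list of (execution frequency, atomic cost table) pairs."""
    return sum(freq * statement_cost(config, costs) for freq, costs in workload)


# Hypothetical optimizer estimates for one statement over two tables.
costs = {
    frozenset(): 100.0,
    frozenset({("orders", "ix_date")}): 40.0,
    frozenset({("orders", "ix_cust")}): 60.0,
    frozenset({("customers", "ix_id")}): 70.0,
    frozenset({("orders", "ix_date"), ("customers", "ix_id")}): 25.0,
    frozenset({("orders", "ix_cust"), ("customers", "ix_id")}): 30.0,
}
full = {"orders": {"ix_date", "ix_cust"}, "customers": {"ix_id"}}
best = statement_cost(full, costs)
```

Under this assumption, only the atomic configurations need to be costed by the optimizer; larger configurations are evaluated by lookup.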
A Transaction Mechanism for Engineering Design Databases
Proc. of the VLDB conference, 1984
Cited by 30 (2 self)
One primary difference between transactions in an engineering design environment and those in conventional business applications is that an engineering transaction typically lasts a much longer time. Existing proposals for supporting the long-lived engineering transactions are all based on the public/private database architecture, in which a transaction checks out design objects from the public database, modifies them, and checks them into the public database for use by other transactions. However, the design environment which these proposals model is a very rigid one which does not allow a team of designers to complete a complex design involving numerous design objects by passing incomplete objects back and forth among them in a controlled manner. In this paper we present a model of engineering transactions which attempts to resolve this shortcoming as well as satisfying the constraints imposed by the engineering design environment. The model augments existing models by refining the notion of checkout environment which a transaction sees and coupling it with the notion of nested transactions. The model is then extended to a practical mechanism for supporting a complex engineering design environment by imposing the view that a long-lived engineering transaction is really a sequence of conventional short-lived transactions.
Commit LSN: A Novel and Simple Method for Reducing Locking and Latching in Transaction Processing Systems
In Proc. 16th International Conference on Very Large Data Bases, 1990
Cited by 25 (5 self)
This paper presents a novel and simple method, called Commit_LSN, for determining if a piece of data is in the committed state in a transaction processing system. This method is a much cheaper alternative to the locking approach used by the prior art for this purpose. The method takes advantage of the concept of a log sequence number (LSN). In many systems, an LSN is recorded in each page of the data base to relate the state of the page to the log of update actions for that page. Our method uses information about the LSN of the first log record (call it Commit_LSN) of the oldest update transaction still executing in the system to infer that all the updates in pages with page_LSN less than Commit_LSN have been committed. This reduces locking and latching. In addition, the method may also increase the level of concurrency that could be supported. The Commit_LSN method makes it possible to use fine-granularity locking without unduly penalizing transactions which read numerous records. It also benefits update transactions by reducing the cost of fine-granularity locking when contention is not present for data on a page. We discuss in detail many applications of this method and illustrate its potential benefits for various environments. In order to apply the Commit_LSN method, extensions are also proposed for those systems in which (1) LSNs are not associated with pages (AS/400, SQL/DS, System R), (2) LSNs are used only partially (IMS), and/or (3) not all objects' changes are logged (AS/400, SQL/DS, System R).
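The central test in the abstract above (a page whose page_LSN is below Commit_LSN holds only committed data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Page` class, `read_record`, and the `lock_fn` callback are hypothetical, and the fallback to the next LSN when no update transaction is active is an assumption the abstract does not cover.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_lsn: int                      # LSN of the latest logged update to this page
    records: dict = field(default_factory=dict)


def commit_lsn(first_lsns_of_active_update_txns, next_lsn):
    """LSN of the first log record of the oldest update transaction still running.

    Assumption: with no active update transactions, fall back to the next
    LSN to be written, so every existing page counts as committed.
    """
    return min(first_lsns_of_active_update_txns, default=next_lsn)


def read_record(page, key, commit_lsn_value, lock_fn):
    """Return (value, lock_taken). Skip the lock when the page is known committed."""
    if page.page_lsn < commit_lsn_value:
        # Every update on this page precedes the oldest active update
        # transaction, so all data on the page is committed.
        return page.records[key], False
    # Otherwise fall back to conventional record locking (hypothetical callback).
    lock_fn(key)
    return page.records[key], True


locked = []
clsn = commit_lsn([120, 95, 300], next_lsn=400)
old_page = Page(page_lsn=80, records={"a": 1})
hot_page = Page(page_lsn=150, records={"b": 2})
r1 = read_record(old_page, "a", clsn, locked.append)
r2 = read_record(hot_page, "b", clsn, locked.append)
```

In this example only the read on `hot_page` takes a lock, because its page_LSN is not below the computed Commit_LSN value.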
On the Integration of Concurrency, Distribution and Persistence
1993
Cited by 23 (7 self)
The principal tenet of the persistence model is that it abstracts over all the physical properties of data such as how long it is stored, where it is stored, how it is stored, what form it is kept in and who is using it. Experience with programming systems which support orthogonal persistence has shown that the simpler semantics and reduced complexity can often lead to a significant reduction in software production costs. Persistent systems are relatively new and it is not yet clear which of the many models of concurrency and distribution best suit the persistence paradigm. Previous work in this area has tended to build one chosen model into the system which may then only be applicable to a particular set of problems. This thesis challenges the orthodoxy by designing a persistent framework in which all models of concurrency and distribution can be integrated in an add-on fashion. The provision of such a framework is complicated by a tension between the conceptual ideas of persistence...
Optimistic parallelism benefits from data partitioning
In Proc. 13th Int’l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2008
Cited by 20 (7 self)
Recent studies of irregular applications such as finite-element mesh generators and data-clustering codes have shown that these applications have a generalized data parallelism that arises from the use of iterative algorithms that perform computations on elements of worklists of various kinds. In some irregular applications, the computations on different elements are independent. In other applications, there may be complex patterns of dependences between these computations. The Galois system was designed to exploit this kind of irregular data parallelism on multicore processors. Its main features are (i) two kinds of set iterators for expressing worklist-based data parallelism, and (ii) a runtime system that performs optimistic parallelization of these iterators, detecting conflicts and rolling back computations as needed. Detection of conflicts and rolling back iterations requires information from class implementors.
Scalable and Recoverable Implementation of Object Evolution for the PJama Platform
In Persistent Object Systems (POS), 2000
Cited by 17 (2 self)
PJama is the latest version of an orthogonally persistent platform for Java. It depends on a new persistent object store, Sphere, and provides facilities for class evolution. This evolution technology supports an arbitrary set of changes to the classes, which may have arbitrarily large populations of persistent objects. We verify that the changes are safe. When there are format changes, we also convert all of the instances, while leaving their identities unchanged. We aspire to both very large persistent object stores and freedom for developers to specify arbitrary conversion methods in Java to convey information from old to new formats. Evolution operations must be safe and the evolution cost should be approximately linear in the number of objects that must be reformatted. In order that these conversion methods can be written easily, we continue to present the pre-evolution state consistently to Java executions throughout an evolution. At the completion of applying all of these tra...
Write-Optimized B-Trees
2004
Cited by 16 (5 self)
Large writes are beneficial both on individual disks and on disk arrays, e.g., RAID-5. The presented design enables large writes of internal B-tree nodes and leaves. It supports both in-place updates and large append-only (“log-structured”) write operations within the same storage volume, within the same B-tree, and even at the same time. The essence of the proposal is to make page migration inexpensive, to migrate pages while writing them, and to make such migration optional rather than mandatory as in log-structured file systems. The inexpensive page migration also aids traditional defragmentation as well as consolidation of free space needed for future large writes. These advantages are achieved with a very limited modification to conventional B-trees that also simplifies other B-tree operations, e.g., key range locking and compression. Prior proposals and prototypes implemented transacted B-tree on top of log-structured file systems and added transaction support to log-structured file systems. Instead, the presented design adds techniques and performance characteristics of log-structured file systems to traditional B-trees and their standard transaction support, notably without adding a layer of indirection for locating B-tree nodes on disk. The result retains fine-granularity locking, full transactional ACID guarantees, fast search performance, etc. expected of a modern B-tree implementation, yet adds efficient transacted page relocation and large, high-bandwidth writes.

