A Framework for Analysis of Data Quality Research
- IEEE Transactions on Knowledge and Data Engineering
, 1995
Cited by 70 (6 self)
Abstract: Organizational databases are pervaded with data of poor quality. However, there has not been an analysis of the data quality literature that provides an overall understanding of the state-of-the-art research in this area. Using an analogy between product manufacturing and data manufacturing, this paper develops a framework for analyzing data quality research, and uses it as the basis for organizing the data quality literature. This framework consists of seven elements: management responsibilities, operation and assurance costs, research and development, production, distribution, personnel management, and legal function. The analysis reveals that most research efforts focus on operation and assurance costs, research and development, and production of data products. Unexplored research topics and unresolved issues are identified and directions for future research provided. Index Terms: Data quality, data manufacturing, data product,
Are Quorums an Alternative for Data Replication
- ACM TRANSACTIONS ON DATABASE SYSTEMS
, 2003
Cited by 32 (10 self)
... this article, we analyze several quorum types in order to better understand their behavior in practice. The results obtained challenge many of the assumptions behind quorum-based replication. Our evaluation indicates that the conventional read-one/write-all-available approach is the best choice for a large range of applications requiring data replication. We believe this is an important result for anybody developing code for computing clusters, as the read-one/write-all-available strategy is much simpler to implement and more flexible than quorum-based approaches. In this article, we show that, in addition, it is also the best choice using a number of other selection criteria
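For readers unfamiliar with the two strategies being compared, the following is a minimal sketch of the textbook quorum intersection rules next to read-one/write-all-available. It is not the article's evaluation code; the function names and the returned dictionary layout are assumptions made for illustration.

```python
def is_valid_quorum_config(n: int, r: int, w: int) -> bool:
    """Check the classic intersection rules for n replicas.

    r is the read quorum size and w the write quorum size.
    Every read quorum must overlap every write quorum (r + w > n),
    and any two write quorums must overlap (2 * w > n).
    """
    in_range = 1 <= r <= n and 1 <= w <= n
    return in_range and r + w > n and 2 * w > n


def rowa_available_contacts(n_up: int) -> dict:
    """Replicas contacted under read-one/write-all-available.

    n_up is the number of replicas currently reachable (an assumed input).
    A read goes to any one live replica; a write goes to all live replicas.
    """
    return {"read": 1 if n_up > 0 else 0, "write": n_up}
```

With five replicas, `is_valid_quorum_config(5, 3, 3)` accepts a majority scheme, while `r = 2, w = 3` is rejected because a read quorum could miss the latest write.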
Performance Modeling of Distributed and Replicated Databases
, 2000
Cited by 27 (1 self)
This paper surveys performance models for distributed and replicated database systems. Over the last 20 years a variety of such performance models have been developed, and they differ in (1) which aspects of a real system are or are not captured in the model (e.g. replication, communication, non-uniform data access, etc.) and (2) how these aspects are modeled. We classify the different alternatives and modeling assumptions, and discuss their interdependencies and expressiveness for the representation of distributed databases. This leads to a set of building blocks for analytical performance models. To illustrate the work that is surveyed, we select a combination of these proven modeling concepts and give an example of how to compose a balanced analytical model of a replicated database. We use this example to show how to derive meaningful performance values and to discuss the applicability and expressiveness of performance models for distributed and replicated databases. Finally, we compare the analytical results to measurements in a distributed database system.
Light-Weight Currency Management Mechanisms in Mobile and Weakly-Connected Environments
- In Proc. 10th IEEE Workshop on Research Issues in Data Engineering (RIDE
, 2001
Cited by 5 (1 self)
This paper discusses the currency management mechanisms used in Deno, a replicated object storage system designed for use in mobile and weakly-connected environments. Deno primarily differs from previous work in implementing an asynchronous weighted-voting scheme via epidemic information flow, and in committing updates in an entirely decentralized fashion, without requiring any server to have complete knowledge of system membership.
Two Approaches for High Concurrency in Multicast-Based Object Replication
, 1994
Cited by 5 (0 self)
This report presents a replica control protocol for atomic objects. The protocol is derived from an atomic broadcast primitive, and places constraints on the delivery of messages to provide a consistent message order among sites. Several heuristic techniques are proposed to reduce the latency of message delivery, for two types of orders. Messages are delivered either in the same order for all sites, or in an order semantically equivalent to this unique ordering. The equivalence relation is based on the commutativity property of operations on objects, e.g., two deposit operations commute. The protocol uses a reliable causal multicast primitive, and is fully distributed. The first set of heuristics is based on a voting scheme, and delivers messages in a unique order. Totally ordered atomic multicast can be built on top of a reliable causal multicast by waiting until each processor in the group has multicast a message, inserting them in a causal graph, and then delivering the roots of this...
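The commutativity idea mentioned in this abstract can be illustrated with a small sketch. This is not the report's protocol: the account model, the `withdraw` refusal rule, and the `commute` helper are all hypothetical, and the check only covers one given initial state rather than proving commutativity in general.

```python
from itertools import permutations


def deposit(amount):
    """Return an operation that adds `amount` to a balance."""
    return lambda balance: balance + amount


def withdraw(amount):
    """Return an operation that subtracts `amount` from a balance.

    Assumed rule for illustration: the withdrawal is refused
    (the balance is left unchanged) when funds are insufficient.
    """
    return lambda balance: balance - amount if balance >= amount else balance


def commute(ops, initial):
    """True if every delivery order of `ops` gives the same final state.

    Only the single starting state `initial` is tested.
    """
    finals = set()
    for order in permutations(ops):
        state = initial
        for op in order:
            state = op(state)
        finals.add(state)
    return len(finals) == 1
```

Two deposits yield the same balance in either order, so sites may deliver them in different orders. A deposit and a withdrawal starting from an empty account do not, because the withdrawal is refused when it arrives first.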
Service-Constrained Network Design Problems
, 1996
Cited by 4 (1 self)
Several practical instances of network design problems often require the network to satisfy multiple constraints. In this paper, we focus on the following problem (and its variants): find a low-cost network, under one cost function, that services every node in the graph, under another cost function (i.e., every node of the graph is within a prespecified distance from the network). This study has important applications to the problems of optical network design and the efficient maintenance of distributed databases. We utilize the framework developed in Marathe et al. [1995] to formulate these problems as bicriteria network design problems, and present approximation algorithms for a class of service-constrained network design problems. Key words: Approximation algorithms, Bicriteria problems, Spanning trees, Network design, Combinatorial algorithms. CR Classification: G.2.2. 1. Introduction and Motivation The problem of managing replicated copies of a data in a distributed datab...
Modeling Replica Availability in Large Data Grids
- JOURNAL OF GRID COMPUTING
, 2003
Cited by 4 (0 self)
Large Grid systems not only provide massive aggregated computing power but also an unprecedented amount of distributed storage space. Unfortunately, the dynamic behavior of the Grid, caused by varying resource availability, unpredictable data updates, and the impact of local site policies, makes it difficult to exploit the full capabilities of Data Grids. We present
A Fully Distributed Quorum Consensus Method with High Fault-Tolerance and Low Communication Overhead
- Theoretical Computer Science
, 1997
Cited by 3 (1 self)
The main objective of data replication in a distributed database system is to provide high data availability for transaction processing. Quorum consensus (QC) methods are commonly applied to managing replicated data. In this paper, we present a new quorum consensus method. The proposed QC method is highly fault-tolerant and fully distributed (i.e., each site in a distributed system is equally weighted). Further, we can show that the proposed QC method has a low message overhead: 1) In the best case, each transaction operation process needs only to communicate with Ω(√n) remote sites to get permission (n is the number of sites storing replicated copies of the data item being manipulated). 2) In the worst case, each transaction operation process may be forced to communicate with Ω(√n log n) remote sites due to site failures. We also compare our method with the existing QC methods.
Highly Available Replicated Atomic Data
, 1994
1.1 Acknowledgments
2 Introduction
2.1 High available replicated atomic data
2.2 Contributions
2.3 Organization
3 Related work
3.1 Replica control protocols
3.2 Process group communication
3.3 Replicated databases and Atomic data types
4 Group communication
4.1 System model
4.2 Reliable communication ...
Abstract. This paper discusses the currency management mechanisms used in Deno, an object replication system designed for use in mobile and weakly-connected environments. Deno primarily differs from previous work in implementing an asynchronous weighted-voting scheme via epidemic information flow, and in committing updates in an entirely decentralized fashion, without requiring any server to have complete knowledge of system membership. We first give an overview of Deno, discussing its voting scheme, proxy mechanism, basic API, and commit performance. We then focus on the issue of currency management. Although there has been much work on currency management in synchronous, strongly-connected environments, this issue has not been explored in asynchronous, weakly-connected environments. We present currency management mechanisms, based on peer-to-peer currency exchanges, that enable light-weight replica creation, retirement, and currency redistribution while maintaining the correctness of the underlying consistency protocol. We also demonstrate that peer-to-peer currency exchanges can be used to exponentially converge to arbitrary target currency distributions, without the need for any server to have global system information. Keywords: management epidemic algorithms, replicated data consistency, mobile and weakly-connected systems, weight
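The convergence claim in this abstract can be made concrete with a toy simulation. This is only a sketch under stated assumptions and is not Deno's actual mechanism: the exchange rule (two peers pool their currency and split it in proportion to their targets), the random pairing, and all names below are hypothetical.

```python
import random


def exchange(currency, target, i, j):
    """Pairwise exchange between peers i and j (assumed rule).

    The two peers pool their currency and split it in proportion
    to their target shares. Total currency is conserved, and no
    peer needs to know about anyone but its partner.
    """
    pool = currency[i] + currency[j]
    share_i = target[i] / (target[i] + target[j])
    currency[i] = pool * share_i
    currency[j] = pool - currency[i]


def simulate(currency, target, rounds, seed=0):
    """Run `rounds` random pairwise exchanges and return the new holdings.

    Targets must be positive and sum to the same total as `currency`.
    """
    rng = random.Random(seed)
    holdings = list(currency)
    for _ in range(rounds):
        i, j = rng.sample(range(len(holdings)), 2)
        exchange(holdings, target, i, j)
    return holdings
```

Starting with all currency at one replica, for example `simulate([1.0, 0.0, 0.0, 0.0], [0.4, 0.3, 0.2, 0.1], rounds=200)`, the holdings approach the target distribution while each step uses only information local to the two peers involved.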

