Layering in Provenance Systems, 2009
Cited by 14 (6 self)
Digital provenance describes the ancestry or history of a digital object. Most existing provenance systems, however, operate at only one level of abstraction: the system call layer, a workflow specification, or the high-level constructs of a particular application. The provenance collectable in each of these layers is different, and all of it can be important. Single-layer systems fail to account for the different levels of abstraction at which users need to reason about their data and processes. These systems cannot integrate data provenance across layers and cannot answer questions that require an integrated view of the provenance. We have designed a provenance collection structure facilitating the integration of provenance across multiple levels of abstraction, including a workflow engine, a web browser, and an initial runtime Python provenance tracking wrapper. We layer these components atop provenance-aware network storage (NFS) that builds upon a Provenance-Aware Storage System (PASS). We discuss the challenges of building systems that integrate provenance across multiple layers of abstraction, present how we augmented systems in each layer to integrate provenance, and present use cases that demonstrate how provenance spanning multiple layers provides functionality not available in existing systems. Our evaluation shows that the overheads imposed by layering provenance systems are reasonable.
Provenance for the Cloud
Cited by 5 (0 self)
The cloud is poised to become the next computing environment for both data storage and computation due to its pay-as-you-go and provision-as-you-go models. Cloud storage is already being used to back up desktop user data, host shared scientific data, store web application data, and to serve web pages. Today’s cloud stores, however, are missing an important ingredient: provenance. Provenance is metadata that describes the history of an object. We make the case that provenance is crucial for data stored on the cloud and identify the properties of provenance that enable its utility. We then examine current cloud offerings and design and implement three protocols for maintaining data/provenance in current cloud stores. The protocols represent different points in the design space and satisfy different subsets of the provenance properties. Our evaluation indicates that the overheads of all three protocols are comparable to each other and reasonable in absolute terms. Thus, one can select a protocol based upon the properties it provides without sacrificing performance. While it is feasible to provide provenance as a layer on top of today’s cloud offerings, we conclude by presenting the case for incorporating provenance as a core cloud feature, discussing the issues in doing so.
Bridging Workflow and Data Provenance using Strong Links
Cited by 2 (0 self)
As scientists continue to migrate their work to computational methods, it is important to track not only the steps involved in the computation but also the data consumed and produced. While this provenance information can be captured, in existing approaches, it often contains only weak references between data and provenance. When data files or provenance are moved or modified, it can be difficult to find the data associated with the provenance or to find the provenance associated with the data. We propose a persistent storage mechanism that manages input, intermediate, and output data files, strengthening the links between provenance and data. This mechanism provides better support for reproducibility because it ensures the data referenced in provenance information can be readily located. Another important benefit of such management is that it allows caching of intermediate data which can then be shared with other users. We present an implemented infrastructure for managing data in a provenance-aware manner and demonstrate its application in scientific projects.
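The abstract does not say how the links are made strong. As one possible illustration, the sketch below assumes content hashing: each file is stored under the SHA-256 digest of its bytes, and provenance records refer to those digests rather than to file paths, so moving or renaming a file cannot break the reference. The names `ContentStore`, `run_step`, and the record fields are hypothetical and not taken from the paper.

```python
import hashlib


class ContentStore:
    """Hypothetical in-memory store that keys each blob by its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The identifier depends only on the content, not on a file path.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


def run_step(store, name, func, input_ids, log):
    """Run one workflow step and record provenance that points at content digests."""
    inputs = [store.get(i) for i in input_ids]
    output_id = store.put(func(*inputs))
    # Assumed record layout: step name, input digests, output digest.
    log.append({"step": name, "inputs": list(input_ids), "output": output_id})
    return output_id


store = ContentStore()
log = []
raw_id = store.put(b"abc")
upper_id = run_step(store, "upper", lambda d: d.upper(), [raw_id], log)
```

Because identical content always maps to the same digest, a store of this kind could also serve as a cache of intermediate results, which is the sharing benefit the abstract mentions.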
Towards a data-centric view of cloud security
- in Proceedings of the second international workshop on Cloud data management, ser. CloudDB ’10
Cited by 1 (0 self)
Cloud security issues have recently gained traction in the research community, with much of the focus primarily concentrated on securing the operating systems and virtual machines on which the services are deployed. In this paper, we take an alternative perspective and propose a data-centric view of cloud security. In particular, we explore the security properties of secure data sharing between applications hosted in the cloud. We discuss data management challenges in the areas of secure distributed query processing, system analysis and forensics, and query correctness assurance, and describe our current efforts towards meeting these challenges using our Declarative Secure Distributed Systems (DS2) platform.
Securing Provenance-based Audits
Cited by 1 (1 self)
Given the significant increase of on-line services that require personal information from users, the risk that such information is misused has become an important concern. In such a context, information accountability is desirable since it allows users (and society in general) to decide, by means of audits, whether information is used appropriately. To ensure information accountability, information flow should be made transparent. It has been argued that data provenance can be used as the mechanism to underpin such transparency. Under these conditions, an audit’s quality depends on the quality of the captured provenance information. The integrity of provenance information thus emerges as a decisive issue in the quality of a provenance-based audit. The aim of this paper is to secure provenance-based audits by the inclusion of cryptographic elements in the communication between the involved entities as well as in the provenance representation. This paper also presents a formalisation and an automatic verification of a set of security properties that increase the level of trust in provenance-based audit results.
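The abstract does not name the cryptographic elements it uses. To illustrate why integrity matters for an audit, the sketch below assumes a simple hash chain, which is a swapped-in technique and not the paper's scheme: each provenance record stores the SHA-256 digest of its payload combined with the previous record's digest, so altering an earlier record invalidates the chain. The function names and record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder digest for the first record


def _digest(prev: str, payload: dict) -> str:
    # Serialize with sorted keys so the same payload always hashes identically.
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> None:
    """Append a provenance record linked to the digest of the previous one."""
    prev = chain[-1]["digest"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "digest": _digest(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every digest; any edited payload or broken link fails the check."""
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev or record["digest"] != _digest(prev, record["payload"]):
            return False
        prev = record["digest"]
    return True


chain = []
append_record(chain, {"actor": "service_a", "action": "read", "item": "record_17"})
append_record(chain, {"actor": "service_b", "action": "forward", "item": "record_17"})
intact = verify(chain)

chain[0]["payload"]["actor"] = "someone_else"  # simulate tampering
tampered_detected = not verify(chain)
```

An unkeyed chain like this only exposes tampering if the latest digest is kept somewhere the attacker cannot rewrite; a real deployment would more likely sign records with keys held by the involved entities.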
Towards a Secure and Efficient System for End-to-End Provenance
- Appears in the Proceedings of the Second USENIX Workshop on Theory and Practice of Provenance (TaPP 2010), 2010
Cited by 1 (0 self)
Work on the End-to-End Provenance System (EEPS) began in the late summer of 2009. The EEPS effort seeks to explore the three central questions in provenance systems: (1) “Where and how do I design secure host-level provenance collecting instruments (called provenance monitors)?”; (2) “How do I extend completeness and accuracy guarantees to distributed systems and computations?”; and (3) “What are the costs associated with provenance collection?” This position paper discusses our initial exploration into these issues and posits several challenges to the realization of the EEPS vision.
Garm: Cross Application Data Provenance and Policy Enforcement
Cited by 1 (0 self)
We present Garm, a new tool for tracing data provenance and enforcing data access policies with arbitrary binaries. Users can use Garm to attach access policies to data and Garm ensures that all accesses to the data (and derived data) across all applications and executions are consistent with the policy. Garm uses a staged analysis that combines a static analysis with a dynamic analysis to trace the provenance of an application’s state and the policies that apply to this state. The implementation monitors the interactions of the application with the underlying operating system to enforce policies. Conceptually, Garm combines trusted computing support from the underlying operating system with a stream cipher to ensure that data protected by an access policy cannot be accessed outside of Garm’s policy enforcement mechanisms. We have evaluated Garm with several common Linux applications. We found that Garm can successfully trace the provenance of data across executions of multiple applications and enforce data access policies on the application’s executions.
Provenance of Decisions in Emergency Response Environments
Cited by 1 (0 self)
Mitigating the devastating ramifications of major disasters requires emergency workers to respond in a maximally efficient way. Information systems can improve their efficiency by organizing their efforts and automating many of their decisions. However, when the system does not document how decisions were made, those decisions cannot be reviewed to check the reasons behind them or their compliance with policies. We apply the concept of provenance to decision making in emergency response situations and use the Open Provenance Model to express provenance produced in RoboCup Rescue Simulation. We produce provenance DAGs using a novel OPM profile that conceptualizes decisions in the context of emergency response. Finally, we traverse the OPM DAGs to answer some provenance questions about those decisions.
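The traversal step can be sketched as a plain graph walk. The example below assumes the provenance DAG is given as (effect, relation, cause) triples that use OPM edge names such as `used`, `wasGeneratedBy`, and `wasControlledBy`; the node identifiers and the helper names `build_graph` and `ancestors` are hypothetical and do not come from the paper's OPM profile.

```python
from collections import defaultdict


def build_graph(edges):
    """Index (effect, relation, cause) triples by effect for backward traversal."""
    causes = defaultdict(list)
    for effect, relation, cause in edges:
        causes[effect].append((relation, cause))
    return causes


def ancestors(causes, node):
    """Return every node that the given node causally depends on."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        # .get avoids inserting empty entries for nodes without causes.
        for _relation, cause in causes.get(current, []):
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


# Hypothetical provenance of one decision in a rescue simulation.
edges = [
    ("decision:dispatch", "wasGeneratedBy", "process:choose_target"),
    ("process:choose_target", "used", "artifact:fire_report"),
    ("process:choose_target", "wasControlledBy", "agent:fire_brigade_1"),
    ("artifact:fire_report", "wasGeneratedBy", "process:sense"),
]
graph = build_graph(edges)
lineage = ancestors(graph, "decision:dispatch")
```

A question such as "which reports and agents influenced this dispatch decision?" then reduces to filtering `lineage` by node type.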
Towards Semantics for Provenance Security
Provenance records the history of data. Careless use of provenance may violate the security policies of data. Moreover, the provenance itself may be sensitive information, necessitating restrictions on the use of both data and provenance to enforce security requirements. This paper proposes extensional semantic definitions for provenance security. The semantic definitions require that provenance information released to the user does not reveal confidential data, and that neither the provenance information given to the user, nor the program’s output, reveal sensitive provenance information.
Trustworthy Information: Concepts and Mechanisms
We are used to treating information received (from recognized sources) as trustworthy, an assumption that unfortunately does not hold because of attacks. The situation can get worse with the emerging shift of the information sharing paradigm from “need to know” to “need to share.” In order to help information consumers make the “best” decision possible, it is imperative to formulate concepts, models, frameworks, architectures, and mechanisms to facilitate information trustworthiness management in distributed and decentralized environments. In this paper we initiate a study in this direction by proposing an abstraction called information networks as well as two supporting mechanisms called provenance digital signatures and optimal security hardening of information networks.

