Efficient Querying and Maintenance of Network Provenance at Internet-Scale
Cited by 17 (10 self)
Network accountability, forensic analysis, and failure diagnosis are becoming increasingly important for network management and security. Such capabilities often utilize network provenance – the ability to issue queries over network meta-data. For example, network provenance may be used to trace the path a message traverses on the network as well as to determine how message data were derived and which parties were involved in its derivation. This paper presents the design and implementation of ExSPAN, a generic and extensible framework that achieves efficient network provenance in a distributed environment. We utilize the database notion of data provenance to “explain” the existence of any network state, providing a versatile mechanism for network provenance. To achieve such flexibility at Internet-scale, ExSPAN uses declarative networking in which network protocols can be modeled as continuous queries over distributed streams and specified concisely in a declarative query language. We extend existing data models for provenance developed in database literature to enable distribution at Internet-scale, and investigate numerous optimization techniques to maintain and query distributed network provenance efficiently. The ExSPAN prototype is developed using RapidNet, a declarative networking platform based on the emerging ns-3 toolkit. Experiments over a simulated network and an actual deployment in a testbed environment demonstrate that our system supports a wide range of distributed provenance computations efficiently, resulting in significant reductions in bandwidth costs compared to traditional approaches.
Querying data provenance
In SIGMOD, 2010
Cited by 11 (5 self)
Many advanced data management operations (e.g., incremental maintenance,
The Perm Provenance Management System in Action
In this demonstration we present the Perm provenance management system (PMS). Perm is capable of computing, storing and querying provenance information for the relational data model. Provenance is computed by using query rewriting techniques to annotate tuples with provenance information. Thus, provenance data and provenance computations are represented as relational data and queries and, hence, can be queried, stored and optimized using standard relational database techniques. This demo shows the complete Perm system and lets attendees examine in detail the process of query rewriting and provenance retrieval in Perm, the most complete data provenance system available today. For example, Perm supports lazy and eager provenance computation, external provenance and various contribution semantics for an almost complete subset of SQL.
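The idea of annotating result tuples with provenance can be illustrated with a minimal Python sketch. It is not Perm's rewriting algorithm or API: the relations, the `join_with_provenance` helper, and the output layout are all hypothetical, chosen only to show a query result that carries the input tuples it was derived from.

```python
# Hypothetical input relations; names and data are illustrative only.
employees = [("alice", 10), ("bob", 20)]
departments = [(10, "storage"), (20, "networks")]


def join_with_provenance(left, right):
    """Join on left[1] == right[0] and attach the contributing input tuples."""
    results = []
    for l in left:
        for r in right:
            if l[1] == r[0]:
                # The normal result row, plus the inputs that produced it.
                results.append({"row": (l[0], r[1]), "provenance": (l, r)})
    return results


annotated = join_with_provenance(employees, departments)
for entry in annotated:
    print(entry["row"], "derived from", entry["provenance"])
```

Because the annotations are ordinary data attached to each row, they can be filtered or stored like any other column, which is the property the abstract attributes to representing provenance relationally.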
TRAMP: Understanding the Behavior of Schema Mappings through Provenance
Though partially automated, developing schema mappings remains a complex and potentially error-prone task. In this paper, we present TRAMP (TRAnsformation Mapping Provenance), an extensive suite of tools supporting the debugging and tracing of schema mappings and transformation queries. TRAMP combines and extends data provenance with two novel notions, transformation provenance and mapping provenance, to explain the relationship between transformed data and those transformations and mappings that produced that data. In addition, we provide query support for transformations, data, and all forms of provenance. We formally define transformation and mapping provenance, present an efficient implementation of both forms of provenance, and evaluate the resulting system through extensive experiments.
Declarative Secure Distributed Systems
2010
In the past decade, distributed systems have rapidly evolved and gained significant traction in the research community, with an increasing interest concentrated on developing and analyzing secure distributed systems. In this paper, we present DS2 (Declarative Secure Distributed Systems), a unified platform for specifying, implementing, and analyzing large-scale secure distributed systems. First, we propose the Secure Network Datalog (SeNDlog) language that enables distributed systems and their security policies to be specified and implemented within the same declarative framework. We show that the existing semi-naïve evaluation can be extended to execute SeNDlog programs that incorporate authenticated communication among untrusted nodes. Second, we demonstrate that network provenance – the metadata that explains the derivation of network state – can be naturally and concisely captured within the DS2 system. We extend existing data models for provenance to enable distribution at Internet-scale, and present techniques for efficient and customizable maintenance and querying of network provenance. Finally, we present future research plans on secure provenance and its integration with legacy applications for discussion.
Enterprise Information Extraction SIGMOD 2010 Tutorial
"... – Code shipping with 8 IBM products ..."
Optimized Rollback and Re-computation
Large data processing tasks can be effected using workflow management systems. When either the input data or the programs in the pipeline are modified, the workflow must be re-executed to ensure that the final output data is updated to reflect the changes. Since such re-computation can consume substantial resources, optimizing the system to avoid redundant computation is desirable. In the case of a workflow, the dependency relationships between files are specified at the outset and can be leveraged to track which programs need to be re-executed when particular files change. Current distributed systems cannot provide such functionality when no predefined workflows exist. In this paper, we present an architecture that provides functionality to produce both correct output and fast re-execution by leveraging the provenance of data to propagate changes along an implicit dependency graph. We explore the tradeoff between storage and availability by presenting a performance analysis of our rollback and re-execution scheme.
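Propagating a change along an implicit dependency graph can be sketched in a few lines of Python. This is an illustrative assumption, not the architecture from the paper: the provenance record format, the file names, and the `stale_outputs` helper are hypothetical.

```python
from collections import defaultdict, deque


def stale_outputs(provenance, changed):
    """Return every output that transitively depends on the changed file.

    provenance: list of (output_file, [input_files]) records (assumed format).
    """
    # Invert the records: for each input, which outputs were derived from it?
    dependents = defaultdict(set)
    for output, inputs in provenance:
        for source in inputs:
            dependents[source].add(output)

    # Breadth-first walk from the changed file over the dependency edges.
    stale, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for out in dependents[current]:
            if out not in stale:
                stale.add(out)
                queue.append(out)
    return stale


# Hypothetical provenance records gathered while programs ran.
records = [
    ("clean.csv", ["raw.csv"]),
    ("report.pdf", ["clean.csv", "style.tex"]),
    ("logo.png", ["logo.svg"]),
]
affected = stale_outputs(records, "raw.csv")
print(sorted(affected))
```

Only the outputs reachable from the changed file are marked for re-execution, so unrelated results such as `logo.png` in this example are left alone.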

