Models and issues in data stream systems
In PODS, 2002
Cited by 519 (18 self)
In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work relevant to data stream systems and current projects in the area, the paper explores topics in stream query languages, new requirements and challenges in query processing, and algorithmic issues.
NiagaraCQ: A Scalable Continuous Query System for Internet Databases
In SIGMOD, 2000
Cited by 441 (7 self)
Continuous queries are persistent queries that allow users to receive new results when they become available. While continuous query systems can transform a passive web into an active environment, they need to be able to support millions of queries due to the scale of the Internet. No existing systems have achieved this level of scalability. NiagaraCQ addresses this problem by grouping continuous queries based on the observation that many web queries share similar structures. Grouped queries can share the common computation, tend to fit in memory and can reduce the I/O cost significantly. Furthermore, grouping on selection predicates can eliminate a large number of unnecessary query invocations. Our grouping technique is distinguished from previous group optimization approaches in the following ways. First, we use an incremental group optimization strategy with dynamic re-grouping. New queries are added to existing query groups, without having to regroup already installed queries. Second, we use a query-split scheme that requires minimal changes to a general-purpose query engine. Third, NiagaraCQ groups both change-based and timer-based queries in a uniform way. To ensure that NiagaraCQ is scalable, we have also employed other techniques including incremental evaluation of continuous queries, use of both pull and push models for detecting heterogeneous data source changes, and memory caching. This paper presents the design of the NiagaraCQ system and gives some experimental results on the system’s performance and scalability.
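The grouping idea in this abstract can be illustrated with a minimal sketch. It is not NiagaraCQ's implementation: the class names, the `install` and `on_change` methods, and the restriction to predicates of the form `attribute = constant` are all assumptions made for illustration. Queries that share a predicate shape are kept in one group indexed by their constant, so a changed tuple is matched with one lookup per group rather than one invocation per query, and a new query is added to its group without regrouping the installed ones.

```python
from collections import defaultdict


class QueryGroup:
    """Hypothetical group of queries sharing the predicate shape `<attr> = <constant>`."""

    def __init__(self, attr):
        self.attr = attr
        # constant -> ids of the queries whose predicate uses that constant
        self.by_constant = defaultdict(list)

    def add(self, query_id, constant):
        # Incremental: the new query joins the existing group;
        # already installed queries are left untouched.
        self.by_constant[constant].append(query_id)

    def matches(self, record):
        # One shared lookup replaces evaluating every query's predicate.
        return list(self.by_constant.get(record.get(self.attr), []))


class GroupedQueries:
    """Hypothetical registry with one group per predicate attribute."""

    def __init__(self):
        self.groups = {}

    def install(self, query_id, attr, constant):
        group = self.groups.setdefault(attr, QueryGroup(attr))
        group.add(query_id, constant)

    def on_change(self, record):
        # Return only the queries that actually need to be invoked.
        hits = []
        for group in self.groups.values():
            hits.extend(group.matches(record))
        return sorted(hits)


registry = GroupedQueries()
registry.install("q1", "symbol", "INTC")
registry.install("q2", "symbol", "MSFT")
registry.install("q3", "symbol", "INTC")
print(registry.on_change({"symbol": "INTC", "price": 30}))
```

In this sketch a change to a tuple whose `symbol` matches no installed constant invokes no query at all, which is the effect the abstract attributes to grouping on selection predicates.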
Aurora: a new model and architecture for data stream management
2003
Cited by 237 (26 self)
This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications. Monitoring applications differ substantially from conventional business data processing. The fact that a software system must process and react to continual inputs from many sources (e.g., sensors) rather than from human operators requires one to rethink the fundamental architecture of a DBMS for this application area. In this paper, we present Aurora, a new DBMS currently under construction at Brandeis University, Brown University, and M.I.T. We first provide an overview of the basic Aurora model and architecture and then describe in detail a stream-oriented set of operators.
Continuous Queries over Data Streams
2004
Cited by 215 (8 self)
In many recent applications, data may take the form of continuous data streams, rather than finite stored data sets. Several aspects of data management need to be reconsidered in the presence of data streams, offering a new research direction for the database community. In this paper we focus primarily on the problem of query processing, specifically on how to define and evaluate continuous queries over data streams. We address semantic issues as well as efficiency concerns. Our main contributions are threefold. First, we specify a general and flexible architecture for query processing in the presence of data streams. Second, we use our basic architecture as a tool to clarify alternative semantics and processing techniques for continuous queries. The architecture also captures most previous work on continuous queries and data streams, as well as related concepts such as triggers and materialized views. Finally, we map out research topics in the area of query processing over data streams, showing where previous work is relevant and describing problems yet to be addressed.
Continual Queries for Internet Scale Event-Driven Information Delivery
IEEE Transactions on Knowledge and Data Engineering, 1999
Cited by 153 (13 self)
In this paper we introduce the concept of continual queries, describe the design of a distributed event-driven continual query system -- OpenCQ, and outline the initial implementation of OpenCQ on top of the distributed interoperable information mediation system DIOM [21, 19]. Continual queries are standing queries that monitor updates of interest and return results whenever the updates reach specified thresholds. In OpenCQ, users may specify to the system the information they would like to monitor (such as the events or the update thresholds they are interested in). Whenever the information of interest becomes available, the system immediately delivers it to the relevant users; otherwise, the system continually monitors the arrival of the desired information and pushes it to the relevant users as it meets the specified update thresholds. In contrast to conventional pull-based data management systems such as DBMSs and Web search engines, OpenCQ exhibits two important featu...
Composite event specification in active databases: Model and implementation
1992
Cited by 138 (4 self)
Active database systems require facilities to specify triggers that fire when specified events occur. We propose a language for specifying composite events as event expressions, formed using event operators and events (primitive or composite). An event expression maps an event history to another event history that contains only the events at which the event expression is “satisfied” and at which the trigger should fire. We present several examples illustrating how quite complex event specifications are possible using event expressions. In addition to the basic event operators, we also provide facilities that make it easier to specify composite events. “Pipes” allow users to isolate sub-histories of interest. “Correlation variables” allow users to ensure that different parts of an event expression are satisfied by the same event,
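The statement that an event expression maps an event history to another event history can be made concrete with a small sketch. The operators below (`prim`, `either`, `after`) and the representation of a history as a time-ordered list of `(timestamp, name)` tuples are hypothetical simplifications chosen for illustration; they are not the operators defined in the paper.

```python
def prim(name):
    """Primitive event: keeps the events in the history with the given name."""
    return lambda history: [e for e in history if e[1] == name]


def either(a, b):
    """Satisfied wherever expression a or expression b is satisfied."""
    def expr(history):
        hits = set(a(history)) | set(b(history))
        # Filter the original history so the result stays in time order.
        return [e for e in history if e in hits]
    return expr


def after(a, b):
    """Satisfied at events of b that occur strictly later than some event of a."""
    def expr(history):
        a_hits = a(history)
        if not a_hits:
            return []
        first_a = min(t for t, _ in a_hits)
        return [e for e in b(history) if e[0] > first_a]
    return expr


# A history is a time-ordered list of (timestamp, event name) tuples.
history = [(1, "deposit"), (2, "withdraw"), (3, "deposit"), (4, "withdraw")]

# Events at which a trigger on "withdraw after deposit" would fire.
fire_at = after(prim("deposit"), prim("withdraw"))(history)
print(fire_at)
```

Because every expression takes a history and returns a history, expressions compose freely: the output of `either` or `after` can be passed to another operator, which is what allows composite events to be built from primitive ones.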
Path Sharing and Predicate Evaluation for High-Performance XML Filtering
ACM Trans. Database Syst., 2003
Cited by 105 (5 self)
... In this paper we first describe the XFilter and YFilter approaches and present results of a detailed performance comparison of structure matching for these algorithms as well as a hybrid approach. The results show that the path sharing employed by YFilter can provide order-of-magnitude performance benefits. We then propose two alternative techniques for extending YFilter's shared structure matching with support for value-based predicates, and compare the performance of these two techniques. The results of this latter study demonstrate some key differences between shared XML filtering and traditional database query processing. Finally, we describe how the YFilter approach is extended to handle more complicated queries containing nested path expressions.
Active Database Systems
Modern Database Systems, 1994
Cited by 68 (6 self)
Integrating a production rules facility into a database system provides a uniform mechanism for a number of advanced database features including integrity constraint enforcement, derived data maintenance, triggers, alerters, protection, version control, and others. In addition, a database system with rule processing capabilities provides a useful platform for large and efficient knowledge-base and expert systems. Database systems with production rules are referred to as active database systems, and the field of active database systems has indeed been active. This chapter summarizes current work in active database systems; topics covered include active database rule models and languages, rule execution semantics, and implementation issues.
1 Introduction
Conventional database systems are passive: they only execute queries or transactions explicitly submitted by a user or an application program. For many applications, however, it is important to monitor situations of interest, and to ...
An Overview of Production Rules in Database Systems
The Knowledge Engineering Review, 1992
Cited by 53 (8 self)
Database researchers have recognized that integrating a production rules facility into a database system provides a uniform mechanism for a number of advanced database features including integrity constraint enforcement, derived data maintenance, triggers, protection, version control, and others. In addition, a database system with rule processing capabilities provides a useful platform for large and efficient knowledge-base and expert systems. Database systems with production rules are referred to as active database systems, and the field of active database systems has indeed been active. This paper summarizes current work in active database systems and suggests future research directions. Topics covered include database rule languages, rule processing semantics, and implementation issues.
1 Introduction
Database systems provide persistent storage for massive amounts of data and powerful interfaces for querying and modifying this data. Even so, most database systems are passive, si...
Differential Evaluation of Continual Queries
In IEEE Proceedings of the 16th International Conference on Distributed Computing Systems, Hong Kong, 1996
Cited by 45 (10 self)
Information Superhighway environments such as the Internet have brought us ready access to large amounts of information. However, Internet data is notoriously unorganized and autonomously managed in a distributed fashion. Large-scale information monitoring in the Internet environment requires support beyond traditional database techniques. Two of the key issues are the increasing reward in monitoring a fast-growing information base and the similarly increasing processing cost. To improve the expressiveness of queries for information monitoring, we define continual queries as a useful tool for monitoring of updated information. Continual queries are standing queries that monitor the source data and notify the users whenever new data matches the query. In addition to periodic refresh, continual queries include Epsilon Transaction concepts to allow users to specify query refresh based on the magnitude of updates. To support efficient processing of continual queries, we propose a different...
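Refreshing a query based on the magnitude of updates, as opposed to a fixed period, can be sketched as follows. This is only one simplified reading of the idea, not the paper's algorithm: the `EpsilonTrigger` class, its `record_update` method, and the choice to measure magnitude as the accumulated absolute change of a single numeric value are assumptions made for illustration.

```python
class EpsilonTrigger:
    """Hypothetical refresh condition: fire once accumulated change reaches epsilon."""

    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.pending = 0.0  # magnitude of updates since the last refresh

    def record_update(self, old_value, new_value):
        """Record one update; return True if the query should be refreshed now."""
        self.pending += abs(new_value - old_value)
        if self.pending >= self.epsilon:
            self.pending = 0.0  # refreshing resets the accumulated magnitude
            return True
        return False


trigger = EpsilonTrigger(epsilon=10)
for old, new in [(100, 104), (104, 99), (99, 101)]:
    if trigger.record_update(old, new):
        print("refresh after update", old, "->", new)
```

With a threshold of 10, the first two updates (changes of 4 and 5) are absorbed silently and the third pushes the accumulated change to 11, so the user is notified once rather than three times.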

