Results 1 - 10 of 81
The CQL Continuous Query Language: Semantic Foundations and Query Execution
- VLDB Journal, 2003
Cited by 354 (4 self)
CQL, a Continuous Query Language, is supported by the STREAM prototype Data Stream Management System at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and updatable relations. We begin by presenting an abstract semantics that relies only on "black box" mappings among streams and relations.
The Design of the Borealis Stream Processing Engine
- In CIDR, 2005
Cited by 250 (10 self)
Borealis is a second-generation distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora [14] and distribution functionality from Medusa [51]. Borealis modifies and extends both systems in non-trivial and critical ways to provide advanced capabilities that are commonly required by newly-emerging stream processing applications. In this paper, we outline the basic design and functionality of Borealis. Through sample real-world applications, we motivate the need for dynamically revising query results and modifying query specifications. We then describe how Borealis addresses these challenges through an innovative set of features, including revision records, time travel, and control lines. Finally, we present a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.
Exploiting k-Constraints to Reduce Memory Overhead in Continuous Queries over Data Streams
- ACM Transactions on Database Systems (TODS), 2004
Cited by 59 (9 self)
We consider the problem of efficiently processing continuous queries over multiple continuous data streams in the presence of constraints on the data streams. We specify several types of constraints, and for each constraint type we identify an “adherence parameter” that captures how closely a given stream or joining pair of streams adheres to a constraint of that type. We then present a query execution algorithm that takes k-constraints over streams into account in order to reduce memory overhead. In general, the tighter the adherence parameters are in the k-constraints, the less memory required. Furthermore, if input streams do not adhere to constraints within the specified adherence parameters, our algorithm automatically degrades gracefully to provide continuous approximate answers. We have implemented our approach in a testbed continuous query processor and preliminary experimental results are reported.
Design, Implementation, and Evaluation of the Linear Road Benchmark on the Stream Processing Core
- In SIGMOD, 2006
Cited by 46 (7 self)
Stream processing applications have recently gained significant attention in the networking and database community. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically-distributed and rapidly-updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the performance bottlenecks of streaming data applications, in particular the Linear Road stream data management benchmark, in achieving good performance in large-scale distributed environments, using the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes, while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks in the Linear Road application in achieving scalability and low query response latency. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe it also provides useful insights into the overall system behavior for supporting other distributed and large-scale continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.
SPC: A distributed, scalable platform for data mining
- In Proceedings of the Workshop on Data Mining Standards, Services and Platforms (DM-SSP), 2006
Cited by 43 (2 self)
The Stream Processing Core (SPC) is distributed stream processing middleware designed to support applications that extract information from a large number of digital data streams. In this paper, we describe the SPC programming model which, to the best of our knowledge, is the first to support stream-mining applications using a subscription-like model for specifying stream connections as well as to provide support for non-relational operators. This enables stream-mining applications to tap into, analyze and track an ever-changing array of data streams which may contain information relevant to the streaming-queries placed on it. We describe the design, implementation, and experimental evaluation of the SPC distributed middleware, which deploys applications on to the running system in an incremental fashion, making stream connections as required. Using micro-benchmarks and a representative large-scale synthetic stream-mining application, we evaluate the performance of the control and data paths of the SPC middleware.
Linked Stream Data: A Position Paper
Cited by 26 (1 self)
The amount of sensors publishing data on the Web is increasing as a result of the online availability of Sensor Web platforms that provide support for this task. With such increase in sensor data publication, new challenges arise for the identification, discovery and access to this data. Following the set of best practices to publish and link structured data on the web proposed by the Linked Data community, in this paper we introduce the concept of Linked Stream Data, a way in which the Linked Data principles can be applied to stream data and be part of the Web of Linked Data.
Distributed Event Stream Processing with Non-deterministic Finite Automata
- 2009
Cited by 18 (0 self)
Efficient matching of incoming events to persistent queries is fundamental to event pattern matching, complex event processing, and publish/subscribe systems. Recent processing engines based on non-deterministic finite automata (NFAs) have demonstrated scalability in the number of queries that can be efficiently executed on a single machine. However, existing NFA based systems are limited to processing events on a single machine. Consequently, their event processing capacity cannot be increased by adding more machines. In this paper, we present an experimental evaluation of different methods for distributing an event processing system that is based on NFAs across multiple machines in a cluster. Our results show that careful input stream partitioning gives close to linear performance scaleup for CPU bound workloads.
Linked stream data processing engines: Facts and figures
- In: The Semantic Web - ISWC 2012 - 11th International Semantic Web Conference
Cited by 16 (7 self)
Linked Stream Data, i.e., the RDF data model extended for representing stream data generated from sensors and social network applications, is gaining popularity. This has motivated considerable work on developing corresponding data models associated with processing engines. However, currently implemented engines have not been thoroughly evaluated to assess their capabilities. For reasonable systematic evaluations, in this work we propose a novel, customizable evaluation framework and a corresponding methodology for realistic data generation, system testing, and result analysis. Based on this evaluation environment, extensive experiments have been conducted in order to compare the state-of-the-art LSD engines with respect to qualitative and quantitative properties, taking into account the underlying principles of stream processing. Consequently, we provide a detailed analysis of the experimental outcomes that reveal useful findings for improving current and future engines.
Distributed operation in the Borealis stream processing engine
- In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD ’05). ACM, 2005
Cited by 12 (0 self)
Borealis is a distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora and inter-node communication functionality from Medusa. We propose to demonstrate some of the key aspects of distributed operation in Borealis, using a multi-player network game as the underlying application. The demonstration will illustrate the dynamic resource management, query optimization and high availability mechanisms employed by Borealis, using visual performance-monitoring tools as well as the gaming experience.
A Proposal for Publishing Data Streams as Linked Data - A Position Paper
Cited by 12 (3 self)
Streams are appearing more and more often on the Web in sites that distribute and present information in real-time streams. We anticipate a rapidly growing need to mash up this streaming information with more static information. While best practices for linking static data on the Web have been published and facilitate the mash-up of static information published on the Web, streams have been neglected. In this short position paper, we propose an approach to publish Data Streams as Linked Data.