Results 1 - 10 of 371
Scalable distributed stream processing
- in Proc. Conf. for Innovative Database Research (CIDR), 2003
"... Many stream-based applications are naturally distributed. Applications are often embedded in an environment with numerous connected computing devices with heterogeneous capabilities. As data travels from its point of origin (e.g., sensors) downstream to applications, it passes through many computin ..."
Cited by 156 (16 self)
the Internet on computers typically owned by multiple cooperating administrative domains. This paper describes the architectural challenges facing the design of large-scale distributed stream processing systems, and discusses novel approaches for addressing load management, high availability, and federated
ELASTICITY AND RESOURCE AWARE SCHEDULING IN DISTRIBUTED DATA STREAM PROCESSING SYSTEMS
"... The era of big data has led to the emergence of new systems for real-time distributed stream processing, e.g., Apache Storm is one of the most popular stream processing systems in industry today. However, Storm, like many other stream processing systems, lacks many important and desired features. ..."
The era of big data has led to the emergence of new systems for real-time distributed stream processing, e.g., Apache Storm is one of the most popular stream processing systems in industry today. However, Storm, like many other stream processing systems, lacks many important and desired features
StreamCloud: An Elastic and Scalable Data Streaming System
"... Many applications in several domains such as telecommunications, network security, large scale sensor networks, require online processing of continuous data flows. They produce very high loads that require aggregating the processing capacity of many nodes. Current Stream Processing Engin ..."
Cited by 20 (7 self)
Engines do not scale with the input load due to single-node bottlenecks. Additionally, they are based on static configurations that lead to either under or over-provisioning. In this paper, we present StreamCloud, a scalable and elastic stream processing engine for processing large data stream volumes
Global land cover mapping from MODIS: algorithms and early results
- Remote Sensing of Environment, 2002
"... Until recently, advanced very high-resolution radiometer (AVHRR) observations were the only viable source of data for global land cover mapping. While many useful insights have been gained from analyses based on AVHRR data, the availability of moderate resolution imaging spectroradiometer ..."
Cited by 212 (8 self)
for several other classification schemes that are used by the global modeling community. Initial results based on 5 months of MODIS data are encouraging. At global scales, the distribution of vegetation and land cover types is qualitatively realistic. At regional scales, comparisons among heritage AVHRR
Towards Elastic Stream Processing: Patterns and Infrastructure
"... Distributed, highly-parallel processing frameworks such as Hadoop are deemed to be state-of-the-art for handling big data today. But they burden application developers with the task to manually implement program logic using low-level batch processing APIs. Thus, a movement can be observed that high-level ..."
queries contain operators that maintain state information which depends on the data that has already been processed and hence they cannot be restarted without information loss. This also is an issue when streaming tasks should be scaled. Therefore, integrating elasticity and fault
Scalable Distributed Stream Join Processing
"... Efficient and scalable stream joins play an important role in performing real-time analytics for many cloud applications. However, like in conventional database processing, online theta-joins over data streams are computationally expensive and moreover, being memory-based processing, they impose hig ..."
and online data aggregation. BiStream also supports adaptive resource management to dynamically scale out and down the system according to its application workloads. We provide both theoretical cost analysis and extensive experimental evaluations to evaluate the efficiency, elasticity and scalability
Spade: the System S declarative stream processing engine
- in SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
"... In this paper, we present Spade − the System S declarative stream processing engine. System S is a large-scale, distributed data stream processing middleware under development at IBM T. J. Watson Research Center. As a front-end for rapid application development for System S, Spade provides (1) an in ..."
Cited by 79 (8 self)
In this paper, we present Spade − the System S declarative stream processing engine. System S is a large-scale, distributed data stream processing middleware under development at IBM T. J. Watson Research Center. As a front-end for rapid application development for System S, Spade provides (1
Adaptive control of extreme-scale stream processing systems
- In ICDCS 2006
, 2006
"... Abstract — Distributed stream processing systems offer a highly scalable and dynamically configurable platform for time-critical applications ranging from real-time, exploratory data mining to high performance transaction processing. Resource management for distributed stream processing systems is c ..."
Cited by 37 (2 self)
Abstract — Distributed stream processing systems offer a highly scalable and dynamically configurable platform for time-critical applications ranging from real-time, exploratory data mining to high performance transaction processing. Resource management for distributed stream processing systems
Auto-scaling techniques for elastic data stream processing
"... A major problem of today's cloud infrastructure is the low utilization of the overall system. This observation also holds true for data stream processing systems, which continuously produce output for a set of standing queries and a potentially infinite input stream. Many real-world workloads ..."
to end latency. Therefore, this characteristic needs to be reflected in the scaling strategy to achieve a good trade-off between the monetary cost spent and the achieved quality of service. In this paper we address the problem of choosing the scaling strategy for a data stream processing system
Discretized Streams: Fault-tolerant streaming computation at scale
- In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP), 2013
"... Many “big data” applications must act on data in real time. Running these applications at ever-larger scales requires parallel platforms that automatically handle faults and stragglers. Unfortunately, current distributed stream processing models provide fault recovery in an expensive manner, requir ..."
Cited by 45 (6 self)
Many “big data” applications must act on data in real time. Running these applications at ever-larger scales requires parallel platforms that automatically handle faults and stragglers. Unfortunately, current distributed stream processing models provide fault recovery in an expensive manner