CiteSeerX
Results 1 - 10 of 1,255

Spark: Cluster Computing with Working Sets

by Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, Ion Stoica
"... MapReduce and its variants have been highly successful in implementing large-scale data-intensive applications on commodity clusters. However, most of these systems are built around an acyclic data flow model that is not suitable for other popular applications. This paper focuses on one such class o ..."
Abstract - Cited by 213 (9 self) - Add to MetaCart
of applications: those that reuse a working set of data across multiple parallel operations. This includes many iterative machine learning algorithms, as well as interactive data analysis tools. We propose a new framework called Spark that supports these applications while retaining the scalability and fault
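A rough illustration of the working-set idea in that abstract: the sketch below caches a small dataset in memory and then reuses it across the iterations of a logistic-regression gradient-descent loop. It assumes a local PySpark installation; the data, step size, and iteration count are invented for the example and are not taken from the paper.

from pyspark import SparkContext
import numpy as np

sc = SparkContext("local[*]", "working-set-sketch")

# Toy labeled 2-D points (features, label); a real job would load these from storage.
data = [(np.array([1.0, 2.0]), 1.0), (np.array([2.0, 1.0]), 1.0),
        (np.array([-1.5, -0.5]), -1.0), (np.array([-2.0, -1.0]), -1.0)]
points = sc.parallelize(data).cache()   # keep the working set in memory

w = np.zeros(2)
for _ in range(10):
    # Each iteration reuses the cached RDD instead of re-reading the input.
    grad = points.map(
        lambda p: (1.0 / (1.0 + np.exp(-p[1] * w.dot(p[0]))) - 1.0) * p[1] * p[0]
    ).reduce(lambda a, b: a + b)
    w -= 0.1 * grad

print("fitted weights:", w)
sc.stop()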

Scaling Java points-to analysis using Spark

by Ondrej Lhoták, Laurie Hendren - in Compiler Construction, 12th International Conference, volume 2622 of LNCS, 2003
"... Most points-to analysis research has been done on different systems by different groups, making it difficult to compare results, and to understand interactions between individual factors each group studied. Furthermore, points-to analysis for Java has been studied much less thoroughly than for C, and the tradeoffs appear very different. We introduce Spark, a flexible framework for experimenting with points-to analyses for Java. Spark supports equality- and subset-based analyses, variations in field sensitivity, respect for declared types, variations in call graph construction, off-line simplification ..."
Cited by 179 (15 self)
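To make the "subset-based" flavour of analysis mentioned in that excerpt concrete, here is a tiny fixed-point propagation over a handful of made-up constraints. It is a generic Andersen-style sketch in Python, not Spark's actual Java API or data structures.

from collections import defaultdict

# Illustrative constraints:
#   a = new O1;  b = new O2        (allocations)
#   c = a;  c = b                  (assignments: pts(src) must be a subset of pts(dst))
allocs = [("a", "O1"), ("b", "O2")]
assigns = [("a", "c"), ("b", "c")]

pts = defaultdict(set)
for var, obj in allocs:
    pts[var].add(obj)

# Propagate the subset constraints until nothing changes (a fixed point).
changed = True
while changed:
    changed = False
    for src, dst in assigns:
        before = len(pts[dst])
        pts[dst] |= pts[src]
        changed = changed or len(pts[dst]) != before

print(dict(pts))   # c may point to both O1 and O2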

Parsec: A Parallel Simulation Environment for Complex Systems

by R. Bagrodia, Mineo Takai, Yu-an Chen, Xiang Zeng, Jay Martin - IEEE Computer, 1998
"... [sim]ulating large-scale systems. Widespread use of parallel simulation, however, has been significantly hindered by a lack of tools for integrating parallel model execution into the overall framework of system simulation. Although a number of algorithmic alternatives exist for parallel execution of disc ..."
Cited by 247 (23 self)

SPARK: A high-level synthesis framework for applying parallelizing compiler transformations

by Sumit Gupta, Nikil Dutt, Rajesh Gupta, Alex Nicolau - in International Conference on VLSI Design, 2003
"... This paper presents a modular and extensible high-level synthesis research system, called SPARK, that takes a behavioral description in ANSI-C as input and produces synthesizable register-transfer level VHDL. SPARK uses parallelizing compiler technology developed previously to enhance instruction-le ..."
Cited by 129 (11 self)

Logistic Regression, AdaBoost and Bregman Distances

by Michael Collins, Robert E. Schapire, Yoram Singer, 2000
"... We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easily adapt al ..."
Cited by 259 (45 self)
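The closeness of the two problems can already be seen at the level of their loss functions, both of which depend only on the margin y·f(x): AdaBoost minimizes the exponential loss and logistic regression the log-loss. The snippet below merely evaluates the two objectives on a few illustrative margins; it does not reproduce the paper's Bregman-distance algorithms.

import numpy as np

margins = np.array([2.0, 0.5, -0.3, -1.5])        # y_i * f(x_i) for four examples

exp_loss = np.exp(-margins).sum()                 # AdaBoost's exponential loss
log_loss = np.log1p(np.exp(-margins)).sum()       # logistic regression's log-loss

print(exp_loss, log_loss)   # both penalize negative margins, the exponential loss more sharply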

An Atlas Framework for Scalable Mapping

by Michael Bosse, Paul Newman, John Leonard, Martin Soika, Wendelin Feiten, Seth Teller - in IEEE International Conference on Robotics and Automation, 2003
"... This paper describes Atlas, a hybrid metrical/topological approach to SLAM that achieves efficient mapping of large-scale environments. The representation is a graph of coordinate frames, with each vertex in the graph representing a local frame, and each edge representing the transformation between ..."
Cited by 178 (19 self)
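The graph-of-frames representation described in that abstract can be sketched with a few homogeneous 2-D transforms: vertices are local frames, edges store the relative rigid transform, and composing the transforms along a path maps points from one frame into another. The frames, transforms, and point below are invented for the example and are not from the paper.

import numpy as np

def se2(x, y, theta):
    """Homogeneous 2-D rigid transform (rotation by theta, then translation)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x],
                     [s,  c, y],
                     [0,  0, 1]])

# Edge (A, B) maps coordinates expressed in frame B into frame A.
edges = {("A", "B"): se2(1.0, 0.0, np.pi / 2),
         ("B", "C"): se2(0.5, 0.2, 0.0)}

# A point known in frame C, expressed in frame A by chaining edge transforms.
T_AC = edges[("A", "B")] @ edges[("B", "C")]
p_C = np.array([0.3, 0.0, 1.0])            # homogeneous coordinates
p_A = T_AC @ p_C
print(p_A[:2])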

A Parallel Distributed Weka Framework for Big Data Mining using Spark

by Aris-kyriakos Koliopoulos, Paraskevas Yiapanis, Firat Tekiner, Goran Nenadic, John Keane
"... Abstract—Effective Big Data Mining requires scalable and efficient solutions that are also accessible to users of all levels of expertise. Despite this, many current efforts to provide effective knowledge extraction via large-scale Big Data Mining tools focus more on performance than on use and tuni ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
framework with fast in-memory processing capabilities and support for iterative computations. By combining Weka’s usability and Spark’s processing power, DistributedWekaSpark provides a usable prototype distributed Big Data Mining workbench that achieves near-linear scaling in executing various real

Large-Scale Online Expectation Maximization with Spark Streaming

by Timothy Hunter, Tathagata Das, Matei Zaharia, Alexandre Bayen, Pieter Abbeel
"... Many “Big Data ” applications in Machine Learning (ML) need to react quickly to large streams of incoming data. The standard paradigm nowadays is to run ML algorithms on frameworks designed for batch operations, such as MapReduce or Hadoop. By design, these frameworks are not a good match for low-la ..."

Spectral segmentation with multiscale graph decomposition

by Timothée Cour, Florence Bénézit, Jianbo Shi - in CVPR, 2005
"... We present a multiscale spectral image segmentation algorithm. In contrast to most multiscale image processing, this algorithm works on multiple scales of the image in parallel, without iteration, to capture both coarse and fine level details. The algorithm is computationally efficient, allowing to segment large images. We use the Normalized Cut graph partitioning framework of image segmentation. We construct a graph encoding pairwise pixel affinity, and partition the graph for image segmentation. We demonstrate that large image graphs can be compressed into multiple scales capturing image structure ..."
Cited by 185 (3 self)
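As a toy version of the affinity-graph construction and partitioning step that excerpt describes, the sketch below builds a Gaussian affinity matrix over a handful of one-dimensional "pixels", forms the normalized graph Laplacian, and thresholds its second eigenvector to obtain a two-way cut. The data and affinity scale are illustrative, and the paper's multiscale decomposition is not reproduced.

import numpy as np

x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])               # six "pixel" intensities
sigma = 1.0
W = np.exp(-((x[:, None] - x[None, :]) ** 2) / sigma ** 2)  # pairwise affinities
d = W.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = np.eye(len(x)) - D_inv_sqrt @ W @ D_inv_sqrt        # normalized Laplacian

eigvals, eigvecs = np.linalg.eigh(L_sym)                    # eigenvalues in ascending order
fiedler = eigvecs[:, 1]                                     # second-smallest eigenvector
labels = (fiedler > 0).astype(int)
print(labels)   # the two intensity clusters end up in different segments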

Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud

by Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein, 2012
"... While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill ..."
Cited by 141 (2 self)