Pig Latin: A Not-So-Foreign Language for Data Processing

by Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins
Citations: 348 - 11 self

Documents Related by Co-Citation

1682 MapReduce: Simplified Data Processing on Large Clusters – Jeffrey Dean, et al. - 2004
455 Dryad: distributed data-parallel programs from sequential building blocks – M Isard, M Budiu, Y Yu, A Birrell, D Fetterly
181 Interpreting the Data: Parallel Analysis with Sawzall – Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan
109 SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets – Ronnie Chaiken, Bob Jenkins, Per-Åke Larson, Bill Ramsey, Darren Shakib, Simon Weaver, Jingren Zhou
167 DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language – Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, Jon Currey
908 The Google File System – Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung - 2003
519 Parallel database systems: the future of high performance database systems – David J. DeWitt, Jim Gray - 1992
120 A comparison of approaches to large-scale data analysis – Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt, Samuel Madden, Michael Stonebraker - 2009
506 Bigtable: A distributed storage system for structured data – Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber - 2006
170 Pregel: A system for large-scale graph processing – Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski - 2010
28 An overview of the system software of a parallel relational database machine – S Fushimi, M Kitsuregawa, H Tanaka - 1986
107 Hive - A Warehousing Solution Over a Map-Reduce Framework – Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, Raghotham Murthy - 2009
93 HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads – Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel Abadi, Avi Silberschatz, Alexander Rasin
123 MapReduce-Merge: Simplified Relational Data Processing on Large Clusters – H chih Yang, A Dasdan, R-L Hsiao, D S Parker - 2007
3234 The Anatomy of a Large-Scale Hypertextual Web Search Engine – Sergey Brin, Lawrence Page - 1998
1126 A bridging model for parallel computation – Leslie G Valiant - 1990
97 Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling – Matei Zaharia, Khaled Elmeleegy, Dhruba Borthakur, Scott Shenker, Joydeep Sen Sarma, Ion Stoica - 2010
63 Mesos: A platform for fine-grained resource sharing in the data center – Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica - 2010
45 FlumeJava: Easy, efficient data-parallel pipelines – C Chambers, A Raniwala, F Perry, S Adams, R R Henry, R Bradshaw, N Weizenbaum - 2010