Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce

by Jimmy Lin
Citations: 4 - 3 self

Documents Related by Co-Citation

1602 The PageRank Citation Ranking: Bringing Order to the Web – Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd - 1999
417 Statistical phrase-based translation – Franz Josef Och, Daniel Marcu - 2003
913 MapReduce: simplified data processing on large clusters – Jeffrey Dean, Sanjay Ghemawat - 2004
400 Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities – Gene Amdahl - 1967
637 The Google File System – Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung - 2003
1548 Conditional random fields: Probabilistic models for segmenting and labeling sequence data – John Lafferty - 2001
5 Exploring large-data issues in the curriculum: A case study with MapReduce – J Lin - 2008
9 Fast, easy, and cheap: Construction of statistical machine translation models with MapReduce
891 The Mathematics of Statistical Machine Translation: Parameter Estimation – Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, Robert L. Mercer - 1993
83 Evaluating MapReduce for multi-core and multiprocessor systems – Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis - 2007
30 A survey of statistical machine translation – Adam Lopez - 2007
33 Mars: A MapReduce Framework on Graphics Processors – Bingsheng He, Wenbin Fang, Naga K. Govindaraju, Qiong Luo, Tuyong Wang
285 Bigtable: A distributed storage system for structured data – Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber - 2006
78 Large language models in machine translation – Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean - 2007
78 Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters – Hung-chih Yang, A Dasdan, R-L Hsiao, D S Parker - 2007
196 Pig Latin: A Not-So-Foreign Language for Data Processing – Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins
6234 Maximum likelihood from incomplete data via the EM algorithm – A. P. Dempster, N. M. Laird, D. B. Rubin - 1977
990 Xen and the art of virtualization – Paul Barham, Boris Dragovic, Keir Fraser, Steven H, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield
9 Brute force and indexed approaches to pairwise document similarity comparisons with mapreduce – Jimmy Lin - 2009