Maximum likelihood from incomplete data via the EM algorithm
 JOURNAL OF THE ROYAL STATISTICAL SOCIETY, SERIES B
, 1977
"... A broadly applicable algorithm for computing maximum likelihood estimates from incomplete data is presented at various levels of generality. Theory showing the monotone behaviour of the likelihood and convergence of the algorithm is derived. Many examples are sketched, including missing value situat ..."
A Model of Computation for MapReduce
 Proc. ACM-SIAM SODA
, 2010
"... In recent years the MapReduce framework has emerged as one of the most widely used parallel computing platforms for processing data on terabyte and petabyte scales. Used daily at companies such as Yahoo!, Google, Amazon, and Facebook, and adopted more recently by several universities, it allows for easy parallelization of data intensive computations over many machines. One key feature of MapReduce that differentiates it from previous models of parallel computation is that it interleaves sequential and parallel computation. We propose a model of efficient computation using the MapReduce paradigm ..."
MapReduce: Simplified Data Processing on Large Clusters
, 2004
"... MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with t ..."
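The map/reduce contract described in this abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, and the in-memory grouping step stands in for the distributed shuffle a real system would perform.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map: take one (document name, text) pair, emit (word, 1) pairs."""
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Reduce: merge all intermediate values that share one key."""
    return key, sum(values)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical driver: group intermediate pairs by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())


# Illustrative input, invented for this sketch.
documents = [("a.txt", "the quick fox"), ("b.txt", "The lazy dog")]
counts = run_mapreduce(documents, map_fn, reduce_fn)
```

Here `counts` maps each lowercased word to its total number of occurrences across both documents.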
Improving MapReduce Performance in Heterogeneous Environments
"... MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-source implementation of MapReduce enjoying wide adoption and is often used for short jobs where low response time is cri ..."
Markov Random Field Models in Computer Vision
, 1994
"... A variety of computer vision problems can be optimally posed as Bayesian labeling in which the solution of a problem is defined as the maximum a posteriori (MAP) probability estimate of the true labeling. The posterior probability is usually derived from a prior model and a likelihood model. The l ..."
Evaluating MapReduce for multicore and multiprocessor systems
 In HPCA ’07: Proceedings of the 13th International Symposium on High-Performance Computer Architecture
, 2007
"... This paper evaluates the suitability of the MapReduce model for multicore and multiprocessor systems. MapReduce was created by Google for application development on datacenters with thousands of servers. It allows programmers to write functional-style code that is automatically parallelized and s ..."
LogP: Towards a Realistic Model of Parallel Computation
, 1993
"... A vast body of theoretical research has focused either on overly simplistic models of parallel computation, notably the PRAM, or overly specific models that have few representatives in the real world. Both kinds of models encourage exploitation of formal loopholes, rather than rewarding developme ..."
Computational Lambda-Calculus and Monads
, 1988
"... The λ-calculus is considered a useful mathematical tool in the study of programming languages, since programs can be identified with λ-terms. However, if one goes further and uses βη-conversion to prove equivalence of programs, then a gross simplification is introduced, that may jeopardise the applicability of theoretical results to real situations. In this paper we introduce a new calculus based on a categorical semantics for computations. This calculus provides a correct basis for proving equivalence of programs, independent from any specific computational model. ..."
Simulating Physics with Computers
 SIAM Journal on Computing
, 1982
"... A digital computer is generally believed to be an efficient universal computing device; that is, it is believed able to simulate any physical computing device with an increase in computation time of at most a polynomial factor. This may not be true when quantum mechanics is taken into consideration. ..."
Pig Latin: A Not-So-Foreign Language for Data Processing
"... There is a growing need for ad-hoc analysis of extremely large data sets, especially at internet companies where innovation critically depends on being able to analyze terabytes of data collected every day. Parallel database products, e.g., Teradata, offer a solution, but are usually prohibitively expensive at this scale. Besides, many of the people who analyze this data are entrenched procedural programmers, who find the declarative, SQL style to be unnatural. The success of the more procedural map-reduce programming model, and its associated scalable implementations on commodity hardware ..."
