Results 1 -
1 of
1
Transactional Support in MapReduce for Speculative Parallelism
"... MapReduce has emerged as a popular programming model for large-scale distributed computing. Its framework enforces strict synchronization between successive map and reduce phases and limited data-sharing within a phase. Use of key-value based persistent storage with MapReduce presents intriguing opp ..."
Abstract
- Add to MetaCart
MapReduce has emerged as a popular programming model for large-scale distributed computing. Its framework enforces strict synchronization between successive map and reduce phases and limited data-sharing within a phase. Use of key-value based persistent storage with MapReduce presents intriguing opportunities and challenges. These challenges relate primarily to semantic inconsistencies arising from the different fault-tolerant mechanisms employed by the execution environment and the underlying storage medium. We define formal transactional semantics for MapReduce over reliable key-value stores. With minimal performance overhead and no increase in program complexity, our solutions support broad classes of distributed applications hitherto infeasible in MapReduce. Specifically, this paper (i) motivates the use of key-value stores as the underlying storage for MapReduce, (ii) defines transactional semantics for MapReduce to address any inconsistencies, (iii) demonstrates broader application scope enabled by data sharing within and across jobs, and (iv) presents a detailed evaluation demonstrating the low overhead of our proposed semantics. 1.

