Results 1 - 10
of
12
A type and effect system for deterministic parallel java
- In Proc. Intl. Conf. on Object-Oriented Programming, Systems, Languages, and Applications
, 2009
"... Today’s shared-memory parallel programming models are complex and error-prone. While many parallel programs are intended to be deterministic, unanticipated thread interleavings can lead to subtle bugs and nondeterministic semantics. In this paper, we demonstrate that a practical type and effect syst ..."
Abstract
-
Cited by 39 (8 self)
- Add to MetaCart
Today’s shared-memory parallel programming models are complex and error-prone. While many parallel programs are intended to be deterministic, unanticipated thread interleavings can lead to subtle bugs and nondeterministic semantics. In this paper, we demonstrate that a practical type and effect system can simplify parallel programming by guaranteeing deterministic semantics with modular, compile-time type checking even in a rich, concurrent object-oriented language such as Java. We describe an object-oriented type and effect system that provides several new capabilities over previous systems for expressing deterministic parallel algorithms. We also describe a language called Deterministic Parallel Java (DPJ) that incorporates the new type system features, and we show that a core subset of DPJ is sound. We describe an experimental validation showing that DPJ can express a wide range of realistic parallel programs; that the new type system features are useful for such programs; and that the parallel programs exhibit good performance gains (coming close to or beating equivalent, nondeterministic multithreaded programs where those are available).
Multithreaded Geant4: Semi-Automatic Transformation into Scalable Thread-Parallel Software
"... Abstract. This work presents an application case study. Geant4 is a 750,000 line toolkit first designed in the mid-1990s and originally intended only for sequential computation. Intel’s promise of an 80-core CPU meant that Geant4 users would have to struggle in the future with 80 processes on one CP ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. This work presents an application case study. Geant4 is a 750,000 line toolkit first designed in the mid-1990s and originally intended only for sequential computation. Intel’s promise of an 80-core CPU meant that Geant4 users would have to struggle in the future with 80 processes on one CPU chip, each one having a gigabyte memory footprint. Thread parallelism would be desirable. A semiautomatic methodology to parallelize the Geant4 code is presented in this work. Our experimental tests demonstrate linear speedup in a range from one thread to 24 on a 24-core computer. To achieve this performance, we needed to write a custom, thread-private memory allocator, and to detect and eliminate excessive cache misses. Without these improvements, there was almost no performance improvement when going beyond eight cores. Finally, in order to guarantee the run-time correctness of the transformed code, a dynamic method was developed to capture possible bugs and either immediately generate a fault, or optionally recover from the fault. 1
ALTER: Exploiting Breakable Dependences for Parallelization
"... For decades, compilers have relied on dependence analysis to determine the legality of their transformations. While this conservative approach has enabled many robust optimizations, when it comes to parallelization there are many opportunities that can only be exploited by changing or re-ordering th ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
For decades, compilers have relied on dependence analysis to determine the legality of their transformations. While this conservative approach has enabled many robust optimizations, when it comes to parallelization there are many opportunities that can only be exploited by changing or re-ordering the dependences in the program. This paper presents ALTER: a system for identifying and enforcing parallelism that violates certain dependences while preserving overall program functionality. Based on programmer annotations, ALTER exploits new parallelism in loops by reordering iterations or allowing stale reads. ALTER can also infer which annotations are likely to benefit the program by using a test-driven framework. Our evaluation of ALTER demonstrates that it uncovers parallelism that is beyond the reach of existing static and dynamic tools. Across a selection of 12 performance-intensive loops, 9 of which have loop-carried dependences, ALTER obtains an average speedup of 2.0x on 4 cores. Categories and Subject Descriptors D.3.3 [Programming Languages]: Language Constructs and Features—Concurrent programming
Coarse-Grained Transactions [Extended Version]
"... Traditional transactional memory systems suffer from overly conservative conflict detection, yielding so-called false conflicts, because they are based on fine-grained, low-level read/write conflicts. In response, the recent trend has been toward integrating various abstract data-type libraries usin ..."
Abstract
- Add to MetaCart
Traditional transactional memory systems suffer from overly conservative conflict detection, yielding so-called false conflicts, because they are based on fine-grained, low-level read/write conflicts. In response, the recent trend has been toward integrating various abstract data-type libraries using ad-hoc methods of high-level conflict detection. These proposals have led to improved performance but a lack of a unified theory has led to confusion in the literature. We clarify these recent proposals by defining a generalization data-type operations rather than simply memory read/write operations. We provide semantics for both pessimistic (e.g. transactional boosting) and optimistic (e.g. traditional TMs and recent alternatives) execution. We show that both are included in the standard atomic semantics, yet find that the choice imposes different requirements on the coarse-grained operations: pessimistic requires operations be left-movers, optimistic requires right-movers. Finally, we discuss how the semantics applies to numerous TM implementation details discussed widely in the literature. 1.
Languages, Theory
"... Traditional transactional memory systems suffer from overly conservative conflict detection, yielding so-called false conflicts, because they are based on fine-grained, low-level read/write conflicts. In response, the recent trend has been toward integrating various abstract data-type libraries usin ..."
Abstract
- Add to MetaCart
Traditional transactional memory systems suffer from overly conservative conflict detection, yielding so-called false conflicts, because they are based on fine-grained, low-level read/write conflicts. In response, the recent trend has been toward integrating various abstract data-type libraries using ad-hoc methods of high-level conflict detection. These proposals have led to improved performance but a lack of a unified theory has led to confusion in the literature. We clarify these recent proposals by defining a generalization of transactional memory in which a transaction consists of coarse-grained (abstract data-type) operations rather than simple memory read/write operations. We provide semantics for both pessimistic (e.g. transactional boosting) and optimistic (e.g. traditional TMs and recent alternatives) execution. We show that both are included in the standard atomic semantics, yet find that the choice imposes different requirements on the coarse-grained operations: pessimistic requires operations be left-movers, optimistic requires right-movers. Finally, we discuss how the semantics applies to numerous TM implementation details discussed widely in the literature.
Dynamic Binary Parallelization Ph.D. Dissertation Proposal
, 2011
"... A large and important base of existing software is being left behind by emerging microprocessor architectures. Recently, fundamental issues in microprocessor technologies have led designers to increase the number of cores on a chip instead of increasing its single-threaded performance. Many-core des ..."
Abstract
- Add to MetaCart
A large and important base of existing software is being left behind by emerging microprocessor architectures. Recently, fundamental issues in microprocessor technologies have led designers to increase the number of cores on a chip instead of increasing its single-threaded performance. Many-core designs with 4 to 8 cores are ubiquitous, and trends suggest that core counts will continue to grow for the foreseeable
Feasibility of Dynamic Binary Parallelization
"... This paper proposes DBP, an automatic technique that transparently parallelizes a sequential binary executable while it is running. A prototype implementationin simulation was able to increase sequential execution speeds byupto1.96x,averagedoverthreebenchmarkssuites. ..."
Abstract
- Add to MetaCart
This paper proposes DBP, an automatic technique that transparently parallelizes a sequential binary executable while it is running. A prototype implementationin simulation was able to increase sequential execution speeds byupto1.96x,averagedoverthreebenchmarkssuites.
Testing Mined Specifications ∗
"... Specifications are necessary for nearly every software engineering task, but they are often missing or incomplete. “Specification mining” is a line of research promising to solve this problem through automated tools that infer specifications directly from existing programs. The standard practice is ..."
Abstract
- Add to MetaCart
Specifications are necessary for nearly every software engineering task, but they are often missing or incomplete. “Specification mining” is a line of research promising to solve this problem through automated tools that infer specifications directly from existing programs. The standard practice is one of inductive learning: mining tools make observations about software and inductively generalize them into specifications. Inductive reasoning is unsound, however, and existing tools commonly grapple with the problem of inferring “false ” specifications, which must be manually checked. In this work, we introduce a new technique for automatically validating mined specifications that lessens this manual burden. Our technique is not based on heuristics; it rather uses a general, semantic definition of a “true ” specification. We perform systematic, targeted program transformations to test a mined specification’s necessity for overall correctness. If a “violating ” program is correct, the specification is false. We have implemented our technique in a prototype tool that validates temporal properties of Java programs, and we demonstrate it to be effective through a large-scale case study on the DaCapo benchmarks.

