StarT the Next Generation: Integrating Global Caches and Dataflow Architecture
- CSG Memo 354, Computation Structures Group, MIT Lab. for Comp. Sci., 1994
Cited by 39 (1 self)
The implicitly parallel programming model provides an attractive approach to deal with the complexity of parallel programming. Implementing this model efficiently, especially on stock processors, remains a big challenge, partly because of the fine granularity of the parallelism exploited. The Monsoon [27] project was designed to address and investigate support for fine-grain parallelism, and has yielded very encouraging results [13]. Our experience with Monsoon and *T [24, 28], a follow-up project after Monsoon, suggests that provision for global shared memory is an area where both the Monsoon and *T architectures can be improved. Starting with the split-phase approach used in Monsoon and *T, we propose to augment global memory access by including coherent global caches. The rapid improvements in stock microprocessors, and the high cost and effort required to develop a competitive microprocessor, present practical constraints on what can be built in any experimental architecture project. ...
Compiler-directed Type Reconstruction for Polymorphic Languages
- In Proceedings of the ACM Conference on Functional Programming Languages and Computer Architecture, 1993
Cited by 18 (2 self)
Shail Aditya, MIT Laboratory for Computer Science, 545 Technology Square, Cambridge, MA 02139, shail@abp.lcs.mit.edu. 1 Introduction. Polymorphic programming languages provide the flexibility of code reuse by allowing objects with different types to share the same pattern of computation. But, this feature creates problems for applications like garbage collection and source debugging that need to know the exact type of every object participating in a computation at run-time. Traditionally, dynamic objects in a polymorphic language keep additional type-tags to identify themselves. But, this scheme either requires complex hardware support or costs space and time overhead in managing the tags in software. This paper [1] proposes a compiler-directed, explicit tag management scheme for Id, which is a polymorphic, strongly typed language developed by the Computation Structures Group at MIT. The underlying memory model for Id is tagless, and the explicit tag information is automatically inser...
Feedback Directed Implicit Parallelism
Cited by 15 (0 self)
In this paper we present an automated way of using spare CPU resources within a shared memory multi-processor or multi-core machine. Our approach is (i) to profile the execution of a program, (ii) from this to identify pieces of work which are promising sources of parallelism, (iii) recompile the program with this work being performed speculatively via a work-stealing system and then (iv) to detect at run-time any attempt to perform operations that would reveal the presence of speculation. We assess the practicality of the approach through an implementation based on GHC 6.6 along with a limit study based on the execution profiles we gathered. We support the full Concurrent Haskell language compiled with traditional optimizations and including I/O operations and synchronization as well as pure computation. We use 20 of the larger programs from the ‘nofib’ benchmark suite. The limit study shows that programs vary a lot in the parallelism we can identify: some have none, 16 have a potential 2x speed-up, 4 have 32x. In practice, on a 4-core processor, we get 10-80% speed-ups on 7 programs. This is mainly achieved at the addition of a second core rather than beyond this. This approach is therefore not a replacement for manual parallelization, but rather a way of squeezing extra performance out of the threads of an already-parallel program or out of a program that has not yet been parallelized.
Scalability of Dynamic Storage Allocation Algorithms
- 1996
Cited by 14 (2 self)
Dynamic storage allocation has a significant impact on computer performance. A dynamic storage allocator manages space for objects whose lifetimes are not known by the system at the time of their creation. A good dynamic storage allocator should utilize storage efficiently and satisfy requests in as few instructions as possible. A dynamic storage allocator on a multiprocessor should have the ability to satisfy multiple requests concurrently. This paper examines parallel dynamic storage allocation algorithms and how performance scales with increasing numbers of processors. The highest throughputs and lowest instruction counts are achieved with multiple free list fit I. The best memory utilization is achieved using a best fit system.
Hardware-modulated parallelism in chip multiprocessors
- SIGARCH Comput. Archit. News, 2005
Cited by 6 (2 self)
Chip multi-processors (CMPs) already have widespread commercial availability, and technology roadmaps project enough on-chip transistors to replicate tens or hundreds of current processor cores. How will we express parallelism, partition applications, and schedule/place/migrate threads on these highly parallel CMPs? This paper presents and evaluates a new approach to highly parallel CMPs, advocating a new hardware-software contract. The software layer is encouraged to expose large amounts of multi-granular, heterogeneous parallelism. The hardware, meanwhile, is designed to offer low-overhead, low-area support for orchestrating and modulating this parallelism on CMPs at runtime. Specifically, our proposed CMP architecture consists of architectural and ISA support targeting thread creation, scheduling
Performance Visualization on Monsoon
- Journal of Parallel and Distributed Computing, 1993
Cited by 2 (1 self)
The performance of an application program running on a parallel machine is affected by several factors such as the algorithm, the programming language, the compiler and the operating system. Performance evaluation of parallel machines requires quick and easy-to-use analysis of large amounts of data. This paper describes a performance evaluation tool built for Monsoon, a multithreaded multiprocessor machine built by Motorola in collaboration with MIT. The tool offers integrated data collection, analysis and visualization and is designed to be simple but powerful. Software layers built on top of simple hardware monitors offer a flexible, yet non-intrusive performance evaluation tool. Examples of successful use of the tool by both systems and applications programmers are included. 1 Introduction. Computer systems are becoming increasingly complex, making it harder to understand their performance behavior and more difficult to get near optimal performance [9]. Performance analysis tools a...
A Multithreaded Substrate and Compilation Model for the Implicitly Parallel Language pH
- 1996
We describe the compilation of the non-strict, implicitly parallel language pH to symmetric multiprocessors (SMPs) in several steps. We introduce the S calculus as a robust foundation for the semantics of pH. Next, we define a shared-memory threaded abstract machine (SMT) that captures the essence of our compilation target, a modern SMP. Finally, we describe a complete syntax-directed translation of S to SMT instructions. The paper makes three important contributions: first, it is the first implementation of pH based on direct semantics of barriers; second, in contrast to earlier work, the multithreaded code generated uses suspensive threads; and third, the compilation rules generate code from S source code directly, without resorting to intermediate dataflow-style graphs. 1 Introduction. This paper describes the compilation of the pH language for symmetric multi-processors (SMPs). pH is a parallel dialect of the functional language Haskell. It has been designed to support general purpose pa...
Computer Architecture Research and the Real World
- 1997
In the mid 1980s, the U.S. Defense Advanced Research Projects Agency (DARPA) decided to explore new ways to increase the speed and extent of technology transfer from academia to industry, especially in the area of computer architecture. As a result, DARPA organized a number of large computer architecture projects as partnerships between universities and companies. This article describes one of the earliest of these collaborations between the Massachusetts Institute of Technology's Laboratory for Computer Science and Motorola, Inc.'s Computer Group. This research effort demonstrated that university-industry collaborations can produce excellent results, but it also showed that such partnerships can be quite risky. The goal of this article is to share the authors' experiences and to make recommendations that will improve the chances of success of similar future projects. Keywords: University-industry collaboration, computer architecture, parallel computers. In the mid 1980s, the U.S. Defe...
A User-Flow approach for multi-user applications with DMS (Distributed Modules System)
We have developed tools that give non-programmers the power to design multi-user applications over the Internet. A multi-user application is not only a 3D virtual world; more generally, it is an application in which people can interact with or through the same "object". Because they are aimed at non-programmers, these tools have to hide programming problems as well as distribution problems. This paper describes the DMS architecture on which both tools and applications are based. VIRTUAL REALITY AND PROTOTYPING, June 1999, Laval (France). 1 Introduction. 1.1 Starting point. In this paper, a multi-user application is an application in which users can interact together over the network. This interaction is organized around a server, which is at least the meeting point of users but usually also runs some parts of the application. Multi-user applications are most probably the next step in the conquest of the Internet, after email, forums, broadcasting information (HTML), videoconferencin...

