New Foundations for the Geometry of Interaction
Information and Computation, 1993
Cited by 69 (20 self)
In this paper, we present a new formal embodiment of Girard's programme, with the following salient features.
1. Our formalisation is based on elementary Domain Theory rather than C*-algebras. It exposes precisely what structure is required of the ambient category in order to carry out the interpretation. Furthermore, we show how the interpretation arises from the construction of a categorical model of Linear Logic; this provides the basis for a rational reconstruction which makes the structure of the interpretation much easier to understand.
2. The key definitions in our interpretation differ from Girard's. Most notably, we replace the "execution formula" by a least fixpoint, essentially a generalisation of Kahn's semantics for feedback in dataflow networks [Kah77, KM77]. This, coupled with the use of the other distinctive construct of Domain Theory, the lifting monad, enables us to interpret the whole of Linear Logic, and to prove soundness in full generality.
3. Our general notion of interpretation has simple examples, providing a suitable basis for concrete implementations. In fact, we sketch a computational interpretation of the Geometry of Interaction in terms of dataflow networks. Recall that computation in dataflow networks is asynchronous, i.e. "no global time", and proceeds by purely local "firing rules" that manipulate tokens.
The further structure of this paper is as follows. In Section 2, we review the syntax of Linear Logic, and present the basic and quite simple intuitions underlying the interpretation. In Section 3, we use these ideas to construct models of Linear Logic. In Section 4, we define the Geometry of Interaction interpretations, and show how they arise from the model constructed previously in a natural fashion. In Section 5, we give a computati...
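The least-fixpoint reading of feedback mentioned in this abstract can be illustrated with a small sketch. This is not the paper's construction: it only shows Kleene iteration over finite stream prefixes, in the spirit of Kahn's semantics. The names `feedback` and `running_sum` are hypothetical, and the network (an adder whose output is fed back through a wire holding one initial token) is an assumed example.

```python
from typing import Callable, Tuple

Stream = Tuple[int, ...]  # a finite prefix of a stream of tokens


def feedback(f: Callable[[Stream, Stream], Stream],
             inp: Stream, max_rounds: int = 1000) -> Stream:
    """Least fixpoint of y = f(inp, y), found by iterating from the empty stream.

    Assumes f is monotone with respect to the prefix order on streams.
    """
    y: Stream = ()
    for _ in range(max_rounds):
        nxt = f(inp, y)
        if nxt == y:          # no new tokens were produced: fixpoint reached
            return y
        y = nxt
    raise RuntimeError("no fixpoint reached within max_rounds")


def running_sum(inp: Stream, fb: Stream) -> Stream:
    """Adder node: output[i] = inp[i] + (previous output), starting from 0."""
    delayed = (0,) + fb                   # feedback wire with one initial token
    n = min(len(inp), len(delayed))       # fire only where both operands exist
    return tuple(inp[i] + delayed[i] for i in range(n))


prefix_sums = feedback(running_sum, (1, 2, 3))
```

Each round extends the output by the tokens that have become computable, so the iteration stops once the whole input prefix has been consumed.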
Superscalar Execution With Dynamic Data Forwarding
In Proceedings of the 1998 ACM/IEEE Conference on Parallel Architectures and Compilation Techniques, 1998
Cited by 24 (1 self)
We empirically demonstrate that, in order to take advantage of increasing issue widths, superscalar processors require quadratically growing instruction window sizes. Since conventional central window design aims to provide full data fan-out to all the instructions in the window, designing large instruction windows using conventional techniques is not feasible. We show that full data fan-out is not necessary for achieving high performance when a novel approach is used to distribute the values. We implement the instruction window by direct matching in a small on-chip memory called the wait memory, and bring a small subset of instructions that are likely to become ready into a match unit, where instruction selection and operand matching are performed. We show that the match unit needs to grow only linearly with the issue width. We use SPEC95 benchmarks to demonstrate that at a given instruction window size our algorithm provides over 90 percent of the IPC that ca...
Parallel Functional Programming for Message-Passing Multiprocessors
1993
Cited by 5 (2 self)
We propose a framework for the evaluation of implicitly parallel functional programs on message passing multiprocessors with special emphasis on the issue of load bounding. The model is based on a new encoding of the λ-calculus in Milner's π-calculus and combines lazy evaluation and eager (parallel) evaluation in the same framework. The π-calculus encoding serves as the specification of a more concrete compilation scheme mapping a simple functional language into a message passing, parallel program. We show how and under which conditions we can guarantee successful load bounding based on this compilation scheme. Finally we discuss the architectural requirements for a machine to support our model efficiently and we present a simple RISC-style processor architecture which meets those criteria.
Acknowledgments
Many people have had profound influence on this thesis and I want to pay tribute to some of them here. To my supervisor, Tony Davie, for his willingness to supervise what start...
A Method to Evaluate the Performance of a Multiprocessor Machine based on Data Flow Principles
Abstract: In this paper we present a method to model a static data flow oriented multiprocessor system. This modelling methodology can be used to examine the machine's behaviour when executing a program under three scheduling strategies, viz., static, dynamic and quasi-dynamic policies. The processing elements (PEs) of the machine go through different states in order to complete the tasks they are allotted. Hence, the time taken by the machine to execute a program is directly dependent on the time spent by the PEs in various states during the execution of tasks. We adopt a "state diagram" approach to model the machine. This modelling scheme can be used for a class of machines which have a similar execution paradigm. By introducing "wait states" in the state diagram of a PE at appropriate places, we capture the delays that are incurred by the PE waiting on events; the events during the execution of a pro-
Compiling Lazy Functional Languages: An introduction
1987
Machine (FAM) [Car83] and the Categorical Abstract Machine (CAM) [CCM85] can be considered as variations of the SECD theme. Wadsworth [Wad71] describes an interpreter for the λ-calculus which performs normal-order graph reduction. In graph reduction, the expression being reduced is represented by a directed graph (in Wadsworth's reducer it is also acyclic). When a reduction rule is applied, be it β-reduction as in this case, or e.g. combinator reduction, the root of the reducible expression is overwritten with the result of the reduction. In Wadsworth's graph reducer, when applying the reduction rule (λv.e) e′ ⇒ e[e′/v], a copy of the graph of the body e is created, with pointers to e′ substituted for free occurrences of v; if v occurs twice or more, e′ thus becomes shared. When reducing a shared sub-graph, all other uses of this sub-graph benefit from the first reduction. Wadsworth coins the term call-by-need for the mechanism whereby an expression is reduced at most o...
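The sharing behaviour described here (a shared argument is reduced at most once, and its node is overwritten with the result) can be sketched as follows. This is only an illustration of the call-by-need idea, not Wadsworth's reducer: the `Node` class and the helper names are hypothetical, and Python closures stand in for graph nodes.

```python
class Node:
    """A graph node holding either an unreduced computation or its result."""

    def __init__(self, compute):
        self._compute = compute   # zero-argument function producing the value
        self._value = None
        self._reduced = False

    def force(self):
        # Reduce at most once; afterwards the node holds only the result.
        if not self._reduced:
            self._value = self._compute()
            self._compute = None  # overwrite: drop the unreduced expression
            self._reduced = True
        return self._value


evaluations = []  # records each time an argument node is actually reduced


def expensive():
    evaluations.append("reduced")
    return 21


def apply_double(arg):
    # Body of (λv. v + v): both occurrences of v point at the same node.
    return Node(lambda: arg.force() + arg.force())


def apply_const(arg):
    # Body of (λv. 0): v does not occur, so arg is never forced.
    return Node(lambda: 0)


result = apply_double(Node(expensive)).force()
ignored = apply_const(Node(expensive)).force()
```

After running this, `evaluations` contains a single entry: the argument used twice was reduced once, and the unused argument was never reduced.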
SINAN: An Argument Forwarding Multithreaded Architecture
The direct execution of data flow graphs by data flow machines exposes the maximum amount of parallelism in a computation. However, it also results in undesirable properties such as: high overhead due to data replication required to implement high fanouts in the data flow graph; poor instruction and data locality since the scheduling of ready instructions is not sensitive to data locality; and less efficient execution of sequentially dependent code sections and vector operations due to the inability of the data flow machines to support deep pipelines. SINAN offers a novel approach called argument forwarding that can eliminate much of the data fanout overhead and provides greatly improved locality. We show that vector operations and sequential segments can be efficiently handled with the dataflow paradigm by a novel approach that dynamically forms the activity templates.
1 Introduction
Since the introduction of the data flow concept in the late 1960s by Adams [1], the concept has evolved...
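To make the fanout overhead concrete, here is a minimal sketch of direct data flow graph execution, in which a node fires once all of its operand tokens have arrived and its result is copied once per consumer. It does not model SINAN's argument forwarding; the function `run_dataflow` and the graph encoding are assumptions made for illustration.

```python
from collections import deque
import operator


def run_dataflow(nodes, edges, initial_tokens):
    """Execute a data flow graph by token passing.

    nodes: {name: (function, arity)}
    edges: {name: [(dest_name, dest_port), ...]}
    initial_tokens: [(dest_name, dest_port, value), ...]
    Returns ({sink_name: value}, number_of_result_tokens_sent).
    """
    waiting = {name: {} for name in nodes}   # operands received so far
    queue = deque(initial_tokens)
    results = {}
    tokens_sent = 0
    while queue:
        dest, port, value = queue.popleft()
        waiting[dest][port] = value
        fn, arity = nodes[dest]
        if len(waiting[dest]) == arity:      # firing rule: all operands present
            args = [waiting[dest][p] for p in range(arity)]
            waiting[dest] = {}
            out = fn(*args)
            targets = edges.get(dest, [])
            if not targets:                  # no consumers: treat as a sink
                results[dest] = out
            for t_name, t_port in targets:   # one copy of the result per consumer
                queue.append((t_name, t_port, out))
                tokens_sent += 1
    return results, tokens_sent


# (a + b) * (a + b): the sum has fanout 2, so its value is replicated twice.
nodes = {"add": (operator.add, 2), "mul": (operator.mul, 2)}
edges = {"add": [("mul", 0), ("mul", 1)]}
results, tokens_sent = run_dataflow(nodes, edges, [("add", 0, 3), ("add", 1, 4)])
```

In this toy graph the single addition result is sent as two separate tokens, which is the kind of replication the abstract attributes to high fanouts.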
Execution Performance of the Scheduled Dataflow Architecture
This paper presents an evaluation of a non-blocking, decoupled memory/execution, multithreaded architecture known as the Scheduled Dataflow (SDF). Recent focus in the field of new processor architectures is mainly on VLIW (e.g. IA-64), superscalar and superspeculative designs. This trend allows for better performance at the expense of increased hardware complexity, and possibly higher power expenditures resulting from dynamic instruction scheduling. Our research deviates from this trend by exploring a simpler, yet powerful execution paradigm that is based on dataflow and multithreading. A program is partitioned into non-blocking execution threads. In addition, all memory accesses are decoupled from the thread's execution. Data is pre-loaded into the thread's context (registers), and all results are post-stored after the completion of the thread's execution. While multithreading and decoupling are possible with control-flow architectures, we believe that the non-blocking and functional nature of SDF makes it easier to coordinate the memory accesses and execution of a thread, as well as to eliminate unnecessary dependencies among instructions. In this paper we compare the execution cycles required for programs on SDF with the execution cycles required by programs on SimpleScalar (a superscalar simulator).
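The decoupling described in this abstract (pre-load, then execute, then post-store) can be pictured with a small sketch. This is an assumed software analogy rather than the SDF hardware or its instruction set: `run_decoupled_thread`, the dictionary used as memory, and the address lists are all hypothetical.

```python
def run_decoupled_thread(memory, load_addrs, body, store_addrs):
    """Run one non-blocking thread in three phases.

    memory: dict mapping addresses to values (stands in for main memory)
    load_addrs: addresses to pre-load into the thread's registers
    body: function from a list of register values to a list of results;
          it never touches memory, so it cannot block on a memory access
    store_addrs: addresses that receive the results, in order
    """
    # Phase 1: pre-load all inputs into the thread's context.
    registers = [memory[addr] for addr in load_addrs]
    # Phase 2: execute using registers only.
    outputs = body(registers)
    # Phase 3: post-store all results after execution has completed.
    for addr, value in zip(store_addrs, outputs):
        memory[addr] = value
    return memory


memory = {0: 2, 1: 5, 2: 0}
run_decoupled_thread(memory, [0, 1], lambda regs: [regs[0] * regs[1]], [2])
```

Because the body sees only its registers, all memory traffic is confined to the first and last phases, which is the property the abstract relies on.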

