Iterative modulo scheduling: An algorithm for software pipelining loops
 In Proceedings of the 27th Annual International Symposium on Microarchitecture
, 1994
Modulo scheduling is a framework within which a wide variety of algorithms and heuristics may be defined for software pipelining innermost loops. This paper presents a practical algorithm, iterative modulo scheduling, that is capable of dealing with realistic machine models. This paper also characterizes the algorithm in terms of the quality of the generated schedules as well as the computational expense incurred.
Principles and Methods of Testing Finite State Machines: A Survey
 Proceedings of the IEEE
, 1996
With advanced computer technology, systems are getting larger to fulfill more complicated tasks; however, they are also becoming less reliable. Consequently, testing is an indispensable part of system design and implementation, yet it has proved to be a formidable task for complex systems. This motivates the study of testing finite state machines to ensure the correct functioning of systems and to discover aspects of their behavior. A finite state machine contains a finite number of states and produces outputs on state transitions after receiving inputs. Finite state machines are widely used to model systems in diverse areas, including sequential circuits, certain types of programs, and, more recently, communication protocols. In a testing problem we have a machine about which we lack some information; we would like to deduce this information by providing a sequence of inputs to the machine and observing the outputs produced. Because of its practical importance and theoretical interest, the problem of testing finite state machines has been studied in different areas and at various times. The earliest published literature on this topic dates back to the 1950s. Activities in the 1960s and early 1970s were motivated mainly by automata theory and sequential circuit testing. The area seemed to have mostly died down until a few years ago, when the testing problem was resurrected; it is now being studied anew due to its applications to conformance testing of communication protocols. While some old problems that had been open for decades were resolved recently, new concepts and more intriguing problems from new applications continue to emerge. We review the fundamental problems in testing finite state machines and techniques for solving these problems, tracing progress in the area from its inception to the present state of the art. In addition, we discuss extensions of finite state machines and some other topics related to testing.
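The testing setup described in this abstract (apply an input sequence, observe the outputs) can be illustrated with a minimal sketch. The machine `SPEC`, the faulty variant `FAULTY`, and the helper names below are hypothetical and do not come from the survey; the sketch only shows why the choice of input sequence matters.

```python
# Hypothetical Mealy machine: (state, input) -> (next_state, output).
SPEC = {
    ("s0", "a"): ("s1", 0),
    ("s0", "b"): ("s0", 1),
    ("s1", "a"): ("s0", 1),
    ("s1", "b"): ("s1", 0),
}

# Hypothetical faulty implementation: one transition emits the wrong output.
FAULTY = dict(SPEC)
FAULTY[("s1", "a")] = ("s0", 0)


def run(machine, start, inputs):
    """Feed the inputs to the machine and return the observed outputs."""
    state, outputs = start, []
    for symbol in inputs:
        state, out = machine[(state, symbol)]
        outputs.append(out)
    return outputs


def conforms(spec, impl, start, inputs):
    """True if impl produces the same outputs as spec on this input sequence."""
    return run(spec, start, inputs) == run(impl, start, inputs)


if __name__ == "__main__":
    print(conforms(SPEC, FAULTY, "s0", "bb"))  # this sequence never exercises the fault
    print(conforms(SPEC, FAULTY, "s0", "aa"))  # this sequence reaches the faulty transition
```

In this toy case the sequence "bb" cannot distinguish the two machines, while "aa" can; finding sequences that expose every such fault is the kind of problem the survey addresses.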
Guillotine subdivisions approximate polygonal subdivisions: Part II - A simple polynomial-time approximation scheme for geometric k-MST, TSP, and related problems
, 1996
... this paper, thereby achieving essentially the same results that we report here, using decomposition schemes that are somewhat similar to our own. Arora's remarkable results predate this paper by several weeks, and his discovery was made independently of this work.
Generalized Arc Consistency for Global Cardinality Constraint
A global cardinality constraint (gcc) is specified in terms of a set of variables $X = \{x_1, \ldots, x_p\}$ which take their values in a subset of $V = \{v_1, \ldots, v_d\}$. It constrains the number of times a value $v_i \in V$ is assigned to a variable in $X$ to be in an interval $[l_i, c_i]$. Cardinality constraints have proved very useful in many real-life problems, such as scheduling, timetabling, or resource allocation. A gcc is more general than a constraint of difference, which requires each interval to be $[0, 1]$. In this paper, we present an efficient way of implementing generalized arc consistency for a gcc. The algorithm we propose is based on a new theorem of flow theory. Its space complexity is $O(|X| \times |V|)$ and its time complexity is $O(|X|^2 \times |V|)$. We also show how this algorithm can efficiently be combined with other filtering techniques.
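To make the definitions concrete, here is a brute-force sketch of what generalized arc consistency means for a gcc: a value stays in a variable's domain only if some assignment satisfying all the cardinality bounds uses it. This is not the flow-based algorithm of the paper; it enumerates every assignment and is exponential, so it is only suitable for tiny illustrations. The function names and the example instance are assumptions made for this sketch.

```python
from itertools import product


def satisfies_gcc(assignment, bounds):
    """True if each value v occurs between bounds[v][0] and bounds[v][1] times."""
    return all(lo <= assignment.count(v) <= hi for v, (lo, hi) in bounds.items())


def brute_force_gac(domains, bounds):
    """Return, for each variable, the set of values that have a support.

    domains: list of value lists, one per variable.
    bounds:  dict mapping value -> (lower, upper) occurrence bounds.
    NOTE: exhaustive enumeration, illustrative only (not the paper's method).
    """
    supported = [set() for _ in domains]
    for assignment in product(*domains):
        if satisfies_gcc(assignment, bounds):
            for i, value in enumerate(assignment):
                supported[i].add(value)
    return supported


if __name__ == "__main__":
    # Every interval is [0, 1], so this gcc behaves like a constraint of difference.
    domains = [["a"], ["a", "b"], ["a", "b", "c"]]
    bounds = {"a": (0, 1), "b": (0, 1), "c": (0, 1)}
    print(brute_force_gac(domains, bounds))
```

In this instance the first variable must take "a", which forces the second to "b" and the third to "c", so filtering reduces every domain to a single value.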
Performance Analysis and Optimization of Asynchronous Circuits
, 1991
We present a method for analyzing the time performance of asynchronous circuits, in particular, those derived by program transformation from concurrent programs using the synthesis approach developed by the second author. The analysis method produces a performance metric (related to the time needed to perform an operation) in terms of the primitive gate delays of the circuit. Such a metric provides a quantitative means by which to compare competing designs. Because the gate delays are functions of transistor sizes, the performance metric can be optimized with respect to these sizes. For a large class of asynchronous circuits (including those produced by using our synthesis method), these techniques produce the global optimum of the performance metric. A CAD tool has been implemented to perform this optimization. Performance analysis of a synchronous computer system is simplified by an external clock that partitions the events in the system into discrete segments. In a...
Meaningful Change Detection in Structured Data
 In Proceedings of the ACM SIGMOD International Conference on Management of Data
, 1997
Detecting changes by comparing data snapshots is an important requirement for difference queries, active databases, and version and configuration management. In this paper we focus on detecting meaningful changes in hierarchically structured data, such as nested-object data. This problem is much more challenging than the corresponding one for relational or flat-file data. In order to describe changes better, we base our work not just on the traditional "atomic" insert, delete, update operations, but also on operations that move an entire subtree of nodes, and that copy an entire subtree. These operations allow us to describe changes in a semantically more meaningful way. Since this change detection problem is NP-hard, in this paper we present a heuristic change detection algorithm that yields close to "minimal" descriptions of the changes, and that has fewer restrictions than previous algorithms. Our algorithm is based on transforming the change detection problem to a problem of com...
A Faster Strongly Polynomial Minimum Cost Flow Algorithm
, 1991
In this paper, we present a new strongly polynomial time algorithm for the minimum cost flow problem, based on a refinement of the Edmonds-Karp scaling technique. Our algorithm solves the uncapacitated minimum cost flow problem as a sequence of O(n log n) shortest path problems on networks with n nodes and m arcs and runs in O(n log n (m + n log n)) time. Using a standard transformation, this approach yields an O(m log n (m + n log n)) algorithm for the capacitated minimum cost flow problem. This algorithm improves the best previous strongly polynomial time algorithm, due to Z. Galil and E. Tardos, by a factor of n^2/m. Our algorithm for the capacitated minimum cost flow problem is even more efficient if the number of arcs with finite upper bounds, say m', is much less than m. In this case, the running time of the algorithm is O((m' + n) log n (m + n log n)).
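The bounds quoted in this abstract fit together as follows. The previous bound shown in the second line is not quoted in the abstract; it is inferred here by multiplying the capacitated bound by the stated improvement factor, so treat it as a derived assumption.

```latex
\begin{align*}
\text{uncapacitated:}\quad
  & \underbrace{O(n \log n)}_{\text{shortest path problems}}
    \times \underbrace{O(m + n \log n)}_{\text{time per problem}}
    = O\bigl(n \log n \,(m + n \log n)\bigr) \\
\text{previous bound (inferred):}\quad
  & O\bigl(m \log n \,(m + n \log n)\bigr) \times \frac{n^2}{m}
    = O\bigl(n^2 \log n \,(m + n \log n)\bigr) \\
\text{few capacitated arcs:}\quad
  & O\bigl((m' + n) \log n \,(m + n \log n)\bigr)
\end{align*}
```

Reading the first line this way assumes each shortest path problem costs O(m + n log n), which is what the stated product implies. The third bound is never worse than the general capacitated bound up to constant factors, because m' is at most m and, on a connected network, n is at most m + 1.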
Rotation scheduling: A loop pipelining algorithm
 Dept. of Computer Science, Princeton University
, 1997
We consider the resource-constrained scheduling of loops with inter-iteration dependencies. A loop is modeled as a data flow graph (DFG), where edges are labeled with the number of iterations between dependencies. We design a novel and flexible technique, called rotation scheduling, for scheduling cyclic DFGs using loop pipelining. The rotation technique repeatedly transforms a schedule to a more compact schedule. We provide a theoretical basis for the operations based on retiming. We propose two heuristics to perform rotation scheduling and give experimental results showing that they have very good performance. Index Terms: High-level synthesis, loop pipelining, parallel compiler, retiming, scheduling.
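The abstract grounds rotation in retiming of a DFG whose edges carry iteration distances (delays). A minimal sketch of one retiming step on such a graph follows. It assumes the usual formulation in which retiming a node moves one delay from each of its incoming edges to each of its outgoing edges; the dictionary representation and the name `rotate_node` are hypothetical, and the rescheduling that a full rotation performs afterwards is omitted.

```python
def rotate_node(edges, node):
    """Move one delay from every incoming edge of `node` to every outgoing edge.

    edges maps (src, dst) -> number of delays (iterations between dependencies).
    Returns a new dict; raises ValueError if an incoming edge has no delay to give.
    """
    new_edges = dict(edges)
    for (src, dst), delays in edges.items():
        if src == node and dst == node:
            continue  # a self-loop loses one delay and gains one: unchanged
        if dst == node:
            if delays == 0:
                raise ValueError(f"edge {src}->{dst} has no delay to move")
            new_edges[(src, dst)] = delays - 1
        elif src == node:
            new_edges[(src, dst)] = delays + 1
    return new_edges


if __name__ == "__main__":
    # Hypothetical three-node cycle with a single delay on the back edge C -> A.
    dfg = {("A", "B"): 0, ("B", "C"): 0, ("C", "A"): 1}
    rotated = rotate_node(dfg, "A")
    print(rotated)
    # The total delay around the cycle is unchanged by the retiming step.
    print(sum(dfg.values()), sum(rotated.values()))
```

After the step, node A no longer depends on C within the same iteration, which is what gives a scheduler the freedom to place it elsewhere.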
Iterative Modulo Scheduling
, 1995
Modulo scheduling is a framework within which algorithms for the software pipelining of innermost loops may be defined. The framework specifies a set of constraints that must be met in order to achieve a legal modulo schedule. A wide variety of algorithms and heuristics can be defined within this framework. Little work has been done to evaluate and compare alternative algorithms and heuristics for modulo scheduling from the viewpoints of schedule quality as well as computational complexity. This, along with a vague and unfounded perception that modulo scheduling is computationally expensive as well as difficult to implement, has inhibited its incorporation into product compilers. This report presents iterative modulo scheduling, a practical algorithm that is capable of dealing with realistic machine models. The report also characterizes the algorithm in terms of the quality of the generated schedules as well as the computational expense incurred.
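The constraints that define a legal modulo schedule can be sketched as a small checker. This is not the iterative scheduling algorithm of the report; it only verifies a given schedule against the two standard kinds of constraint: dependence constraints that account for the initiation interval (II), and resource constraints checked modulo II. The simplified machine model (each operation occupies one unit of one resource for one cycle) and all names below are assumptions made for this sketch.

```python
from collections import Counter


def is_legal_modulo_schedule(times, deps, resource_of, capacity, ii):
    """Check a candidate modulo schedule.

    times:       op -> issue cycle within one iteration
    deps:        list of (src, dst, latency, distance); distance is the number
                 of iterations separating the dependent operations
    resource_of: op -> resource name (assumed: one unit, for one cycle)
    capacity:    resource name -> units available per cycle
    ii:          initiation interval (cycles between successive iterations)
    """
    # Dependence constraint: dst of iteration k + distance must not start
    # before src of iteration k has produced its result.
    for src, dst, latency, distance in deps:
        if times[dst] + ii * distance < times[src] + latency:
            return False
    # Resource constraint: operations whose issue cycles are equal modulo II
    # execute simultaneously in the steady state.
    usage = Counter((resource_of[op], t % ii) for op, t in times.items())
    return all(count <= capacity[res] for (res, _slot), count in usage.items())


if __name__ == "__main__":
    # Hypothetical loop body: load -> mul -> add -> store, with add feeding itself
    # across iterations.
    times = {"load": 0, "mul": 2, "add": 5, "store": 7}
    deps = [("load", "mul", 2, 0), ("mul", "add", 2, 0),
            ("add", "store", 1, 0), ("add", "add", 1, 1)]
    resource_of = {"load": "mem", "store": "mem", "mul": "alu", "add": "alu"}
    capacity = {"mem": 1, "alu": 1}
    print(is_legal_modulo_schedule(times, deps, resource_of, capacity, ii=2))
    print(is_legal_modulo_schedule(times, deps, resource_of, capacity, ii=1))
```

With II = 2 the two memory operations fall in different slots of the modulo reservation table and likewise the two ALU operations, so the schedule is legal. With II = 1 every operation shares the single slot and the one-unit resources are oversubscribed.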