Results 1–10 of 10
ON THRESHOLD CIRCUITS AND POLYNOMIAL COMPUTATION
Abstract

Cited by 52 (1 self)
A Threshold Circuit consists of an acyclic digraph of unbounded fan-in, where each node computes a threshold function or its negation. This paper investigates the computational power of Threshold Circuits. A surprising relationship is uncovered between Threshold Circuits and another class of unbounded fan-in circuits, denoted Finite Field Z_P(n) Circuits, where each node computes either multiple sums or products of integers modulo a prime P(n). In particular, it is proved that all functions computed by Threshold Circuits of size S(n) ≥ n and depth D(n) can also be computed by Z_P(n) Circuits of size O(S(n) log S(n) + nP(n) log P(n)) and depth O(D(n)). Furthermore, it is shown that all functions computed by Z_P(n) Circuits of size S(n) and depth D(n) can be computed by Threshold Circuits of size O((1/ε²)(S(n) log P(n))^(1+ε)) and depth O((1/ε⁵)D(n)), for any constant ε > 0. These are the main results of this paper. There are many useful and quite surprising consequences of this result. For example, integer reciprocal can be computed in size n^O(1) and depth O(1). More generally, any analytic function with a convergent rational polynomial power series (such as sine, cosine, exponentiation, square root, and logarithm) can be computed within accuracy 2^(−n^c), for any constant c, by Threshold Circuits of ...
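The gate model in this abstract is simple to state concretely. The following Python sketch is an editorial illustration, not code from the paper: a single threshold node over Boolean inputs, plus the classic MAJORITY function such a node can compute (weights and thresholds here are arbitrary examples).

```python
# Sketch of a single threshold gate: it fires iff the weighted sum of its
# (unbounded fan-in) Boolean inputs meets the threshold.
def threshold_gate(weights, threshold, inputs):
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

# MAJORITY on 5 inputs: all weights 1, threshold 3 (more than half must be 1).
maj = lambda bits: threshold_gate([1] * 5, 3, bits)
```

A threshold circuit is then an acyclic wiring of such gates (and their negations); the abstract's results compare the size/depth of these circuits with those of mod-P(n) arithmetic circuits.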
Efficient Solution Of Parabolic Equations By Krylov Approximation Methods
 SIAM J. Sci. Statist. Comput
, 1992
Abstract

Cited by 49 (3 self)
In this paper we take a new look at numerical techniques for solving parabolic equations by the method of lines. The main motivation for the proposed approach is the possibility of exploiting a high degree of parallelism in a simple manner. The basic idea of the method is to approximate the action of the evolution operator on a given state vector by means of a projection process onto a Krylov subspace. Thus, the resulting approximation consists of applying an evolution operator of very small dimension to a known vector which is, in turn, computed accurately by exploiting high-order rational Chebyshev and Padé approximations to the exponential. Because the rational approximation is only applied to a small matrix, the only operations required with the original large matrix are matrix-by-vector multiplications, and as a result the algorithm can easily be parallelized and vectorized. Further parallelism is introduced by expanding the rational approximations into partial fractions. Some ...
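The core idea described here, approximating the action of exp(tA) on a vector via a Krylov projection, can be sketched in a few lines. This is a minimal editorial illustration using a plain Arnoldi process and a dense small-matrix exponential; the paper's method additionally uses rational Chebyshev/Padé approximations and partial-fraction expansions, which are not shown.

```python
import numpy as np
from scipy.linalg import expm

def krylov_expm_action(A, v, t, m=20):
    """Approximate exp(t*A) @ v by projecting onto an m-dimensional
    Krylov subspace built with the Arnoldi process."""
    n = A.shape[0]
    m = min(m, n)
    beta = np.linalg.norm(v)
    V = np.zeros((n, m + 1))          # orthonormal Krylov basis
    H = np.zeros((m + 1, m))          # upper Hessenberg projection of A
    V[:, 0] = v / beta
    for j in range(m):
        w = A @ V[:, j]               # the only use of the large matrix
        for i in range(j + 1):        # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:       # happy breakdown: subspace is invariant
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m)
    e1[0] = 1.0
    # The exponential is taken only of the small m x m matrix H.
    return beta * (V[:, :m] @ (expm(t * H[:m, :m]) @ e1))
```

Only matrix-vector products with the large A appear, which is precisely the property the abstract exploits for parallelism.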
Variations by complexity theorists on three themes of
 Computational Complexity
, 2005
Abstract

Cited by 12 (4 self)
This paper surveys some connections between geometry and complexity. A main role is played by some quantities — degree, Euler characteristic, Betti numbers — associated to algebraic or semialgebraic sets. This role is twofold. On the one hand, lower bounds on the deterministic time (sequential and parallel) necessary to decide a set S are established as functions of these quantities associated to S. The optimality of some algorithms is obtained as a consequence. On the other hand, the computation of these quantities gives rise to problems which turn out to be hard (or complete) in different complexity classes. These two kinds of results thus turn the quantities above into measures of complexity in two quite different ways.
Maximally and Arbitrarily Fast Implementation of Linear and Feedback Linear Computations
, 2000
Abstract

Cited by 6 (2 self)
By establishing a relationship between the basic properties of linear computations and eight optimizing transformations (distributivity, associativity, commutativity, inverse and zero element laws, common subexpression replication and elimination, and constant propagation), a computer-aided design platform is developed to optimally speed up an arbitrary instance from this large class of computations with respect to those transformations. Furthermore, arbitrarily fast implementation of an arbitrary linear computation is obtained by adding loop unrolling to the transformation set. During this process, a novel Horner pipelining scheme is used so that the area-time (AT) product is maintained constant, regardless of the achieved speedup. We also present a generalization of the new approach so that an important subclass of nonlinear computations, named feedback linear computations, is efficiently, maximally, and arbitrarily sped up.
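For reference, the classic Horner rule that the pipelining scheme builds on evaluates a degree-n polynomial with n multiply-add steps. A minimal sketch (the paper's pipelined hardware variant is not reproduced here):

```python
def horner(coeffs, x):
    """Evaluate a polynomial by Horner's rule. For coefficients
    [c_n, ..., c_1, c_0] (highest degree first) this computes
    ((c_n * x + c_{n-1}) * x + ...) * x + c_0, one multiply-add per step."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc
```

For example, `horner([2, 0, -1, 3], 2)` evaluates 2x³ − x + 3 at x = 2.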
Divide-and-Conquer Techniques for Global Throughput Optimization
 Proc. IEEE VLSI Signal Processing Workshop
, 1996
Abstract

Cited by 3 (2 self)
This paper proposes a divide-and-conquer approach for global throughput optimization which not only leverages existing techniques, but enables their more effective and coordinated use. The "divide" step consists of logically partitioning the computation into subparts, each falling into one of a set of pre-classified computation types. The subparts are then "conquered" through coordinated application of existing optimization techniques. We have characterized a set of techniques in terms of their expected effect on throughput, and can thus select the most promising techniques for each unique situation. The technique is not limited to a specific class of computations and gives higher, or at worst equal, improvement than previously reported techniques on all examples.

1.0 Introduction

Throughput optimization techniques remain important for meeting the sampling-rate requirements of modern DSP and communication applications. Though clock rates for ASICs and general-purpose computing devi...
Behavioral-Level Guidance Using Property-Based Design Characterization, by Lisa Marie Guerra
, 1996
Abstract

Cited by 2 (0 self)
Behavioral-Level Guidance Using Property-Based Design. Lisa Marie Guerra. Doctor of Philosophy in Engineering (Electrical Engineering and Computer Sciences), University of California at Berkeley. Professor Jan M. Rabaey, Chair.

The growing importance of optimization, short time-to-market windows, and exponentially growing design complexity are just a few of the factors shaping the state-of-the-art synthesis process. In particular, optimization at the early stages of design is crucial: at the system and behavioral levels, orders-of-magnitude improvement in key design metrics such as throughput, power, and area can be attained. This requires, however, strategic and coordinated application of the design techniques best suited for a target design. The problem is that the number of options currently available is overwhelming, and as a result design exploration is often conducted in a qualitative, ad-hoc manner.
On the error analysis and implementation of some eigenvalue decomposition and singular value decomposition algorithms
, 1996
Abstract

Cited by 2 (0 self)
Many algorithms exist for computing the symmetric eigendecomposition, the singular value decomposition and the generalized singular value decomposition. In this thesis, we present several new algorithms and improvements on old algorithms, analyzing them with respect to their speed, accuracy, and storage requirements. We first discuss the variations on the bisection algorithm for finding eigenvalues of symmetric tridiagonal matrices. We show the challenges in implementing a correct algorithm with floating point arithmetic. We show how reasonable-looking but incorrect implementations can fail. We carefully define correctness, and present several implementations that we rigorously prove correct. We then discuss a fast implementation of bisection using parallel prefix. We show many numerical examples of the instability of this algorithm, and then discuss its forward error and backward error analysis. We also discuss possible ways to stabilize it by using iterative refinement. Finally, we discuss how to use a divide-and-conquer algorithm to compute the singular value decomposition and solve the linear least squares problem, and how to implement ...
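The bisection algorithm discussed here rests on the Sturm-sequence count: the number of negative pivots in the LDLᵀ factorization of T − xI equals the number of eigenvalues of T below x. A simplified sketch for a symmetric tridiagonal matrix with diagonal `d` and off-diagonal `e` follows; note the thesis's point is exactly that a provably correct floating-point treatment needs far more care than this naive guard against zero pivots.

```python
def sturm_count(d, e, x):
    """Number of eigenvalues strictly less than x for the symmetric
    tridiagonal matrix with diagonal d and off-diagonal e, counted via
    the signs of the recursively computed pivots of T - x*I."""
    count, q = 0, 1.0
    for i in range(len(d)):
        q = d[i] - x - (e[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300  # naive guard; the thesis treats this rigorously
        if q < 0:
            count += 1
    return count

def bisect_eigenvalue(d, e, k, lo, hi, tol=1e-12):
    """k-th smallest eigenvalue (0-based) inside [lo, hi] by bisection
    on the monotone count function above."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sturm_count(d, e, mid) > k:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2
```

For the 2x2 matrix [[2, 1], [1, 2]] this locates the eigenvalues 1 and 3.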
On The Parallel Solution Of Parabolic Equations
 In Proc. ACM SIGARCH89
, 1989
Abstract
We propose new parallel algorithms for the solution of linear parabolic problems. The first of these methods is based on using polynomial approximation to the exponential. It does not require solving any linear systems and is highly parallelizable. The two other methods proposed are based on Padé and Chebyshev approximations to the matrix exponential. The parallelization of these methods is achieved by using partial fraction decomposition techniques to solve the resulting systems and thus offers the potential for increased time parallelism in time-dependent problems. We also present experimental results from the Alliant FX/8 and the Cray Y-MP/832 vector multiprocessors.

1. Introduction. We consider the following linear parabolic partial differential equation:

∂u(x, t)/∂t = Lu(x, t) + s(x),  x ∈ Ω    (1.1)
u(0, x) = u₀,  ∀x ∈ Ω
u(t, x) = σ(x),  x ∈ ∂Ω, t ≥ 0

where L is a second-order partial differential operator of elliptic type, acting on functions defined on the o...
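The first method mentioned, a polynomial approximation to the exponential requiring only matrix-vector products and no linear solves, can be sketched as a truncated Taylor series. This toy version (an editorial illustration, not the paper's scheme) omits the scaling, error control, and the Padé/Chebyshev variants the paper develops.

```python
import numpy as np

def taylor_expm_action(A, v, t, k=25):
    """Approximate exp(t*A) @ v with a degree-k truncated Taylor series.
    Each iteration adds the term t^j A^j v / j!, so only matrix-vector
    products with A are required; no linear systems are solved."""
    term = np.asarray(v, dtype=float).copy()
    result = term.copy()
    for j in range(1, k + 1):
        term = (t / j) * (A @ term)
        result = result + term
    return result
```

In a method-of-lines setting, A would be the discretization of the elliptic operator L and v the current state vector.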
, 2008
Abstract
A linear, first-order recurrence is a problem of the form

x(j) = a(j)x(j − 1) + y(j),  x(1) = y(1) (x(0) = 0),  j = 1, 2, ..., P.

Linear recurrences of this form arise in, for instance, the solution of bidiagonal systems of equations, as shown below:

[   1              ] [x(1)]   [y(1)]
[ -a(2)   1        ] [x(2)]   [y(2)]
[       ...  ...   ] [ ...] = [ ...]
[      -a(P)   1   ] [x(P)]   [y(P)]
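The recurrence looks inherently sequential, but each step is an affine map t ↦ a(j)t + y(j), and affine maps compose associatively, which is what makes a parallel prefix (scan) formulation possible. A small sketch (editorial illustration, not from the paper):

```python
from functools import reduce

def solve_recurrence(a, y):
    """Sequential baseline for x[j] = a[j]*x[j-1] + y[j], x[0] = y[0]."""
    x = [y[0]]
    for j in range(1, len(y)):
        x.append(a[j] * x[-1] + y[j])
    return x

def compose(f, g):
    """Compose the affine maps f = (mf, bf) and g = (mg, bg), applying f
    first: t -> mg*(mf*t + bf) + bg. Associativity of this operation is
    the key to solving the recurrence with a parallel scan."""
    (mf, bf), (mg, bg) = f, g
    return (mg * mf, mg * bf + bg)
```

Folding `compose` over the maps (0, y[0]), (a[1], y[1]), ... yields a pair whose offset component is exactly x[j]; a real implementation would perform this fold as a logarithmic-depth tree.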
A Parallel Algorithm for Power Matrix Computation
Abstract
We present a parallel algorithm for computing the matrix power A^n in O(log^2 n) time using O(n^2.807 / log n) processors. It is shown that the growth rate of the proposed algorithm is the same as the parallel arithmetic complexity of matrix computations, including matrix inversion and solving systems of linear equations.

Keywords: matrix computations, parallel algorithms, computational complexity
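As a sequential baseline for the problem the abstract solves in parallel, repeated squaring already reduces the number of matrix multiplications for A^n to O(log n); the paper's contribution concerns parallelizing the multiplications themselves, not this outer loop. A sketch:

```python
import numpy as np

def matrix_power(A, n):
    """Compute A^n by binary (repeated-squaring) exponentiation:
    O(log n) matrix multiplications, each of which is independently
    parallelizable (e.g. with a Strassen-like O(n^2.807) scheme)."""
    result = np.eye(A.shape[0], dtype=A.dtype)
    base = A.copy()
    while n > 0:
        if n & 1:            # include this power of two in the product
            result = result @ base
        base = base @ base   # square for the next bit of n
        n >>= 1
    return result
```

For example, powering the shear matrix [[1, 1], [0, 1]] to the 5th gives [[1, 5], [0, 1]].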