Results 1 - 10 of 15
Weak Ordering - A New Definition
1990
Abstract
Cited by 226 (13 self)
A memory model for a shared-memory multiprocessor commonly and often implicitly assumed by programmers is that of sequential consistency. This model guarantees that all memory accesses will appear to execute atomically and in program order. An alternative model, weak ordering, offers greater performance potential. Weak ordering was first defined by Dubois, Scheurich and Briggs in terms of a set of rules for hardware that have to be made visible to software. The central hypothesis of this work is that programmers prefer to reason about sequentially consistent memory, rather than having to think about weaker memory, or even write buffers. Following this hypothesis, we redefine weak ordering as a contract between software and hardware. By this contract, software agrees to some formally specified constraints, and hardware agrees to appear sequentially consistent to at least the software that obeys those constraints. We illustrate the power of the new definition with a set of software constraints that forbid data races and an implementation for cache-coherent systems that is not allowed by the old definition.
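The software/hardware contract described in this abstract survives in today's language-level memory models: if the program synchronizes every conflicting access, the system owes it sequentially consistent behavior. A minimal Python illustration, assuming only the standard `threading` module (the lock plays the role of the formally specified constraint):

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    """Increment the shared counter n times, holding the lock for each
    update so no two updates can race."""
    global counter
    for _ in range(n):
        with lock:          # the software-side "constraint": no data races
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Because every conflicting access is synchronized, the outcome matches
# some sequential interleaving: exactly 4 * 10_000 increments.
print(counter)  # -> 40000
```

Because the program is data-race-free, the runtime is free to reorder and buffer underneath while still presenting the sequentially consistent result.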
A Unified Formalization of Four Shared-Memory Models
IEEE Transactions on Parallel and Distributed Systems, 1993
Abstract
Cited by 110 (9 self)
This paper presents a shared-memory model, data-race-free-1, that unifies four earlier models: weak ordering, release consistency (with sequentially consistent special operations), the VAX memory model, and data-race-free-0. The most intuitive and commonly assumed shared-memory model, sequential consistency, limits performance. The models of weak ordering, release consistency, the VAX, and data-race-free-0 are based on the common intuition that if programs synchronize explicitly and correctly, then sequential consistency can be guaranteed with high performance. However, each model formalizes this intuition differently and has different advantages and disadvantages with respect to the other models. Data-race-free-1 unifies the models of weak ordering, release consistency, the VAX, and data-race-free-0 by formalizing the above intuition in a manner that retains the advantages of each of the four models. A multiprocessor is data-race-free-1 if it guarantees sequential consistency to data...
Successive Overrelaxation for Support Vector Machines
IEEE Transactions on Neural Networks, 1998
Abstract
Cited by 66 (14 self)
Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs [11, 12, 9] is used to train a support vector machine (SVM) [20, 3] for discriminating between the elements of two massive datasets, each with millions of points. Because SOR handles one point at a time, similar to Platt's sequential minimal optimization (SMO) algorithm [18] which handles two constraints at a time, it can process very large datasets that need not reside in memory. The algorithm converges linearly to a solution. Encouraging numerical results are presented on datasets with up to 10 million points. Such massive discrimination problems cannot be processed by conventional linear or quadratic programming methods, and to our knowledge have not been solved by other methods.

1 Introduction
Successive overrelaxation, originally developed for the solution of large systems of linear equations [16, 15] has been successfully applied to mathematical programming problems [4, 11, 12, 1...
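The one-point-at-a-time update described in this abstract can be sketched on a toy problem. This is a minimal projected-SOR sweep on the SVM dual; the dataset, the parameter values, and the bias-folding detail are illustrative assumptions, not the paper's setup or scale:

```python
import numpy as np

# Toy 2-D, linearly separable data (illustrative; the paper targets
# millions of points).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

C, omega, sweeps = 10.0, 1.3, 200   # hypothetical parameter choices

# Fold the bias into the kernel (K(x, z) = x.z + 1), so the dual becomes a
# box-constrained quadratic in alpha alone.
Q = (y[:, None] * y[None, :]) * (X @ X.T + 1.0)

alpha = np.zeros(len(y))
for _ in range(sweeps):
    for i in range(len(alpha)):          # one point at a time, SOR-style
        grad_i = Q[i] @ alpha - 1.0      # gradient of the dual objective
        alpha[i] = np.clip(alpha[i] - omega * grad_i / Q[i, i], 0.0, C)

def predict(x):
    """Sign of the separating surface f(x) = sum_j alpha_j y_j (x.x_j + 1)."""
    return np.sign(((X @ x + 1.0) * y * alpha).sum())
```

Each inner step touches a single point and a single row of Q, which is what lets this style of method stream through datasets too large to hold in memory.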
Designing Memory Consistency Models for Shared-Memory Multiprocessors
1993
Abstract
Cited by 57 (9 self)
The memory consistency model (or memory model) of a shared-memory multiprocessor system influences both the performance and the programmability of the system. The simplest and most intuitive model for programmers, sequential consistency, restricts the use of many performance-enhancing optimizations exploited by uniprocessors. For higher performance, several alternative models have been proposed. However, many of these are hardware-centric in nature and difficult to program. Further, the multitude of seemingly unrelated memory models inhibits portability. We use the 3P criteria of programmability, portability, and performance to assess memory models, and find current models lacking in one or more of these criteria. This thesis establishes a unifying framework for reasoning about memory models that leads to models that adequately satisfy the 3P criteria. The first contribution of this thesis is a programmer-centric methodology, called sequential consistency normal form (SCNF), for specifying memory models. This methodology is based on the observation that performance-enhancing optimizations can be allowed without violating sequential consistency if the system is given some information about the program. An SCNF model is a contract between the system and the programmer, where the system guarantees both high performance and sequential consistency only if the programmer provides certain information about the program. Insufficient information gives lower performance, but incorrect information
Where is Time Spent in Message-Passing and Shared-Memory Programs?
In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), 1994
Abstract
Cited by 55 (3 self)
Message passing and shared memory are two techniques parallel programs use for coordination and communication. This paper studies the strengths and weaknesses of these two mechanisms by comparing equivalent, well-written message-passing and shared-memory programs running on similar hardware. To ensure that our measurements are comparable, we produced two carefully tuned versions of each program and measured them on closely related simulators of a message-passing and a shared-memory machine, both of which are based on the same underlying hardware assumptions. We examined the behavior and performance of each program carefully. Although the cost of computation in each pair of programs was similar, synchronization and communication differed greatly. We found that message-passing's advantage over shared memory is not clear-cut. Three of the four shared-memory programs ran at roughly the same speed as their message-passing equivalents, even though their communication patterns were different. ...
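The two coordination styles being compared can be sketched side by side in a few lines. This is an illustrative Python analogue using the standard `threading` and `queue` modules, not the paper's tuned programs or simulators:

```python
import threading
import queue

N = 1000

# Shared-memory style: workers communicate through a variable
# protected by a lock.
total = 0
lock = threading.Lock()

def shared_worker(items):
    global total
    for x in items:
        with lock:
            total += x

# Message-passing style: the same computation, but workers send
# partial sums over a queue instead of touching shared state.
q = queue.Queue()

def mp_worker(items):
    q.put(sum(items))

items = list(range(N))
halves = [items[:N // 2], items[N // 2:]]

threads = [threading.Thread(target=shared_worker, args=(h,)) for h in halves]
for t in threads: t.start()
for t in threads: t.join()

threads = [threading.Thread(target=mp_worker, args=(h,)) for h in halves]
for t in threads: t.start()
for t in threads: t.join()
mp_total = q.get() + q.get()

print(total, mp_total)  # both are 499500
```

Both versions compute the same result; the difference the paper measures is where the synchronization and communication costs land in each style.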
Location Consistency: Stepping Beyond the Barriers of Memory Coherence and Serializability
McGill University, School of Computer, 1994
Abstract
Cited by 18 (3 self)
A memory consistency model represents a binding "contract" between software and hardware in a shared-memory multiprocessor system. It is important to provide a memory consistency model that is easy to understand and that also facilitates efficient implementation. The memory consistency model that has been most commonly used in past work is sequential consistency (SC), which requires the execution of a parallel program to appear as some interleaving of the memory operations on a sequential machine. To reduce the rigid constraints of the SC model, several relaxed consistency models have been proposed, notably weak ordering (or weak consistency) (WC), release consistency (RC), data-race-free-0, and data-race-free-1. These models allow performance optimizations to be correctly applied, while guaranteeing that sequential consistency is retained for a specified class of programs. We call these models SC-derived models. A central assumption in the definitions of all SC-derived memory consist...
Data Discrimination via Nonlinear Generalized Support Vector Machines
Complementarity: Applications, Algorithms and Extensions, 1999
Abstract
Cited by 13 (8 self)
The main purpose of this paper is to show that new formulations of support vector machines can generate nonlinear separating surfaces which can discriminate between elements of a given set better than a linear surface. The principal approach used is that of generalized support vector machines (GSVMs) which employ possibly indefinite kernels [17]. The GSVM training procedure is carried out by either the simple successive overrelaxation (SOR) [18] iterative method or by linear programming. This novel combination of powerful support vector machines [24, 5] with the highly effective SOR computational algorithm [15, 16, 14] or with linear programming allows us to use a nonlinear surface to discriminate between elements of a dataset that belong to one of two categories. Numerical results on a number of datasets show improved testing set correctness, by as much as a factor of two, when comparing the nonlinear GSVM surface to a linear separating surface.

1 Introduction
A very simple convex qu...
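As a toy illustration of why a nonlinear surface can beat a linear one, a projected-SOR sweep can be run with an RBF kernel on XOR-like data, which no linear discriminant separates. The kernel choice, the gamma value, and the bias handling here are illustrative assumptions, not the paper's exact GSVM formulation:

```python
import numpy as np

# XOR-like data: not linearly separable.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def rbf(A, B, gamma=2.0):               # gamma is a hypothetical choice
    """Gaussian (RBF) kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

Q = (y[:, None] * y[None, :]) * (rbf(X, X) + 1.0)   # bias folded in

alpha, C, omega = np.zeros(len(y)), 100.0, 1.0
for _ in range(500):
    for i in range(len(alpha)):          # projected Gauss-Seidel sweep
        alpha[i] = np.clip(alpha[i] - omega * (Q[i] @ alpha - 1.0) / Q[i, i],
                           0.0, C)

def predict(Z):
    """Nonlinear decision surface induced by the RBF kernel."""
    return np.sign((rbf(Z, X) + 1.0) @ (alpha * y))
```

On this data all four training points land on the correct side of the kernel-induced surface, whereas no linear surface can achieve that.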
Optimization Methods In Massive Datasets
Abstract
Cited by 9 (1 self)
We describe the role of generalized support vector machines in separating massive and complex data using arbitrary nonlinear kernels. Feature selection that improves generalization is implemented via an effective procedure that utilizes a polyhedral norm or a concave function minimization. Massive data is separated using a linear programming chunking algorithm as well as a successive overrelaxation algorithm, each of which is capable of processing data with millions of points.

1. INTRODUCTION
We address here the problem of classifying data in n-dimensional real (Euclidean) space R^n into one of two disjoint finite point sets (i.e. classes). The support vector machine (SVM) approach to classification [57, 2, 25, 58, 13, 54, 55] attempts to separate points belonging to two given sets in R^n by a nonlinear surface, often only implicitly defined by a kernel function. Since the nonlinear surface in R^n is typically linear in its parameters, it can be represented as a linear func...
Active Set Strategies and the LP Dual Active Set Algorithm
1996
Abstract
Cited by 9 (4 self)
After a general treatment of primal and dual active set strategies, we present the Dual Active Set Algorithm for linear programming and prove its convergence. An efficient implementation is developed using proximal point approximations. This implementation involves a primal/dual proximal iteration similar to one introduced by Rockafellar, and a new iteration based on optimization of a proximal vector parameter. This proximal parameter optimization problem is well conditioned, leading to rapid convergence of the conjugate gradient method, while the original proximal function is terribly conditioned, leading to almost undetectable convergence of the conjugate gradient method. Limits as a proximal scalar parameter tends to zero are evaluated. Intriguing numerical results are presented for Netlib test problems.

Key Words. Linear programming, quadratic programming, active sets, dual method, least squares, proximal point, extrapolation, conjugate gradients, successive overrelaxation ...
Alternating Directions Methods for the Parallel Solution of Large-Scale Block-Structured Optimization Problems
1994
Abstract
Cited by 7 (2 self)
Prompted by advances in computer technology and the increasing confidence of decision makers in large-scale market models, practitioners of operations research are now tackling problems of increasing detail, complexity and size. This necessitates the development of new solution algorithms that exploit problem structure as well as the properties of the target hardware, in order to minimize turnaround time and maximize model utilization. Many models in planning and scheduling exhibit a block-angular structure that can represent spatial or temporal partial decomposability: decision variables can be broken down into largely independent blocks, which correspond to first-level decisions satisfying a subset of the constraints and may represent a time period, a geographical region, or a commodity. The blocks interact via coupling constraints related to second-level coordination of block decisions, such as shared resource allocation restrictions. In this thesis we construct three efficient decomposition algorithms for such block-angular problems. These algorithms belong to the family of alternating directions methods, and can be thought of as block Gauss-Seidel iterative schemes for an augmented Lagrangian that exploit the block structure. Alternatively, they can be thought of as Douglas-Rachford schemes for calculating a zero of the maximal monotone subgradient operator. Our algorithms are of the "fork-join" type, alternating a local and a global computation phase. In the local phase, decoupled optimization subproblems corresponding to blocks are solved. In the global phase, solution information is combined and a coordination problem is solved, the results of which are used in modifying the objective function of the subproblems. The algorithms are thus similar to priced...
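The local/global alternation described in this abstract can be sketched on a toy consensus problem: two "blocks" each minimize their own quadratic in the local phase, and the coordination phase averages their copies of a shared variable and updates the duals of an augmented Lagrangian. All names and values are illustrative, not the thesis's models:

```python
# Toy alternating-directions (consensus) iteration: block i holds a local
# copy x[i] and minimizes (x[i] - c[i])**2; coupling requires x[0] == x[1].
# The joint minimizer of (x - 1)^2 + (x - 3)^2 is x = 2.
c = [1.0, 3.0]
rho = 1.0                  # augmented-Lagrangian penalty (illustrative)
x = [0.0, 0.0]             # local copies of the shared variable
u = [0.0, 0.0]             # scaled dual variables
z = 0.0                    # coordination variable

for _ in range(100):
    # Local phase: each block solves its decoupled subproblem in closed form.
    x = [(2 * c[i] + rho * (z - u[i])) / (2 + rho) for i in range(2)]
    # Global phase: coordinate by averaging, then update the duals, which
    # modify the blocks' objectives on the next pass.
    z = sum(x[i] + u[i] for i in range(2)) / 2
    u = [u[i] + x[i] - z for i in range(2)]

print(round(z, 6))  # -> 2.0
```

The local phase is embarrassingly parallel across blocks, which is the "fork" of the fork-join pattern; the averaging and dual update form the "join".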