Results 11-20 of 240
Deterministic Sorting in Nearly Logarithmic Time on the Hypercube and Related Computers
Journal of Computer and System Sciences, 1996
Cited by 72 (10 self)

Abstract:
This paper presents a deterministic sorting algorithm, called Sharesort, that sorts n records on an n-processor hypercube, shuffle-exchange, or cube-connected cycles in O(log n (log log n)²) time in the worst case. The algorithm requires only a constant amount of storage at each processor. The fastest previous deterministic algorithm for this problem was Batcher's bitonic sort, which runs in O(log² n) time. Supported by an NSERC postdoctoral fellowship and DARPA contracts N00014-87-K-825 and N00014-89-J-1988.

1 Introduction

Given n records distributed uniformly over the n processors of some fixed interconnection network, the sorting problem is to route the record with the ith largest associated key to processor i, 0 ≤ i < n. One of the earliest parallel sorting algorithms is Batcher's bitonic sort [3], which runs in O(log² n) time on the hypercube [10], shuffle-exchange [17], and cube-connected cycles [14]. More recently, Leighton [9] exhibited a bounded-degree,...
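Batcher's bitonic sort, the O(log² n) baseline that Sharesort improves on, can be sketched sequentially: on a hypercube, each compare-exchange stage below corresponds to one communication step along a cube dimension. This is an illustrative simulation of the comparator pattern, not the paper's Sharesort algorithm.

```python
# Sequential sketch of Batcher's bitonic sort on a power-of-two-length list.
# The loop structure mirrors the network: log n merge phases, each with up to
# log n comparator stages, giving the O(log^2 n) depth cited above.

def bitonic_sort(a):
    n = len(a)
    assert n & (n - 1) == 0, "length must be a power of two"
    a = list(a)
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # comparator distance within this merge stage
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

print(bitonic_sort([7, 3, 6, 1, 8, 2, 5, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Because the compare-exchange pattern is fixed in advance (data-independent), the same schedule runs unchanged on one processor per record.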
Lower Bounds for Deterministic and Nondeterministic Branching Programs
in Proceedings of FCT '91, Lecture Notes in Computer Science, 1991
Cited by 66 (4 self)

Abstract:
We survey lower bounds established for the complexity of computing explicitly given Boolean functions by switching-and-rectifier networks, branching programs and switching networks. We first consider the unrestricted case and then proceed to various restricted models. Among these are monotone networks, bounded-width devices, oblivious devices and read-k-times-only devices.

1 Introduction

The main goal of Boolean complexity theory is to prove lower bounds on the complexity of computing "explicitly given" Boolean functions in interesting computational models. By "explicitly given" researchers usually mean "belonging to the class NP". This is a very plausible interpretation since, on the one hand, this class contains the overwhelming majority of interesting Boolean functions and, on the other hand, it is small enough to spare us the need to take counting arguments into account. To illustrate the second point, let me remind the reader that already the class Δ^p_2,...
Privacy-Preserving Group Data Access via Stateless Oblivious RAM Simulation
Cited by 61 (8 self)

Abstract:
Motivated by cloud computing applications, we study the problem of providing privacy-preserving access to an outsourced honest-but-curious data repository for a group of trusted users. We show how to achieve efficient privacy-preserving data access using a combination of probabilistic encryption, which directly hides data values, and stateless oblivious RAM simulation, which hides the pattern of data accesses. We give a method with O(log n) amortized access overhead for simulating a RAM algorithm that has a memory of size n, using a scheme that is data-oblivious with very high probability. We assume that the simulation has access to a private workspace of size O(n^ν), for any given fixed constant ν > 0, but does not maintain state in between data access requests. Our simulation makes use of pseudorandom hash functions and is based on a novel hierarchy of cuckoo hash tables that all share a common stash. The method outperforms all previous techniques for stateless clients in terms of access overhead. We also provide experimental results from a prototype implementation of our scheme, showing its practicality. In addition, we show that one can eliminate the dependence on pseudorandom hash functions in our simulation while having the overhead rise to O(log² n).
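The building block named in the abstract, a cuckoo hash table with a stash, can be sketched as follows. This is a toy single-table version, not the paper's hierarchy of tables with a shared stash, and the table sizes, eviction bound, and hash functions are illustrative assumptions.

```python
import random

# Toy cuckoo hashing with a stash: two hash functions give each key two
# candidate slots; an insertion evicts the occupant on collision, and an
# item that cannot be placed after a bounded number of evictions is parked
# in a small stash. Lookups probe both slots plus the stash.

class CuckooWithStash:
    def __init__(self, capacity, max_evictions=32):
        self.t1 = [None] * capacity
        self.t2 = [None] * capacity
        self.stash = []
        self.cap = capacity
        self.max_evictions = max_evictions
        self.seed1, self.seed2 = random.random(), random.random()

    def _h1(self, key):
        return hash((self.seed1, key)) % self.cap

    def _h2(self, key):
        return hash((self.seed2, key)) % self.cap

    def insert(self, key, value):
        item = (key, value)
        for _ in range(self.max_evictions):
            i = self._h1(item[0])
            if self.t1[i] is None:
                self.t1[i] = item
                return
            self.t1[i], item = item, self.t1[i]   # evict occupant of t1
            j = self._h2(item[0])
            if self.t2[j] is None:
                self.t2[j] = item
                return
            self.t2[j], item = item, self.t2[j]   # evict occupant of t2
        self.stash.append(item)                    # give up: park in stash

    def lookup(self, key):
        for slot in (self.t1[self._h1(key)], self.t2[self._h2(key)]):
            if slot is not None and slot[0] == key:
                return slot[1]
        for k, v in self.stash:
            if k == key:
                return v
        return None
```

The stash is what makes the construction robust: rare insertion failures spill over instead of forcing a rebuild, which is essential when the obliviousness guarantee must hold with very high probability.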
Efficient Computation of Recurrence Diameters
in Proceedings of the 4th International Conference on Verification, Model Checking, and Abstract Interpretation, volume 2575 of Lecture Notes in Computer Science, 2003
Cited by 55 (22 self)

Abstract:
SAT-based Bounded Model Checking (BMC) is an efficient method for detecting logical errors in finite-state transition systems. Given a transition system, an LTL property, and a user-defined bound k, a bounded model checker generates a propositional formula that is satisfiable if and only if a counterexample to the property of length up to k exists. Standard SAT checkers can be used to check this formula. BMC is complete if k is larger than some precomputed threshold. It is still unknown how to compute this threshold for general properties. We show that the longest initialized loop-free path in the state graph, also known as the recurrence diameter, is a sufficient threshold for Fp properties. The recurrence diameter is also a known over-approximation of the threshold for simple safety properties (Gp). We discuss various techniques for computing the recurrence diameter efficiently and provide experimental results that demonstrate the benefits of the new approach.
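The definition of the recurrence diameter, the length of the longest initialized loop-free (simple) path in the state graph, can be illustrated on a tiny explicit graph. The paper computes it symbolically with SAT-based techniques; this brute-force DFS, over a made-up example transition system, only illustrates the quantity being computed.

```python
# Recurrence diameter of an explicitly given state graph: the length of the
# longest simple path starting from an initial state. Exponential-time DFS,
# for illustration only; real systems are handled symbolically.

def recurrence_diameter(initial_states, successors):
    best = 0

    def dfs(state, visited, length):
        nonlocal best
        best = max(best, length)
        for nxt in successors.get(state, []):
            if nxt not in visited:       # loop-free: never revisit a state
                dfs(nxt, visited | {nxt}, length + 1)

    for s0 in initial_states:
        dfs(s0, {s0}, 0)
    return best

# Example: states 0..3 with a cycle 1 -> 2 -> 3 -> 1.
succ = {0: [1], 1: [2], 2: [3], 3: [1]}
print(recurrence_diameter([0], succ))  # 3, via the simple path 0-1-2-3
```

Unrolling the transition relation to this depth suffices for Fp properties: any longer run must revisit a state, so it adds no new reachable behavior.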
Counting Networks and Multi-Processor Coordination (Extended Abstract)
in Proceedings of the 23rd Annual Symposium on Theory of Computing, 1991
Cited by 52 (8 self)

Abstract:
James Aspnes, Maurice Herlihy, and Nir Shavit. Digital Equipment Corporation, Cambridge Research Lab, CRL 90/11, September 18, 1991.

Many fundamental multi-processor coordination problems can be expressed as counting problems: processes must cooperate to assign successive values from a given range, such as addresses in memory or destinations on an interconnection network. Conventional solutions to these problems perform poorly because of synchronization bottlenecks and high memory contention. Motivated by observations on the behavior of sorting networks, we offer a completely new approach to solving such problems. We introduce a new class of networks called counting networks, i.e., networks that can be used to count. We give a counting network construction of depth log² n using n log² n "gates," avoiding the sequential bottlenecks inherent in former solutions and having a provably lower contention factor on its gates. Finally, to show that counting networks are not ...
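The "gate" of a counting network is the balancer: tokens arriving on either input leave alternately on the top and bottom output. The sketch below is a sequential simulation of the width-4 bitonic counting network (two Bitonic[2]s followed by a Merger[4], wired here as three balancer layers); the real point of the construction is that the step property also holds under asynchronous concurrent traversals, which a sequential toy cannot demonstrate.

```python
# Sequential simulation of Bitonic[4]: after any token sequence, the
# per-wire output counts satisfy the step property (top-heavy, differing
# by at most one), so output wires hand out successive counter values.

class Balancer:
    def __init__(self, top, bottom):
        self.top, self.bottom = top, bottom
        self.toggle = True  # next token exits on the top output

    def traverse(self):
        out = self.top if self.toggle else self.bottom
        self.toggle = not self.toggle
        return out

def make_bitonic4():
    # Layers of balancers over wires 0..3: two Bitonic[2]s, then Merger[4].
    return [[Balancer(0, 1), Balancer(2, 3)],
            [Balancer(0, 3), Balancer(1, 2)],
            [Balancer(0, 1), Balancer(2, 3)]]

def run(network, entry_wires, width=4):
    counts = [0] * width
    for wire in entry_wires:
        for layer in network:
            for b in layer:
                if wire in (b.top, b.bottom):
                    wire = b.traverse()
                    break
        counts[wire] += 1
    return counts

print(run(make_bitonic4(), [1, 1, 2, 0, 3, 1]))  # [2, 2, 1, 1]
```

Unlike a single shared counter, tokens taking different paths touch different balancers, which is the source of the lower contention claimed in the abstract.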
Private Set Intersection: Are Garbled Circuits Better than Custom Protocols?
2012
Cited by 50 (7 self)

Abstract:
Cryptographic protocols for Private Set Intersection (PSI) are the basis for many important privacy-preserving applications. Over the past few years, intensive research has been devoted to designing custom protocols for PSI based on homomorphic encryption and other public-key techniques, apparently due to the belief that solutions using generic approaches would be impractical. This paper explores the validity of that belief. We develop three classes of protocols targeted to different set sizes and domains, all based on Yao's generic garbled-circuit method. We then compare the performance of our protocols to the fastest custom PSI protocols in the literature. Our results show that a careful application of garbled circuits leads to solutions that can run on million-element sets on typical desktops and that can be competitive with the fastest custom protocols. Moreover, generic protocols like ours can be used directly for performing more complex secure computations, something we demonstrate by adding a simple information-auditing mechanism to our PSI protocols.
GPU-ABiSort: Optimal parallel sorting on stream architectures
in Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium (IPDPS '06), 2006
Cited by 47 (0 self)

Abstract:
In this paper, we present a novel approach for parallel sorting on stream processing architectures. It is based on adaptive bitonic sorting. For sorting n values utilizing p stream processor units, this approach achieves the optimal time complexity O((n log n)/p). This makes our approach competitive with common sequential sorting algorithms not only from a theoretical viewpoint; it is also very fast from a practical viewpoint. This is achieved by using efficient linear stream memory accesses and by combining the optimal time approach with algorithms optimized for small input sequences. We present an implementation on modern programmable graphics hardware (GPUs). On recent GPUs, our optimal parallel sorting approach has been shown to be remarkably faster than sequential sorting on the CPU, and it is also faster than previous non-optimal sorting approaches on the GPU for sufficiently large input sequences. Because of the excellent scalability of our algorithm with the number of stream processor units p (up to n / log² n or even n / log n units, depending on the stream architecture), our approach profits heavily from the trend toward increasing numbers of fragment processor units on GPUs, so we can expect further speed improvements with upcoming GPU generations.
Improved Parallel Integer Sorting without Concurrent Writing
1992
Cited by 47 (5 self)

Abstract:
We show that n integers in the range 1..n can be sorted stably on an EREW PRAM using O(t) time and O(n(√(log n log log n) + (log n)²/t)) operations, for arbitrary given t ≥ log n log log n, and on a CREW PRAM using O(t) time and O(n(√(log n) + log n / 2^(t/log n))) operations, for arbitrary given t ≥ log n. In addition, we are able to sort n arbitrary integers on a randomized CREW PRAM within the same resource bounds with high probability. In each case our algorithm is a factor of almost Θ(√(log n)) closer to optimality than all previous algorithms for the stated problem in the stated model, and our third result matches the operation count of the best previous sequential algorithm. We also show that n integers in the range 1..m can be sorted in O((log n)²) time with O(n) operations on an EREW PRAM using a nonstandard word length of O(log n log log n log m) bits, thereby greatly improving the upper bound on the word length necessary to sort integers with a linear t...
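The sequential baseline that stable parallel integer sorting is measured against is counting sort, which stably sorts n keys drawn from a range of size m in O(n + m) time. This sketch is the sequential analogue only, not the paper's PRAM algorithm.

```python
# Stable counting sort for integer keys in 0..m-1: count occurrences,
# prefix-sum the counts into starting positions, then place each key in
# input order, which is exactly what makes the sort stable.

def stable_counting_sort(keys, m):
    count = [0] * m
    for k in keys:
        count[k] += 1
    # prefix sums: pos[k] = index of the first slot for key k
    pos, total = [0] * m, 0
    for k in range(m):
        pos[k], total = total, total + count[k]
    out = [None] * len(keys)
    for k in keys:            # scanning in input order keeps equal keys stable
        out[pos[k]] = k
        pos[k] += 1
    return out

print(stable_counting_sort([3, 1, 4, 1, 5, 2], 6))  # [1, 1, 2, 3, 4, 5]
```

The parallel difficulty lies in the prefix-sum and placement steps: doing them with few operations and no concurrent writes is what the EREW/CREW bounds above quantify.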
Small-Depth Counting Networks
1992
Cited by 43 (2 self)

Abstract:
Generalizing the notion of a sorting network, Aspnes, Herlihy, and Shavit recently introduced a class of so-called "counting" networks, and established an O(lg² n) upper bound on the depth complexity of such networks. Their work was motivated by a number of practical applications arising in the domain of asynchronous shared-memory machines. This paper continues the analysis of counting networks, providing a number of new upper bounds. In particular, we present an explicit construction of an O(c^(lg* n) lg n)-depth counting network, a randomized construction of an O(lg n)-depth network (that works with extremely high probability), and, using the random construction, we present an existential proof of a deterministic O(lg n)-depth network. The latter result matches the trivial Ω(lg n) depth lower bound to within a constant factor.