Results 11–20 of 34
Range Majority in Constant Time and Linear Space
, 2011
Cited by 13 (6 self)
Abstract: Given an array A of size n, we consider the problem of answering range majority queries: given a query range [i..j], where 1 ≤ i ≤ j ≤ n, return the majority element of the subarray A[i..j] if it exists. We describe a linear-space data structure that answers range majority queries in constant time. We further generalize this problem by defining range α-majority queries: given a query range [i..j], return all the elements in the subarray A[i..j] with frequency greater than α(j − i + 1). We prove an upper bound on the number of α-majorities that can exist in a subarray, assuming that query ranges are restricted to be larger than a given threshold. Using this upper bound, we generalize our range majority data structure to answer range α-majority queries in O(1/α) time using O(n lg(1/α + 1)) space, for any fixed α ∈ (0, 1). This result is interesting since other similar frequency-based range query problems have nearly logarithmic lower bounds on query time when restricted to linear space.
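The abstract's O(1)-time structure is not described here, but the query it answers can be sketched with a linear-time baseline: a Boyer–Moore majority vote over the subarray followed by a verification pass. This is an illustration of the query semantics, not the paper's data structure:

```python
def range_majority(A, i, j):
    """Return the majority element of A[i..j] (1-based, inclusive),
    or None if no element occurs more than (j - i + 1) / 2 times.
    Boyer-Moore vote plus a verification pass: O(j - i + 1) time."""
    sub = A[i - 1:j]
    candidate, count = None, 0
    for x in sub:                    # voting pass
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    if sub.count(candidate) > len(sub) // 2:   # verification pass
        return candidate
    return None

print(range_majority([1, 2, 1, 1, 3, 1], 1, 6))  # -> 1 (four of six entries)
print(range_majority([1, 2, 3], 1, 3))           # -> None (no majority)
```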
Dynamic Shannon Coding
, 2005
Cited by 12 (9 self)
Abstract: We present a new algorithm for dynamic prefix-free coding, based on Shannon coding. We give a simple analysis and prove a better upper bound on the length of the encoding produced than the corresponding bound for dynamic Huffman coding. We show how our algorithm can be modified for efficient length-restricted coding, alphabetic coding, and coding with unequal letter costs.
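For context, static Shannon coding assigns a symbol of probability p a codeword of length ⌈log2(1/p)⌉, and these lengths always satisfy Kraft's inequality. A minimal sketch of the static lengths (the paper's dynamic algorithm maintains such a code adaptively, which is not shown here):

```python
import math

def shannon_code_lengths(freqs):
    """Codeword lengths of a (static) Shannon code: a symbol with
    probability f/total gets length ceil(log2(total/f))."""
    total = sum(freqs.values())
    return {s: math.ceil(math.log2(total / f)) for s, f in freqs.items()}

lengths = shannon_code_lengths({'a': 5, 'b': 2, 'c': 1})
print(lengths)  # -> {'a': 1, 'b': 2, 'c': 3}
# Kraft's inequality holds, so a prefix-free code with these lengths exists:
assert sum(2.0 ** -l for l in lengths.values()) <= 1.0
```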
Cache-oblivious algorithms and data structures
 IN SWAT
, 2004
Cited by 11 (1 self)
Abstract: Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the ideal-cache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the term cache-oblivious algorithms. Cache-oblivious algorithms are described as standard RAM algorithms with only one memory level, i.e. without any knowledge of memory hierarchies, but are analyzed in the two-level I/O model of Aggarwal and Vitter for an arbitrary memory and block size and an optimal off-line cache replacement strategy. The result is algorithms that automatically apply to multi-level memory hierarchies. This paper gives an overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al.
Distribution-Sensitive Algorithms
 NORDIC J. COMPUT
, 1998
Cited by 10 (0 self)
Abstract: We investigate a new paradigm of algorithm design for geometric problems that can be termed distribution-sensitive. Our notion of distribution is more combinatorial in nature than spatial. We illustrate this on problems like planar hulls and 2-D maxima, where some of the previously known output-sensitive algorithms are recast in this setting. In a number of cases, the distribution-sensitive analysis yields superior results for the above problems. Moreover, these bounds are shown to be tight in the linear decision tree model. Our approach owes its spirit to the results known for sorting multisets, and we exploit this relationship further to derive fast and efficient parallel algorithms for sorting multisets along with the geometric problems.
On Compressing Permutations and Adaptive Sorting
, 2013
Cited by 9 (7 self)
Abstract: We prove that, given a permutation π over [1..n] formed of nRuns sorted blocks of sizes given by the vector R = ⟨r1, ..., r_nRuns⟩, there exists a compressed data structure encoding π in n(1 + H(R)) = n + Σ_{i=1}^{nRuns} ri log2(n/ri) ≤ n(1 + log2 nRuns) bits while supporting access to the values of π() and π⁻¹() in time O(log nRuns / log log n) in the worst case and O(H(R) / log log n) on average, when the argument is uniformly distributed over [1..n]. This data structure can be constructed in time O(n(1 + H(R))), which yields an improved adaptive sorting algorithm. Similar results on compressed data structures for permutations and adaptive sorting algorithms are proved for other preorder measures of practical and theoretical interest.
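The space bound depends only on the run-length vector R. A small sketch that splits a permutation into maximal ascending runs and evaluates H(R) and the resulting bit bound (a hypothetical helper for the formula, not the paper's encoding):

```python
import math

def runs_entropy_bound(pi):
    """Split pi into maximal ascending runs of sizes r1..r_nRuns and
    return (run sizes, n * (1 + H(R))) where
    H(R) = sum over runs of (ri/n) * log2(n/ri)."""
    n = len(pi)
    runs = [1]
    for a, b in zip(pi, pi[1:]):
        if b > a:
            runs[-1] += 1        # extend the current ascending run
        else:
            runs.append(1)       # a descent starts a new run
    H = sum((r / n) * math.log2(n / r) for r in runs)
    return runs, n * (1 + H)

runs, bits = runs_entropy_bound([3, 4, 5, 1, 2, 6])
print(runs, bits)  # -> [3, 3] 12.0  (two runs of size 3; H(R) = 1 bit)
```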
Distributed Computation of the Mode
 PODC'08
, 2008
Cited by 5 (1 self)
Abstract: This paper studies the problem of computing the most frequent element (the mode) by means of a distributed algorithm where the elements are located at the nodes of a network. Let k denote the number of distinct elements, and let mi be the number of occurrences of the element ei in the ordered list of occurrences m1 ≥ m2 ≥ ... ≥ mk. We give a deterministic distributed algorithm with time complexity O(D + k), where D denotes the diameter of the graph, which is essentially tight. As our main contribution, a Monte Carlo algorithm is presented which computes the mode in O(D + (F2/m1²) log k) time with high probability, where the frequency moment Fℓ is defined as Fℓ = Σ_{i=1}^{k} mi^ℓ. This algorithm is substantially faster than the deterministic algorithm for various relevant frequency distributions. Moreover, we provide a lower bound of Ω(D + F5/(m1⁵ B)), where B is the maximum message size, that captures the effect of the frequency distribution on the time complexity of computing the mode.
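The running-time bounds are stated in terms of frequency moments of the occurrence counts. A tiny sketch of the quantities involved, on an illustrative count vector (not part of the paper):

```python
def frequency_moment(counts, l):
    """F_l = sum over distinct elements of (occurrences ** l)."""
    return sum(m ** l for m in counts)

counts = [4, 2, 1, 1]              # m1 >= m2 >= ... >= mk
F2 = frequency_moment(counts, 2)   # 16 + 4 + 1 + 1 = 22
m1 = counts[0]
# The Monte Carlo bound is O(D + (F2 / m1**2) * log k); the skew-dependent
# factor here is:
print(F2 / m1 ** 2)                # -> 1.375
```

The more dominant the mode (large m1 relative to F2), the smaller this factor, which is why the randomized algorithm beats the O(D + k) deterministic one on skewed distributions.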
Data Reduction Through Early Grouping
 In Proceedings of the 1994 IBM CAS Conference
, 1994
Cited by 5 (0 self)
Abstract: SQL queries containing GROUP BY and aggregation occur frequently in decision support applications. Grouping with aggregation is typically done by first sorting the input and then performing the aggregation as part of the output phase of the sort. The most widely used external sorting algorithm is merge sort, consisting of a run formation phase followed by a (single) merge pass. The amount of data output from the run formation phase can be reduced by a technique that we call early grouping. The idea is straightforward: simply form groups and perform aggregation during run formation. Each run will now consist of partial groups instead of individual records. These partial groups are then combined during the merge phase. Early grouping always reduces the number of records output from the run formation phase. The relative output size depends on the amount of memory relative to the total number of groups and the distribution of records over groups. When the input data is uniformly distributed...
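The core idea of aggregating during run formation can be sketched in a few lines. This is an in-memory illustration with a hypothetical group-capacity limit and SUM aggregation, not an external merge sort:

```python
def runs_with_early_grouping(records, memory_capacity):
    """Run formation with early grouping: aggregate (key, value) records
    into per-run partial groups instead of emitting individual records.
    Each run holds at most memory_capacity distinct groups; the partial
    groups would be combined later in the merge phase."""
    runs, current = [], {}
    for key, value in records:
        if key not in current and len(current) == memory_capacity:
            runs.append(current)        # spill the full run of partial groups
            current = {}
        current[key] = current.get(key, 0) + value   # aggregate (here: SUM)
    if current:
        runs.append(current)
    return runs

records = [('a', 1), ('b', 2), ('a', 3), ('c', 4), ('b', 5)]
print(runs_with_early_grouping(records, 2))
# -> [{'a': 4, 'b': 2}, {'c': 4, 'b': 5}]  (4 partial groups, not 5 records)
```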
Distribution-sensitive set multipartitioning
 1st International Conference on the Analysis of Algorithms
, 2005
Cited by 3 (2 self)
Abstract: Given a set S with real-valued members, each member associated with one of two possible types, a multipartitioning of S is a sequence of the members of S such that if x, y ∈ S have different types and x < y, then x precedes y in the multipartitioning of S. We give two distribution-sensitive algorithms for the set multipartitioning problem and a matching lower bound in the algebraic decision-tree model. One of the two algorithms can be made stable and can be implemented in place. We also give an output-sensitive algorithm for the problem.
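The defining property and a naive construction can be sketched as follows. Sorting by value trivially yields a valid multipartitioning; the paper's contribution is algorithms whose cost adapts to the distribution of types, which this baseline does not attempt:

```python
def is_multipartitioning(seq):
    """Check the defining property: for elements x, y of different types
    with x < y, x must precede y.  seq is a list of (value, type) pairs."""
    for i, (x, tx) in enumerate(seq):
        for y, ty in seq[i + 1:]:
            if tx != ty and y < x:
                return False
    return True

def naive_multipartition(items):
    """Sorting by value is one (non-distribution-sensitive) way to obtain
    a valid multipartitioning; Python's sort is stable, so same-type
    elements keep their input order."""
    return sorted(items, key=lambda p: p[0])

items = [(3, 'A'), (1, 'B'), (2, 'A'), (5, 'B')]
assert is_multipartitioning(naive_multipartition(items))
```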
Distribution-Sensitive Construction of Minimum-Redundancy Prefix Codes
, 2005
Cited by 3 (1 self)
Abstract: A new method for constructing minimum-redundancy prefix codes is described. This method does not build a Huffman tree; instead it uses a property of optimal codes to find the codeword length of each weight. The running time of the algorithm is shown to be O(nk), where n is the number of weights and k is the number of distinct codeword lengths. When the given sequence of weights is already sorted, it is shown that the codes can be constructed using O(log^{2k−1} n) comparisons, which is sublinear if the value of k is small.
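For comparison, the classic baseline computes the same minimum-redundancy codeword lengths by building a Huffman tree with a heap. This sketch shows the lengths being computed, not the paper's tree-free O(nk) method:

```python
import heapq

def codeword_lengths(weights):
    """Minimum-redundancy (Huffman) codeword lengths via the classic
    heap construction.  Returns a list of lengths in input order."""
    if len(weights) == 1:
        return [1]
    # heap items: (total weight, unique tie-breaker, {leaf index: depth})
    heap = [(w, i, {i: 0}) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    tie = len(weights)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # merging two subtrees pushes every leaf in them one level deeper
        merged = {leaf: depth + 1 for leaf, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    depths = heap[0][2]
    return [depths[i] for i in range(len(weights))]

print(codeword_lengths([1, 1, 2, 4]))  # -> [3, 3, 2, 1]  (k = 3 distinct lengths)
```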
From Time to Space: Fast Algorithms that yield Small and Fast Data Structures
Cited by 2 (2 self)
Abstract: In many cases, the relation between encoding space and execution time translates into combinatorial lower bounds on the computational complexity of algorithms in the comparison or external memory models. We describe a few cases which illustrate this relation in a distinct direction, where fast algorithms inspire compressed encodings or data structures. In particular, we describe the relation between searching in an ordered array and encoding integers; merging sets and encoding a sequence of symbols; and sorting and compressing permutations.