Results 1–10 of 17
Online stochastic matching: Online actions based on offline statistics (2010)
Cited by 17 (1 self)
Abstract: We consider the online stochastic matching problem proposed by Feldman et al. [4] as a model of display ad allocation. We are given a bipartite graph; one side corresponds to a fixed set of bins and the other to the set of possible ball types. At each time step, a ball is sampled independently from the given distribution and must be matched upon arrival to an empty bin. The goal is to maximize the size of the matching. We present an online algorithm for this problem with a competitive ratio of 0.702. Before our result, algorithms with a competitive ratio better than 1 − 1/e were known only under the assumption that the expected number of arriving balls of each type is integral. A key idea of the algorithm is to collect statistics about the decisions of the optimum offline solution using Monte Carlo sampling and to use those statistics to guide the decisions of the online algorithm. We also show that no online algorithm can have a competitive ratio better than 0.823.
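The statistics-gathering idea translates directly into simulation. Below is a minimal Python sketch of that step (all names are hypothetical, and a simple greedy matcher stands in for the optimal offline solver, so this illustrates the sampling idea rather than reproducing the paper's 0.702-competitive algorithm):

```python
import random
from collections import defaultdict

def offline_match(arrivals, neighbors, num_bins):
    """Greedy stand-in for the optimal offline matcher used in the paper."""
    free = set(range(num_bins))
    matched = {}  # arrival index -> bin
    for i, ball_type in enumerate(arrivals):
        for b in neighbors[ball_type]:
            if b in free:
                free.remove(b)
                matched[i] = b
                break
    return matched

def collect_statistics(dist, neighbors, num_bins, horizon, samples=1000):
    """Monte Carlo estimate of how often the offline solution sends
    each ball type to each bin."""
    stats = defaultdict(lambda: defaultdict(int))
    types, probs = zip(*dist.items())
    for _ in range(samples):
        arrivals = random.choices(types, weights=probs, k=horizon)
        for i, b in offline_match(arrivals, neighbors, num_bins).items():
            stats[arrivals[i]][b] += 1
    return stats

def online_match(arrivals, stats, num_bins):
    """Online phase: send each arriving ball to the free neighbor that
    the offline statistics favor most."""
    free = set(range(num_bins))
    matching = []
    for t in arrivals:
        for b in sorted(stats[t], key=stats[t].get, reverse=True):
            if b in free:
                free.remove(b)
                matching.append((t, b))
                break
    return matching
```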
Maximum matchings in random bipartite graphs and the space utilization of cuckoo hash tables (2009)
Cited by 9 (0 self)
Abstract: We study the following question about random graphs. We are given two disjoint sets L, R with |L| = n = αm and |R| = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is the size µ(G) of a largest matching in G. When considered in the context of cuckoo hashing, one key question is: when is µ(G) = n with high probability (w.h.p.)? We answer this question exactly when d is at least three. We also establish a precise threshold for when Phase 1 of the Karp-Sipser greedy matching algorithm suffices to compute a maximum matching w.h.p.
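The question is easy to probe experimentally. A minimal Python sketch (assuming, as one common convention, that each left vertex picks d distinct right neighbours): build the random graph, compute µ(G) with Kuhn's augmenting-path algorithm, and estimate the probability that µ(G) = n at a given load α = n/m:

```python
import random
import sys

sys.setrecursionlimit(10000)  # augmenting paths can recurse deeply

def max_matching_size(n, m, d, rng=random):
    """Each left vertex picks d distinct random right neighbours;
    returns mu(G) via Kuhn's augmenting-path algorithm."""
    adj = [rng.sample(range(m), d) for _ in range(n)]
    match_right = [-1] * m  # right vertex -> matched left vertex, or -1

    def try_augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match_right[v] == -1 or try_augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    return sum(try_augment(u, set()) for u in range(n))

def perfect_matching_rate(alpha, m=1000, d=3, trials=20):
    """Fraction of trials in which every left vertex is matched."""
    n = int(alpha * m)
    return sum(max_matching_size(n, m, d) == n for _ in range(trials)) / trials
```

Sweeping α across the known d = 3 load threshold (around 0.917) exhibits the sharp drop in the perfect-matching probability that the paper pins down exactly.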
Backyard Cuckoo Hashing: Constant Worst-Case Operations with a Succinct Representation (2010)
Cited by 7 (3 self)
Abstract: The performance of a dynamic dictionary is measured mainly by its update time, lookup time, and space consumption. In terms of update time and lookup time there are known constructions that guarantee constant-time operations in the worst case with high probability, and in terms of space consumption there are known constructions that use essentially optimal space. In this paper we settle two fundamental open problems:
• We construct the first dynamic dictionary that enjoys the best of both worlds: we present a two-level variant of cuckoo hashing that stores n elements using (1 + ϵ)n memory words, and guarantees constant-time operations in the worst case with high probability. Specifically, for any ϵ = Ω((log log n / log n)^{1/2}) and for any sequence of polynomially many operations, with high probability over the randomness of the initialization phase, all operations are performed in constant time which is independent of ϵ. The construction is based on augmenting cuckoo hashing with a “backyard” that handles a large fraction of the elements, together with a de-amortized perfect hashing scheme for eliminating the dependency on ϵ.
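A heavily simplified sketch of the two-level idea in Python (illustrative only: a plain cuckoo table absorbs most elements and a dictionary plays the role of the backyard; the paper's actual construction is de-amortized and succinct, which this toy version makes no attempt at):

```python
import random

class ToyBackyardCuckoo:
    """Front yard: two cuckoo tables. Backyard: catches elements whose
    insertion exceeds a bounded number of kicks."""

    def __init__(self, size, max_kicks=32):
        self.size = size
        self.max_kicks = max_kicks
        self.tables = [[None] * size, [None] * size]
        self.seeds = (random.getrandbits(64), random.getrandbits(64))
        self.backyard = {}

    def _slot(self, i, key):
        return hash((self.seeds[i], key)) % self.size

    def insert(self, key):
        if self.lookup(key):
            return
        for _ in range(self.max_kicks):
            for i in (0, 1):
                s = self._slot(i, key)
                if self.tables[i][s] is None:
                    self.tables[i][s] = key
                    return
            # Both slots occupied: evict from table 0 and retry the evictee.
            s = self._slot(0, key)
            key, self.tables[0][s] = self.tables[0][s], key
        self.backyard[key] = True  # kick budget exhausted

    def lookup(self, key):
        return (self.tables[0][self._slot(0, key)] == key
                or self.tables[1][self._slot(1, key)] == key
                or key in self.backyard)
```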
Bipartite Graph Structures for Efficient Balancing of Heterogeneous Loads (2012)
Cited by 7 (5 self)
Abstract: This paper considers large-scale distributed content service platforms, such as peer-to-peer video-on-demand systems. Such systems feature two basic resources, namely storage and bandwidth. Their efficiency critically depends on two factors: (i) content replication within servers, and (ii) how incoming service requests are matched to servers holding the requested content. To inform the corresponding design choices, we make the following contributions. We first show that, for underloaded systems, so-called proportional content placement with a simple greedy strategy for matching requests to servers ensures full system efficiency provided storage size grows logarithmically with the system size. However, for constant storage size, this strategy undergoes a phase transition with severe loss of efficiency as the system load approaches criticality. To better understand the role of the matching strategy in this performance degradation, we characterize the asymptotic system efficiency under an optimal matching policy. Our analysis shows that, in contrast to greedy matching, optimal matching incurs an inefficiency that is exponentially small in the server storage size, even at critical system loads. It further allows a characterization of content replication policies that minimize the inefficiency. These optimal policies, which differ markedly from proportional placement, have a simple structure which makes them implementable in practice. On the methodological side, our analysis of matching performance uses the theory of local weak limits of random graphs, and highlights a novel characterization of matching numbers in bipartite graphs, which may both be of independent interest.
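For concreteness, here is a minimal Python sketch of the greedy matching strategy discussed above (names are illustrative; a server's "slots" model its bandwidth): each incoming request is served by any server that replicates the content and still has a free slot, and the achieved efficiency is the served fraction.

```python
import random

def greedy_efficiency(requests, replicas, capacity, rng=random):
    """Greedy request-to-server matching.
    replicas: content -> servers storing a copy of it.
    capacity: server -> number of simultaneous service slots."""
    load = {s: 0 for s in capacity}
    served = 0
    for content in requests:
        candidates = [s for s in replicas.get(content, ())
                      if load[s] < capacity[s]]
        if candidates:
            load[rng.choice(candidates)] += 1
            served += 1
    return served / len(requests)
```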
Load Balancing and Orientability Thresholds for Random Hypergraphs
Cited by 5 (1 self)
Abstract: Let h > w > 0 be two fixed integers. Let H be a random hypergraph whose hyperedges are uniformly of size h. To w-orient a hyperedge, we assign exactly w of its vertices positive signs with respect to that hyperedge, and the rest negative. A (w, k)-orientation of H consists of a w-orientation of all hyperedges of H such that each vertex receives at most k positive signs from its incident hyperedges. When k is large enough, we determine the threshold for the existence of a (w, k)-orientation of a random hypergraph. The (w, k)-orientation of hypergraphs is strongly related to a general version of the offline load balancing problem. The graph case, when h = 2 and w = 1, was solved recently by Cain, Sanders and Wormald and independently by Fernholz and Ramachandran, thereby settling a conjecture made by Karp and Saks. Motivated by a problem of cuckoo hashing, the special hypergraph case with w = k = 1 was solved in three separate preprints dating from October 2009, by Frieze and Melsted, by Fountoulakis and Panagiotou, and by Dietzfelbinger et al.
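Whether a given hypergraph admits a (w, k)-orientation is a feasibility question answerable by maximum flow: each hyperedge must push w units of "positive sign" through its vertices, each hyperedge-vertex pair carries at most one unit, and each vertex absorbs at most k. A minimal Python sketch (plain Edmonds-Karp with hypothetical names; this checks a fixed instance and is unrelated to the paper's probabilistic threshold analysis):

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity map."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        path, v = [], t  # walk the augmenting path back to the source
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= aug
            cap[v][u] += aug
        flow += aug

def wk_orientable(hyperedges, vertices, w, k):
    """True iff every hyperedge can give w positive signs while each
    vertex receives at most k positive signs overall."""
    cap = defaultdict(lambda: defaultdict(int))
    for i, e in enumerate(hyperedges):
        cap["s"][("e", i)] = w
        for v in e:
            cap[("e", i)][("v", v)] = 1
    for v in vertices:
        cap[("v", v)]["t"] = k
    return max_flow(cap, "s", "t") == w * len(hyperedges)
```

For example, wk_orientable([(0, 1, 2), (1, 2, 3)], range(4), w=1, k=1) confirms that two overlapping 3-edges can each pick one vertex without collisions.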
A New Approach to the Orientation of Random Hypergraphs
Cited by 5 (4 self)
Abstract: An h-uniform hypergraph H = (V, E) is called (ℓ, k)-orientable if there exists an assignment of each hyperedge e ∈ E to exactly ℓ of its vertices v ∈ e such that no vertex is assigned more than k hyperedges. Let H_{n,m,h} be a hypergraph drawn uniformly at random from the set of all h-uniform hypergraphs with n vertices and m edges. In this paper, we determine the threshold for the existence of an (ℓ, k)-orientation of H_{n,m,h} for k ≥ 1 and h > ℓ ≥ 1, extending recent results motivated by applications such as cuckoo hashing or load balancing with guaranteed maximum load. Our proof combines the local weak convergence of sparse graphs with a careful analysis of a Gibbs measure on spanning subgraphs with degree constraints. It allows us to deal with a much broader class of structures than uniform hypergraphs.
The Set of Solutions of Random XORSAT Formulae (2011)
Cited by 2 (0 self)
Abstract: The XOR-satisfiability (XORSAT) problem requires finding an assignment of n Boolean variables that satisfies m exclusive-OR (XOR) clauses, whereby each clause constrains a subset of the variables. We consider random XORSAT instances, drawn uniformly at random from the ensemble of formulae containing n variables and m clauses of size k. This model presents several structural similarities to other ensembles of constraint satisfaction problems, such as k-satisfiability (k-SAT). For many of these ensembles, as the number of constraints per variable grows, the set of solutions shatters into an exponential number of well-separated components. This phenomenon appears to be related to the difficulty of solving random instances of such problems. We prove a complete characterization of this clustering phase transition for random k-XORSAT. In particular, we prove that the clustering threshold is sharp and determine its exact location. We prove that the set of solutions has large conductance below this threshold and that each of the clusters has large conductance above the same threshold. Our proof constructs a very sparse basis for the set of solutions (or the subset within a cluster). This construction is achieved through a low-complexity iterative algorithm.
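Since each XOR clause is a linear equation over GF(2), a random k-XORSAT instance can be generated and solved directly by Gaussian elimination, with the solution-space dimension falling out of the rank. A minimal Python sketch using integers as bit-rows (illustrative only; the paper's object of study is the geometry of the solution set, not this solver):

```python
import random

def random_xorsat(n, m, k, rng=random):
    """m random XOR clauses over n variables, each constraining k
    distinct variables, with a random right-hand-side bit."""
    return [(rng.sample(range(n), k), rng.getrandbits(1)) for _ in range(m)]

def solve_xorsat(clauses, n):
    """Gaussian elimination over GF(2). Rows are Python ints: bit i
    (i < n) is variable i's coefficient, bit n is the RHS. Returns
    (satisfiable, dimension of the solution space)."""
    basis = {}  # pivot column -> basis row whose lowest set bit is that column
    for vars_, b in clauses:
        row = 0
        for v in vars_:
            row ^= 1 << v
        row |= b << n
        while row:
            col = (row & -row).bit_length() - 1  # lowest set bit
            if col == n:
                return False, 0  # row reduced to "0 = 1": unsatisfiable
            if col in basis:
                row ^= basis[col]
            else:
                basis[col] = row
                break
    return True, n - len(basis)
```

Calling solve_xorsat(random_xorsat(1000, 800, 3), 1000) decides satisfiability and reports the dimension of the solution space, i.e. the log-size of the solution set whose cluster structure the paper characterizes.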
Convergence of multivariate belief propagation, with applications to cuckoo hashing and load balancing (arXiv:1207.1659, 2012)
Cited by 2 (2 self)
Abstract: This paper is motivated by two applications, namely (i) generalizations of cuckoo hashing, a computationally simple approach to assigning keys to objects, and (ii) load balancing in content distribution networks, where one is interested in determining the impact of content replication on performance. These two problems admit a common abstraction: in both scenarios, performance is characterized by the maximum weight of a generalization of a matching in a bipartite graph, featuring node and edge capacities. Our main result is a law of large numbers characterizing the asymptotic maximum weight matching in the limit of large bipartite random graphs, when the graphs admit a local weak limit that is a tree. This result specializes to the two application scenarios, yielding new results in both contexts. In contrast with previous results, the key novelty is the ability to handle edge capacities with arbitrary integer values. An analysis of belief propagation (BP) algorithms with multivariate belief vectors underlies the proof. In particular, we show convergence of the corresponding BP by exploiting monotonicity of the belief vectors with respect to the so-called upshifted likelihood ratio stochastic order. This auxiliary result can be of independent interest, providing a new set of structural conditions which ensure convergence of BP.
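For intuition about the univariate case this paper generalizes, here is a minimal min-sum BP sketch for ordinary maximum-weight bipartite matching (unit node and edge capacities, so messages are scalars; general integer edge capacities, the case handled in the paper, are exactly what forces multivariate belief vectors). Convergence of this toy iteration is only guaranteed under conditions such as a unique optimum; it is not the paper's algorithm:

```python
def bp_matching(w, iters=200):
    """Min-sum BP for max-weight bipartite matching on a full weight
    matrix w[i][j] > 0. lr[i][j] (left->right) and rl[i][j]
    (right->left) estimate the net gain of using edge (i, j)."""
    nl, nr = len(w), len(w[0])
    lr = [[0.0] * nr for _ in range(nl)]
    rl = [[0.0] * nr for _ in range(nl)]
    for _ in range(iters):
        # max with 0 encodes the option of leaving the node unmatched.
        new_lr = [[w[i][j] - max([0.0] + [rl[i][k] for k in range(nr) if k != j])
                   for j in range(nr)] for i in range(nl)]
        new_rl = [[w[i][j] - max([0.0] + [lr[l][j] for l in range(nl) if l != i])
                   for j in range(nr)] for i in range(nl)]
        lr, rl = new_lr, new_rl
    # Keep edge (i, j) when both endpoint beliefs favor it.
    return [(i, j) for i in range(nl) for j in range(nr)
            if lr[i][j] + rl[i][j] > w[i][j]]
```

On a 2×1 instance with weights [[3], [2]] the iteration settles after two rounds and returns [(0, 0)], matching the heavier edge and leaving the other left node unmatched.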
Orientability thresholds for random hypergraphs
Cited by 1 (1 self)
Abstract: Let h > w > 0 be two fixed integers. Let H be a random hypergraph whose hyperedges are all of cardinality h. To w-orient a hyperedge, we assign exactly w of its vertices positive signs with respect to that hyperedge, and the rest negative. A (w, k)-orientation of H consists of a w-orientation of all hyperedges of H such that each vertex receives at most k positive signs from its incident hyperedges. When k is large enough, we determine the threshold for the existence of a (w, k)-orientation of a random hypergraph. The (w, k)-orientation of hypergraphs is strongly related to a general version of the offline load balancing problem. The graph case, when h = 2 and w = 1, was solved recently by Cain, Sanders and Wormald and independently by Fernholz and Ramachandran, which settled a conjecture of Karp and Saks.
Biff (Bloom Filter) Codes: Fast Error Correction for Large Data Sets
Cited by 1 (0 self)
Abstract: Large data sets are increasingly common in cloud and virtualized environments. For example, transfers of multiple gigabytes are commonplace, as are replicated blocks of such sizes. There is a need for fast error correction or data reconciliation in such settings even when the expected number of errors is small. Motivated by such cloud reconciliation problems, we consider error-correction schemes designed for large data, after explaining why previous approaches appear unsuitable. We introduce Biff codes, which are based on Bloom filters and are designed for large data. For Biff codes with a message of length L and E errors, the encoding time is O(L), decoding time is O(L + E), and the space overhead is O(E). Biff codes are low-density parity-check codes; they are similar to Tornado codes, but are designed for errors instead of erasures. Further, Biff codes are designed to be very simple, removing any explicit graph structures and based entirely on hash tables. We derive Biff codes by a simple reduction from a set reconciliation algorithm for a recently developed data structure, invertible Bloom lookup tables. While the underlying theory is extremely simple, what makes this code especially attractive is the ease with which it can be implemented and the speed of decoding. We present results from a prototype implementation that decodes messages of 1 million words with thousands of errors in well under a second.
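The invertible Bloom lookup table (IBLT) at the heart of Biff codes is compact enough to sketch. Every key is XORed into a few cells along with a fingerprint and a count; subtracting two tables cancels the common keys, and the symmetric difference is recovered by repeatedly peeling "pure" cells. A minimal Python sketch for integer keys (hash choices and parameters are illustrative, not the paper's):

```python
import hashlib

def _h(x, salt, mod):
    digest = hashlib.blake2b(f"{salt}:{x}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % mod

class IBLT:
    """Toy invertible Bloom lookup table over non-negative integer keys."""

    def __init__(self, cells=128, hashes=3):
        self.m, self.k = cells, hashes
        self.count = [0] * cells
        self.key_sum = [0] * cells    # XOR of keys hashed to the cell
        self.check_sum = [0] * cells  # XOR of key fingerprints

    def _cells(self, key):
        return {_h(key, i, self.m) for i in range(self.k)}

    def _fp(self, key):
        return _h(key, "fp", 1 << 60)

    def insert(self, key):
        for c in self._cells(key):
            self.count[c] += 1
            self.key_sum[c] ^= key
            self.check_sum[c] ^= self._fp(key)

    def subtract(self, other):
        """Cell-wise difference; keys common to both tables cancel."""
        for c in range(self.m):
            self.count[c] -= other.count[c]
            self.key_sum[c] ^= other.key_sum[c]
            self.check_sum[c] ^= other.check_sum[c]

    def peel(self):
        """After subtract: recover (keys only in self, keys only in other).
        May fail to peel everything if the table is overloaded."""
        ours, theirs = [], []
        progress = True
        while progress:
            progress = False
            for c in range(self.m):
                if abs(self.count[c]) == 1:
                    key, sign = self.key_sum[c], self.count[c]
                    if self.check_sum[c] != self._fp(key):
                        continue  # cell is not pure
                    (ours if sign == 1 else theirs).append(key)
                    for d in self._cells(key):  # remove the key everywhere
                        self.count[d] -= sign
                        self.key_sum[d] ^= key
                        self.check_sum[d] ^= self._fp(key)
                    progress = True
        return ours, theirs
```

In a reconciliation setting, each side inserts its block identifiers, one table is shipped over, and subtract followed by peel yields the differing blocks, mirroring how Biff codes locate erroneous words.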