On the Analysis of Linear Probing Hashing
1998
Cited by 19 (8 self)
This paper presents moment analyses and characterizations of limit distributions for the construction cost of hash tables under the linear probing strategy. Two models are considered, that of full tables and that of sparse tables with a fixed filling ratio strictly smaller than one. For full tables, the construction cost has expectation O(n^{3/2}), the standard deviation is of the same order, and a limit law of the Airy type holds. (The Airy distribution is a semiclassical distribution that is defined in terms of the usual Airy functions or equivalently in terms of Bessel functions of indices −1/3, 2/3.) For sparse tables, the construction cost has expectation O(n), standard deviation O(√n), and a limit law of the Gaussian type. Combinatorial relations with other problems leading to Airy phenomena (like graph connectivity, tree inversions, tree path length, or area under excursions) are also briefly discussed.
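As a rough illustration of the quantity analysed in this abstract, the sketch below simulates table construction under linear probing and counts the total displacement, that is, the number of extra probes summed over all insertions. It is not taken from the paper: the function names, the choice n = m for a "full" table, the filling ratio 0.5 for the sparse case, the trial count, and the seed are all assumptions made for the example.

```python
import random


def total_displacement(m, homes):
    """Insert one key per home address into an m-slot table using linear
    probing and return the total number of extra probes (displacements)."""
    occupied = [False] * m
    filled = 0
    cost = 0
    for home in homes:
        if filled == m:
            raise ValueError("table is already full")
        pos = home % m
        while occupied[pos]:          # probe cyclically to the next slot
            pos = (pos + 1) % m
            cost += 1
        occupied[pos] = True
        filled += 1
    return cost


def simulate(m, n, trials=200, seed=0):
    """Average total displacement over random tables (n keys, m slots)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        homes = [rng.randrange(m) for _ in range(n)]
        total += total_displacement(m, homes)
    return total / trials


if __name__ == "__main__":
    for m in (100, 400, 1600):
        full = simulate(m, m)          # full table (assumed here: n = m)
        sparse = simulate(m, m // 2)   # sparse table, filling ratio 0.5
        # Normalise by the growth orders quoted in the abstract.
        print(m, full / m ** 1.5, sparse / m)
```

If the orders of growth quoted in the abstract already show at these sizes, both printed ratios should stay roughly stable as m grows.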
Extremal properties of three-dimensional sensor networks with applications
IEEE Transactions on Mobile Computing, 2004
Cited by 17 (1 self)
In this paper, we analyze various critical transmitting/sensing ranges for connectivity and coverage in three-dimensional sensor networks. As in other large-scale complex systems, many global parameters of sensor networks undergo phase transitions: for a given property of the network, there is a critical threshold, corresponding to the minimum amount of communication effort or power expenditure by individual nodes, above (resp. below) which the property exists with high (resp. low) probability. For sensor networks, properties of interest include simple and multiple degrees of connectivity/coverage. First, we investigate the network topology according to the region of deployment, the number of deployed sensors, and their transmitting/sensing ranges. More specifically, we consider the following problems: Assume that n nodes, each capable of sensing events within a radius of r, are randomly and uniformly distributed in a 3-dimensional region R of volume V. How large must the sensing range r_Sense be to ensure a given degree of coverage of the region to monitor? For a given transmission range r_Trans, what is the minimum (resp. maximum) degree of the network? What is then the typical hop-diameter of the underlying network? Next, we show how these results affect algorithmic aspects of the network by designing specific distributed protocols for sensor networks.
Keywords: sensor networks; ad hoc networks; coverage; connectivity; hop-diameter; minimum/maximum degrees; transmitting/sensing ranges; analytical methods; energy consumption; topology control.
The Average Case Analysis Of Algorithms - Complex Asymptotics and Generating Functions
1991
Cited by 16 (0 self)
This report is part of a projected series whose aim is to present in a synthetic way the major methods and models in the average-case analysis of algorithms. The following items are to be treated in the series. First, there will be a collection of reports on Methods:
Data Morphing: An Adaptive, Cache-Conscious Storage Technique
In Proc. VLDB, 2003
Cited by 15 (0 self)
The number of processor cache misses has a critical impact on the performance of DBMSs running on servers with large main-memory configurations.
A Sequence of Series for The Lambert Function
1997
Cited by 15 (4 self)
We give a uniform treatment of several series expansions for the Lambert W function, leading to an infinite family of new series. We also discuss standardization, complex branches, a family of arbitrary-order iterative methods for computation of W, and give a theorem showing how to correctly solve another simple and frequently occurring nonlinear equation in terms of W and the unwinding number.
1 Introduction
Investigations of the properties of the Lambert W function are good examples of nontrivial interactions between computer algebra, mathematics, and applications. To begin with, the standardization of the name W by computer algebra (see section 1.2 below) has had several effects. First, this standardization has exposed a great variety of applications; second, it has uncovered a significant history, hitherto unnoticed because the lack of a standard name meant that most researchers were unaware of previous work; and, third, it has now stimulated current interest in this remarkable ...
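For readers unfamiliar with W, it is defined by W(x)·e^{W(x)} = x. The sketch below evaluates the principal real branch for x ≥ 0 with a plain Newton iteration. This is a generic textbook approach and not the arbitrary-order family of methods the abstract refers to; the function name `lambert_w0`, the starting guess, and the tolerance are assumptions made for this example.

```python
import math


def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal real branch of W for x >= 0: the w >= 0 with w*exp(w) = x."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    # Starting guess log(1 + x); it is never below the root because
    # (1 + x) * log(1 + x) >= x for x >= 0.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        # Newton step for f(w) = w*e^w - x, with f'(w) = e^w * (w + 1).
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


if __name__ == "__main__":
    for x in (0.0, 1.0, math.e, 10.0):
        w = lambert_w0(x)
        print(x, w, w * math.exp(w))   # last column should reproduce x
```

Because f is increasing and convex for w ≥ 0 and the starting guess lies at or above the root, the iterates decrease monotonically towards W(x); other branches and complex arguments need a different treatment.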
Planar Maps and Airy Phenomena
2000
Cited by 12 (4 self)
A considerable number of asymptotic distributions arising in random combinatorics and analysis of algorithms are of the exponential-quadratic type (e^{-x^2}), that is, Gaussian. We exhibit here a new class of "universal" phenomena that are of the exponential-cubic type (e^{ix^3}), corresponding to nonstandard distributions that involve the Airy function. Such Airy phenomena are expected to be found in a number of applications, when confluences of critical points and singularities occur. About a dozen classes of planar maps are treated in this way, leading to the occurrence of a common Airy distribution that describes the sizes of cores and of largest (multi)connected components. Consequences include the analysis and fine optimization of random generation algorithms for multiply connected planar graphs.
Model Selection for Generalized Linear Models via GLIB, with Application to Epidemiology
1993
Cited by 11 (5 self)
Epidemiological studies for assessing risk factors often use logistic regression, log-linear models, or other generalized linear models. They involve many decisions, including the choice and coding of risk factors and control variables. It is common practice to select independent variables using a series of significance tests and to choose the way variables are coded somewhat arbitrarily. The overall properties of such a procedure are not well understood, and conditioning on a single model ignores model uncertainty, leading to underestimation of uncertainty about quantities of interest (QUOIs). We describe a Bayesian modeling strategy that formalizes the model selection process and propagates model uncertainty through to inference about QUOIs. Each possible combination of modeling decisions defines a different model, and the models are compared using Bayes factors. Inference about a QUOI is based on an average of its posterior distributions under the individual models, weighted by thei...
A Calculus for the Random Generation of Combinatorial Structures
1993
Cited by 10 (2 self)
A systematic approach to the random generation of labelled combinatorial objects is presented. It applies to structures that are decomposable, i.e., formally specifiable by grammars involving set, sequence, and cycle constructions. A general strategy is developed for solving the random generation problem with two closely related types of methods: for structures of size n, the boustrophedonic algorithms exhibit a worst-case behaviour of the form O(n log n); the sequential algorithms have worst case O(n²), while offering good potential for optimizations in the average case. (Both methods appeal to precomputed numerical tables of linear size.) A companion calculus makes it possible to compute systematically the average-case cost of the sequential generation algorithm associated with a given specification. Using optimizations dictated by the cost calculus, several random generation algorithms are developed, based on the sequential principle; most of them have expected complexity 1/2 n log n, thu...
On the Stochastic Complexity of Learning Realizable and Unrealizable Rules
1995
Cited by 10 (1 self)
The problem of learning from examples in an average case setting is considered. Focusing on the stochastic complexity, an information theoretic quantity measuring the minimal description length of the data given a class of models, we find rigorous upper and lower bounds for this quantity under various conditions. For realizable problems, where the model class used is sufficiently rich to represent the function giving rise to the examples, we find tight upper and lower bounds for the stochastic complexity. In this case, bounds on the prediction error follow immediately using the methods of Haussler et al. (1994a). For unrealizable learning we find a tight upper bound only in the case of learning within a space of finite VC dimension. Moreover, we show in the latter case that the optimal method for prediction may not be the same as that for data compression, even in the limit of an infinite amount of training data, although the two problems (i.e. prediction and compression) are asymptoti...
Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm
In AofA ’07: Proceedings of the 2007 International Conference on Analysis of Algorithms, 2007
Cited by 8 (0 self)
This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of distinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of m units (typically, “short bytes”), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about 1.04/√m. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond 10^9 with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
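The single-pass scheme described above can be sketched as follows. This is a simplified illustration, not the paper's reference implementation: the class name, the use of a 64-bit BLAKE2b hash, the default of m = 2^10 registers, and the omission of any large-range correction are assumptions made for the example. The bias constant and the small-range fallback to linear counting follow the commonly cited form of the algorithm.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch with m = 2**p registers (assumes p >= 7)."""

    def __init__(self, p=10):
        if p < 7:
            raise ValueError("the alpha formula used here assumes m >= 128")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Hash to 64 bits (hash choice is an assumption of this sketch).
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        x = int.from_bytes(digest, "big")
        idx = x >> (64 - self.p)                  # first p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1.0 + 1.079 / m)        # bias correction for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)        # small range: linear counting
        return raw


if __name__ == "__main__":
    hll = HyperLogLog(p=10)
    for i in range(50000):
        hll.add(i)
    print(hll.estimate())   # should land near 50000
```

With m = 1024 registers, the standard error quoted in the abstract works out to about 1.04/√1024 ≈ 3.25%. Adding an element a second time never changes a register, which is why the estimate counts distinct elements only.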

