## Profile of Tries (2006)

Citations: 17 (7 self)

### BibTeX

@TECHREPORT{Park06profileof,
    author = {Gahyun Park and Hsien-kuei Hwang and Pierre Nicodème and Wojciech Szpankowski},
    title = {Profile of Tries},
    institution = {},
    year = {2006}
}

### Abstract

Tries (from retrieval) are one of the most popular data structures on words. They are pertinent to the (internal) structure of stored words and to several splitting procedures used in diverse contexts. The profile of a trie is a parameter that represents the number of nodes (either internal or external) at the same distance from the root. It is a function of the number of strings stored in the trie and of the distance from the root. Several, if not all, trie parameters such as height, size, depth, shortest path, and fill-up level can be uniformly analyzed through the (external and internal) profiles. Although profiles represent one of the most fundamental parameters of tries, they have hardly been studied in the past. The analysis of profiles is surprisingly arduous, but once it is carried out it reveals unusually intriguing and interesting behavior. We present a detailed study of the distribution of the profiles in a trie built over random strings generated by a memoryless source. We first derive recurrences satisfied by the expected profiles and solve them asymptotically for all possible ranges of the distance from the root. It appears that profiles of tries exhibit several fascinating phenomena. When moving from the root to the leaves of a trie, the growth of the expected profiles varies. Near the root, the external profiles tend to zero at an exponential rate, then the rate gradually rises to being logarithmic; the external profiles then abruptly tend to infinity, first logarithmically
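To make the definition of the profile concrete, the following is a minimal sketch, not the authors' method. It assumes a binary alphabet, a memoryless source in which each bit is 1 with probability `p`, and finite strings long enough to be pairwise distinct. The names `random_strings` and `trie_profiles` are hypothetical and chosen for illustration only. A group of two or more strings sharing a prefix forms an internal node, a single string forms an external node, and the code counts both kinds of nodes per level.

```python
import random
from collections import Counter


def random_strings(n, length, p, rng):
    """Draw n binary strings from a memoryless source (assumption:
    each bit is 1 with probability p, independently of all others)."""
    return [
        tuple(1 if rng.random() < p else 0 for _ in range(length))
        for _ in range(n)
    ]


def trie_profiles(strings):
    """Return (external, internal) profiles of the binary trie built
    over the given strings, as Counters mapping depth -> node count."""
    external = Counter()
    internal = Counter()
    # Each stack entry is the group of strings sharing one prefix,
    # together with the length of that prefix (the node's depth).
    stack = [(list(strings), 0)]
    while stack:
        group, depth = stack.pop()
        if not group:
            continue  # empty subtrie: no node at all
        if len(group) == 1:
            external[depth] += 1  # a single string is stored in a leaf
            continue
        if depth >= len(group[0]):
            raise ValueError("strings are not distinct within their length")
        internal[depth] += 1  # two or more strings: branch on the next bit
        zeros = [s for s in group if s[depth] == 0]
        ones = [s for s in group if s[depth] == 1]
        stack.append((zeros, depth + 1))
        stack.append((ones, depth + 1))
    return external, internal


if __name__ == "__main__":
    rng = random.Random(2006)
    words = random_strings(n=1000, length=64, p=0.3, rng=rng)
    ext, inn = trie_profiles(words)
    for k in sorted(set(ext) | set(inn)):
        print(k, ext[k], inn[k])
```

For the three strings `(0,0,0)`, `(0,0,1)`, `(1,0,0)`, the sketch yields one external node at depth 1 and two at depth 3, with one internal node at each of depths 0, 1, and 2. The external profile always sums to the number of stored strings, which is a convenient sanity check when experimenting with different values of `p`.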
