Results 1-10 of 12
Motif Statistics
, 1999
Cited by 48 (4 self)
We present a complete analysis of the statistics of the number of occurrences of a regular expression pattern in a random text. This covers the "motifs" widely used in computational biology. Our approach is based on: (i) a constructive approach to classical results in theoretical computer science (automata and formal language theory), in particular the rationality of generating functions of regular languages; (ii) analytic combinatorics, used to derive asymptotic properties from generating functions; (iii) computer algebra, used to determine generating functions explicitly, analyse them, and extract coefficients efficiently. We provide constructions for overlapping or nonoverlapping matches of a regular expression. A companion implementation produces multivariate generating functions for the statistics under study. A fast computation of Taylor coefficients of the generating functions then yields exact values of the moments with typical application to random t...
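The automaton-based counting idea in this abstract can be sketched on the simplest case, a single word searched in a uniform random text. This is only an illustration under stated assumptions, not the paper's construction or its companion implementation: the function names, the two-letter alphabet, and the uniform letter model are all hypothetical choices, and a plain dynamic program over (automaton state, match count) stands in for the multivariate generating function.

```python
from collections import defaultdict
from fractions import Fraction


def prefix_automaton(pattern, alphabet):
    """delta[state][letter] = length of the longest prefix of `pattern`
    that is a suffix of the text read so far (overlaps allowed)."""
    m = len(pattern)
    delta = []
    for state in range(m + 1):
        row = {}
        for letter in alphabet:
            seen = pattern[:state] + letter
            k = min(m, len(seen))
            while k > 0 and pattern[:k] != seen[-k:]:
                k -= 1
            row[letter] = k
        delta.append(row)
    return delta


def occurrence_distribution(pattern, n, alphabet="ab"):
    """Exact distribution of the number of (overlapping) occurrences of
    `pattern` in a uniform random text of length n (assumed model)."""
    delta = prefix_automaton(pattern, alphabet)
    m = len(pattern)
    p_letter = Fraction(1, len(alphabet))
    dist = {(0, 0): Fraction(1)}  # (automaton state, matches so far) -> probability
    for _ in range(n):
        nxt = defaultdict(Fraction)
        for (state, count), prob in dist.items():
            for letter in alphabet:
                target = delta[state][letter]
                # Reaching state m means one more occurrence ends here.
                nxt[(target, count + (target == m))] += prob * p_letter
        dist = nxt
    marginal = defaultdict(Fraction)
    for (_, count), prob in dist.items():
        marginal[count] += prob
    return dict(marginal)
```

By linearity of expectation the mean of this distribution should equal (n - m + 1) / |alphabet|^m for a word of length m, which gives a quick sanity check on the sketch.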
Asymptotic enumeration methods for analyzing LDPC codes
 IEEE Trans. Inform. Theory
, 2004
Cited by 41 (2 self)
We show how asymptotic estimates of powers of polynomials with nonnegative coefficients can be used in the analysis of low-density parity-check (LDPC) codes. In particular we show how these estimates can be used to derive the asymptotic distance spectrum of both regular and irregular LDPC code ensembles. We then consider the binary erasure channel (BEC). Using these estimates we derive lower bounds on the error exponent, under iterative decoding, of LDPC codes used over the BEC. Both regular and irregular code structures are considered. These bounds are compared to the corresponding bounds when optimal (maximum likelihood) decoding is applied.
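To make "estimates of powers of polynomials with nonnegative coefficients" concrete, the sketch below compares an exact coefficient of p(x)^n with the elementary bound [x^k] p(x)^n <= p(x)^n / x^k, valid for every x > 0 when the coefficients of p are nonnegative. This is a generic textbook bound chosen for illustration, not necessarily the estimate derived in the paper; the function names and the search interval are assumptions.

```python
import math


def poly_power_coeff(p, n, k):
    """Exact coefficient of x^k in p(x)^n; p is a list of nonnegative
    integer coefficients, lowest degree first."""
    result = [1]
    for _ in range(n):
        new = [0] * (len(result) + len(p) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(p):
                new[i + j] += a * b
        result = new
    return result[k] if k < len(result) else 0


def log_coeff_upper_bound(p, n, k, lo=1e-6, hi=1e6, iters=200):
    """log of min over x in [lo, hi] of p(x)^n / x^k.

    With x = exp(t) the objective n*log p(e^t) - k*t is convex in t,
    so a ternary search on t finds the minimum. Any x > 0 gives a valid
    upper bound, so the result is safe even if the search is imperfect."""
    def g(t):
        x = math.exp(t)
        return n * math.log(sum(c * x**i for i, c in enumerate(p))) - k * t

    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        m1 = a + (b - a) / 3
        m2 = b - (b - a) / 3
        if g(m1) < g(m2):
            b = m2
        else:
            a = m1
    return g((a + b) / 2)
```

For p(x) = 1 + x, n = 20, k = 10 the exact coefficient is the central binomial coefficient, and the bound is attained at x = 1, giving 20 log 2; the gap between the two logarithms grows only logarithmically in n.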
Asymptotics Of Multivariate Sequences, Part I: Smooth Points Of The Singular Variety
 J. COMB. THEORY, SERIES A
, 1999
Cited by 10 (3 self)
Given a multivariate generating function $F(z_1, \ldots, z_d) = \sum a_{r_1, \ldots, r_d} z_1^{r_1} \cdots z_d^{r_d}$, we determine asymptotics for the coefficients. Our approach is to use Cauchy's integral formula near singular points of $F$, resulting in a tractable oscillating integral. This paper treats the case where the singular point of $F$ is a smooth point of a surface of poles. Companion papers treat singular points of $F$ where the local geometry is more complicated, and for which other methods of analysis are not known.
Combinatorial Properties of RNA Secondary Structures
 JOURNAL OF COMPUTATIONAL BIOLOGY
, 2001
Cited by 10 (3 self)
The secondary structure of an RNA molecule is of great importance and has influence, e.g., on the interaction of tRNA molecules with proteins or on the stabilization of mRNA molecules. The classification of secondary structures by means of their order has proved useful in numerous applications. In 1978 Waterman, who gave the first precise formal framework for the topic, suggested determining the number $a_{n,p}$ of secondary structures of size $n$ and given order $p$. Since then, no satisfactory result has been found. Based on an observation due to Viennot et al., we derive generating functions for the secondary structures of order $p$ from generating functions for binary tree structures with Horton-Strahler number $p$. These generating functions enable us to compute a precise asymptotic equivalent for $a_{n,p}$. Furthermore, we determine the related number of structures when the number of unpaired bases appears as an additional parameter. Our approach proves general enough to compute the average order of a secondary structure together with all the $r$th moments, and to enumerate substructures such as hairpins or bulges as a function of the order of the secondary structures considered.
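As background for the counting problem, the sketch below enumerates secondary structures by size only, ignoring the order $p$ that the paper actually refines by. It assumes the usual Waterman-style model in which every base pair encloses at least one unpaired base: the last base is either unpaired, or paired with base $j$, which splits the rest into two independent parts. The function name is hypothetical and nothing here reproduces the paper's generating-function method.

```python
def secondary_structure_counts(n_max):
    """Return [s(0), ..., s(n_max)], the number of secondary structures on
    n bases, under the assumption that a hairpin loop contains at least
    one unpaired base.

    Recurrence: s(n+1) = s(n) + sum_{j=1}^{n-1} s(j-1) * s(n-j),
    where the first term leaves base n+1 unpaired and the sum pairs
    base n+1 with base j."""
    counts = [1]  # the empty sequence has one (empty) structure
    for n in range(n_max):
        total = counts[n]
        for j in range(1, n):
            total += counts[j - 1] * counts[n - j]
        counts.append(total)
    return counts
```

Under these assumptions the first values are 1, 1, 1, 2, 4, 8, 17, 37, 82; summing $a_{n,p}$ over all orders $p$ should reproduce the same totals if the size conventions match.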
On the Number of Descendants and Ascendants in Random Search Trees
, 1997
Cited by 7 (2 self)
We consider here the probabilistic analysis of the number of descendants and the number of ascendants of a given internal node in a random search tree. The performance of several important algorithms on search trees is closely related to these quantities. For instance, the cost of a successful search is proportional to the number of ascendants of the sought element. On the other hand, the probabilistic behavior of the number of descendants is relevant for the analysis of paged data structures and for the analysis of the performance of quicksort, when recursive calls are not made on small subfiles. We also consider the number of ascendants and descendants of a random node in a random search tree, i.e., the grand averages of the quantities mentioned above. We address these questions for standard binary search trees and for locally balanced search trees. These search trees were introduced by Poblete and Munro and are binary search trees such that each subtree of size 3 is balanced; in oth...
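The two quantities studied here are easy to compute on a concrete tree. The sketch below builds a standard (unbalanced) binary search tree by successive insertion and reports, for each key, its number of ascendants and descendants. The conventions are assumptions that may differ from the paper's: ascendants are counted as strict ancestors, descendants as the subtree size including the node itself, and keys are assumed distinct. The locally balanced variant is not covered.

```python
import random


def build_bst(keys):
    """Insert distinct keys in the given order; return (root, left, right)
    where left/right map a key to its child key."""
    left, right = {}, {}
    root = None
    for key in keys:
        if root is None:
            root = key
            continue
        node = root
        while True:
            child = left if key < node else right
            if node in child:
                node = child[node]
            else:
                child[node] = key
                break
    return root, left, right


def ascendants_and_descendants(keys):
    """Return (asc, desc): asc[k] = number of strict ancestors of k,
    desc[k] = size of the subtree rooted at k (k included)."""
    root, left, right = build_bst(keys)
    asc, desc = {}, {}
    stack = [(root, 0, False)]
    while stack:
        node, depth, children_done = stack.pop()
        if node is None:
            continue
        if children_done:
            kids = (left.get(node), right.get(node))
            desc[node] = 1 + sum(desc[c] for c in kids if c is not None)
        else:
            asc[node] = depth
            stack.append((node, depth, True))
            stack.append((left.get(node), depth + 1, False))
            stack.append((right.get(node), depth + 1, False))
    return asc, desc


def random_bst_stats(n, seed=0):
    """Same statistics for a BST built from a random permutation of 1..n."""
    keys = list(range(1, n + 1))
    random.Random(seed).shuffle(keys)
    return ascendants_and_descendants(keys)
```

With these conventions the total number of ascendants over all nodes equals the total number of descendants minus n, since each (ancestor, descendant) pair is counted once from either side.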
Local limit distributions in pattern statistics: beyond the Markovian models
 Proceedings 21st S.T.A.C.S., V. Diekert and M. Habib editors, Lecture Notes in Computer Science
, 2004
Cited by 3 (2 self)
Motivated by problems of pattern statistics, we study the limit distribution of the random variable counting the number of occurrences of the symbol $a$ in a word of length $n$ chosen at random in $\{a, b\}^*$, according to a probability distribution defined via a finite automaton equipped with positive real weights. We determine the local limit distribution of such a quantity under the hypothesis that the transition matrix naturally associated with the finite automaton is primitive. Our probabilistic model extends the Markovian models traditionally used in the literature on pattern statistics. This result is obtained by introducing a notion of symbol-periodicity for irreducible matrices whose entries are polynomials in one variable over an arbitrary positive semiring. This notion and the related results we prove are of interest in their own right, since they extend classical properties of the Perron-Frobenius theory for nonnegative real matrices.
A Unified Approach to the Analysis of Horton-Strahler Parameters of Binary Tree Structures
 J. DAIRY SCI
, 2001
Cited by 1 (0 self)
The Horton-Strahler number naturally arose from problems in various fields, e.g. geology, molecular biology and computer science. Consequently, detailed investigations of related parameters for different classes of binary tree structures are of interest. This paper shows one possibility of how to perform a mathematical analysis for parameters related to the Horton-Strahler number in a unified way such that only a single analysis is needed to obtain results for many different classes of trees. The method is explained by the examples of the expected Horton-Strahler number and the related $r$th moments, the average number of critical nodes and the expected distance between critical nodes.
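For readers unfamiliar with the parameter, the sketch below computes the Horton-Strahler number of a binary tree from its standard recursive definition. The tree encoding (`None` for an empty subtree, a pair `(left, right)` for an internal node) and the base value 0 for the empty tree are assumptions made for this illustration; some authors start from 1 at the leaves, which shifts every value by one.

```python
def horton_strahler(tree):
    """Horton-Strahler number of a binary tree.

    `tree` is None (empty subtree, value 0 by the convention assumed here)
    or a pair (left, right). A node takes the larger value of its two
    subtrees, plus one when both subtrees have the same value."""
    if tree is None:
        return 0
    left, right = tree
    hs_left = horton_strahler(left)
    hs_right = horton_strahler(right)
    if hs_left == hs_right:
        return hs_left + 1
    return max(hs_left, hs_right)
```

A path-like tree keeps the value 1 however long it is, while a complete binary tree of height h reaches h, so the number measures how "bushy" the branching is rather than the size.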
Enumeration of Geometric Configurations on a Convex Polygon
, 1999
We survey recent work on the enumeration of noncrossing configurations on the set of vertices of a convex polygon, such as triangulations, trees, and forests. Exact formulae and limit laws are determined for several parameters of interest. In the second part of the talk we present results on the enumeration of chord diagrams (pairings of $2n$ vertices of a convex polygon by means of $n$ disjoint pairs). We present limit laws for the number of components, the size of the largest component and the number of crossings. The use of generating functions and of a variation of Lévy's continuity theorem for characteristic functions enables us to establish that most of the limit laws presented here are Gaussian. (Joint work by Marc Noy with Philippe Flajolet and others.) 1. Analytic Combinatorics of Noncrossing Configurations [3] 1.1. Connected Graphs and General Graphs. Let $\Pi_n = \{v_1, \ldots, v_n\}$ be a fixed set of points in the plane, conventionally ordered counterclockwise, that are verti...
Random Generation of Words of Algebraic Languages according to the frequencies of the letters
Let $L$ be an algebraic language on an alphabet $X = \{x_1, x_2, \ldots, x_k\}$, and $n$ a positive integer. We consider the problem of generating at random words of $L$ with respect to a given distribution of the number of occurrences of the letters. We consider two alternatives of the problem. In the first one, a vector of natural numbers $(n_1, n_2, \ldots, n_k)$ such that $n_1 + n_2 + \cdots + n_k = n$ is given, and the words must be generated uniformly among the set of words of $L$ which contain exactly $n_i$ letters $x_i$ $(1 \le i \le k)$. The second alternative consists, given $v = (v_1, \ldots, v_k)$ a vector of positive real numbers such that $v_1 + \cdots + v_k = 1$, in generating at random words among the whole set of words of $L$ of length $n$, in such a way that the expected number of occurrences of any letter $x_i$ equals $n v_i$ $(1 \le i \le k)$, and two words having the same distribution of letters have the same probability of being generated. For this purpose, we design and study two alternatives of the recursive method which is classically employed for the uniform generation of combinatorial structures. This type of "controlled" nonuniform generation is of great interest in the statistical study of genomic sequences.
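The first alternative (exact letter counts, uniform output) can be illustrated with the classical recursive method on one small algebraic language. The example language is an assumption chosen for brevity: Motzkin words over {a, b, c}, where a is an up step, b a down step, c a flat step, and no prefix has more b's than a's. The function names are hypothetical, and the sketch does not attempt the paper's general algorithms or its second, frequency-based alternative.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def completions(height, ups, downs, flats):
    """Number of ways to finish a Motzkin word from the current height
    using exactly the remaining letter counts."""
    if height < 0:
        return 0
    if ups == 0 and downs == 0 and flats == 0:
        return 1 if height == 0 else 0
    total = 0
    if ups:
        total += completions(height + 1, ups - 1, downs, flats)
    if downs:
        total += completions(height - 1, ups, downs - 1, flats)
    if flats:
        total += completions(height, ups, downs, flats - 1)
    return total


def random_motzkin_word(ups, downs, flats, rng=random):
    """Uniform random Motzkin word with the given numbers of a, b, c.

    Recursive method: each next letter is drawn with probability
    proportional to the number of valid completions it leaves."""
    if completions(0, ups, downs, flats) == 0:
        raise ValueError("no word with these letter counts")
    remaining = {"a": ups, "b": downs, "c": flats}
    step = {"a": 1, "b": -1, "c": 0}
    word, height = [], 0
    while any(remaining.values()):
        letters, weights = [], []
        for letter in "abc":
            if remaining[letter]:
                trial = dict(remaining)
                trial[letter] -= 1
                letters.append(letter)
                weights.append(completions(height + step[letter],
                                           trial["a"], trial["b"], trial["c"]))
        choice = rng.choices(letters, weights=weights)[0]
        remaining[choice] -= 1
        height += step[choice]
        word.append(choice)
    return "".join(word)
```

Because the probability of each letter is the fraction of completions that begin with it, the product of the step probabilities along any valid word is 1 / completions(0, ups, downs, flats), which is what makes the output uniform.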