Solving the string statistics problem in time O(n log n)
Proc. 29th International Colloquium on Automata, Languages, and Programming, 2002
Cited by 9 (0 self)
The string statistics problem consists of preprocessing a string of length n such that given a query pattern of length m, the maximum number of nonoverlapping occurrences of the query pattern in the string can be reported efficiently...
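The query described in this abstract can be illustrated with a naive baseline. The sketch below does no preprocessing and scans the string greedily from the left, which is enough to obtain the maximum number of nonoverlapping occurrences. The function name `max_nonoverlapping` is an assumption made for illustration, and this is not the preprocessing-based method of the paper.

```python
def max_nonoverlapping(text: str, pattern: str) -> int:
    """Count the maximum number of nonoverlapping occurrences of pattern in text.

    Greedy rule: always take the leftmost occurrence that starts at or
    after the end of the previously chosen one.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        # Resume the scan just past the occurrence we accepted.
        pos = text.find(pattern, pos + len(pattern))
    return count
```

For instance, `max_nonoverlapping("abababa", "aba")` accepts the occurrences at positions 0 and 4 and skips the one at position 2, because it overlaps the first.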
Finding maximal quasiperiodicities in strings
In Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching (CPM), 2000
Cited by 7 (3 self)
Abstract. Apostolico and Ehrenfeucht defined the notion of a maximal quasiperiodic substring and gave an algorithm that finds all maximal quasiperiodic substrings in a string of length n in time O(n log² n). In this paper we give an algorithm that finds all maximal quasiperiodic substrings in a string of length n in time O(n log n) and space O(n). Our algorithm uses the suffix tree as the fundamental data structure combined with efficient methods for merging and performing multiple searches in search trees. Besides finding all maximal quasiperiodic substrings, our algorithm also marks the nodes in the suffix tree that have a superprimitive path-label.
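The notions of quasiperiodic and superprimitive strings can be made concrete with a brute-force check. The sketch assumes the usual definition: a string is quasiperiodic if it is completely covered by (possibly overlapping) occurrences of some shorter string, and superprimitive otherwise. The names `covers` and `is_superprimitive` are illustrative assumptions, and this is not the suffix-tree algorithm of the paper.

```python
def covers(q: str, w: str) -> bool:
    """Return True if occurrences of q (overlaps allowed) cover every position of w."""
    if not q or len(q) > len(w):
        return False
    covered_up_to = 0  # invariant: w[:covered_up_to] is covered so far
    pos = w.find(q)
    # Each next occurrence must start no later than the end of the covered prefix.
    while pos != -1 and pos <= covered_up_to:
        covered_up_to = pos + len(q)
        pos = w.find(q, pos + 1)
    return covered_up_to == len(w)


def is_superprimitive(w: str) -> bool:
    """Return True if no proper prefix of w covers w (a cover is always a prefix)."""
    return not any(covers(w[:k], w) for k in range(1, len(w)))
```

Here `covers("aba", "abaababaaba")` holds, so that string is quasiperiodic under the assumed definition, while `"aba"` itself is superprimitive.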
On Hairpin-Free Words and Languages
Cited by 5 (1 self)
Abstract. The paper examines the concept of hairpin-free words motivated from the biocomputing and bioinformatics fields. Hairpin(-free) DNA structures have numerous applications to DNA computing and molecular genetics in general. A word is called hairpin-free if it cannot be written in the form xvyθ(v)z, with certain additional conditions, for an involution θ (a function θ with the property that θ² equals the identity function). We consider three involutions relevant to DNA computing: a) the mirror image function, b) the DNA complementarity function over the DNA alphabet {A, C, G, T} which associates A with T and C with G, and c) the Watson-Crick involution which is the composition of the previous two. We study elementary properties and finiteness of hairpin(-free) languages w.r.t. the involutions a) and c). Maximal length of hairpin-free words is also examined. Finally, descriptional complexity of maximal hairpin-free languages is determined.
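The three involutions named in this abstract, and the factorization xvyθ(v)z, can be sketched in a few lines. The abstract does not state the "additional conditions", so the minimum length `k` of v used below is an assumption, as are all function names. The check only tries factors v of length exactly k, which suffices for these three involutions because a longer v can be trimmed to length k.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mirror(w: str) -> str:
    """Involution a): the mirror image (reversal) of w."""
    return w[::-1]


def complement(w: str) -> str:
    """Involution b): swap A with T and C with G, letter by letter."""
    return "".join(COMPLEMENT[c] for c in w)


def watson_crick(w: str) -> str:
    """Involution c): the composition of complement and mirror image."""
    return mirror(complement(w))


def has_hairpin(w: str, theta, k: int) -> bool:
    """Return True if w = x v y theta(v) z for some v with len(v) >= k.

    k is an assumed stand-in for the unspecified additional conditions.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    for i in range(len(w) - k + 1):
        v = w[i:i + k]
        # theta(v) must occur somewhere after the end of v.
        if w.find(theta(v), i + k) != -1:
            return True
    return False
```

Applying any of the three functions twice returns the original word, which is the involution property θ² = identity mentioned in the abstract.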
String Pattern Matching For A Deluge Survival Kit
2000
Cited by 5 (1 self)
String Pattern Matching concerns itself with algorithmic and combinatorial issues related to matching and searching on linearly arranged sequences of symbols, arguably the simplest possible discrete structures. As unprecedented volumes of sequence data are amassed, disseminated and shared at an increasing pace, effective access to, and manipulation of such data depend crucially on the efficiency with which strings are structured, compressed, transmitted, stored, searched and retrieved. This paper samples from this perspective, and with the authors' own bias, a rich arsenal of ideas and techniques developed in more than three decades of history.
Finger Search Trees
2005
Cited by 5 (0 self)
One of the most studied problems in computer science is the problem of maintaining a sorted sequence of elements to facilitate efficient searches. The prominent solution to the problem is to organize the sorted sequence as a balanced search tree, enabling insertions, deletions and searches in logarithmic time. Many different search trees have been developed and studied intensively in the literature. A discussion of balanced binary search trees can e.g. be found in [4]. This chapter is devoted to finger search trees which are search trees supporting fingers, i.e. pointers, to elements in the search trees and supporting efficient updates and searches in the vicinity of the fingers. If the sorted sequence is a static set of n elements then a simple and space efficient representation is a sorted array. Searches can be performed by binary search using 1 + ⌊log n⌋ comparisons (we throughout this chapter let log x denote log₂ max{2, x}). A finger search starting at a particular element of the array can be performed by an exponential search by inspecting elements at distance 2^i − 1 from the finger for increasing i followed by a binary search in a range of 2^⌊log d⌋ − 1 elements, where d is the rank difference in the sequence between the finger and the search element. In Figure 11.1 is shown an exponential search for the element 42 starting at 5. In the example d = 20. An exponential search requires
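The exponential search on a sorted array described in this abstract can be sketched as follows. It probes at distance 2^i − 1 from the finger for increasing i and then finishes with a binary search between the last two probes. The function name `finger_search` and its return convention (the leftmost index whose element is at least x) are assumptions made for this illustration.

```python
from bisect import bisect_left


def finger_search(a, finger, x):
    """Return the leftmost index i with a[i] >= x (len(a) if none).

    a is a sorted list and finger is a valid index into it; the cost
    depends on the distance between finger and the answer, not on len(a).
    """
    n = len(a)
    if n == 0:
        return 0
    i = 1
    if a[finger] < x:
        # Search to the right: keep a[lo] < x while probing at finger + 2^i - 1.
        lo, hi = finger, finger + 1
        while hi < n and a[hi] < x:
            lo = hi
            i += 1
            hi = finger + (1 << i) - 1
        return bisect_left(a, x, lo + 1, min(hi, n))
    # Search to the left: keep a[hi] >= x while probing at finger - (2^i - 1).
    lo, hi = finger - 1, finger
    while lo >= 0 and a[lo] >= x:
        hi = lo
        i += 1
        lo = finger - ((1 << i) - 1)
    return bisect_left(a, x, max(lo + 1, 0), hi)
```

The result agrees with `bisect_left(a, x)` for every choice of finger; only the number of elements inspected changes.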
On Maximal Repeats in Strings
Cited by 1 (0 self)
We clarify in this paper the relationship between the maximal repeats in a string p and the compact suffix automaton CSA(p) built on p. It appears that the maximal repeats are the longest strings reaching each internal state of the CSA(p). This result makes it possible to derive the maximal and the average number of maximal repeats (under a model of independence and equiprobability of the characters of p) from earlier studies on the size of CSA(p). It also yields a simpler algorithm for enumerating all the maximal repeats in p.
Computational Biology
2000
Cited by 1 (0 self)
During four years of arduous service, a Ph.D. student is expected to familiarise himself with his field of research, and, hopefully, contribute to this field. This is reflected by the division of this dissertation into two parts. Part I is a (partial) overview of the field of computational biology as I conceive it, an overview that is aimed at presenting the context for my contributions to the field of computational biology. These contributions are presented in part II as five independent articles.
Fast Optimal Algorithms for Computing All the Repeats in a String
Cited by 1 (0 self)
Abstract. Given a string x = x[1..n] on an alphabet of size α, and a threshold p_min ≥ 1, we first describe a new algorithm PSY1 that, based on suffix array construction, computes all the complete nonextendible repeats in x of length p ≥ p_min. PSY1 executes in Θ(n) time independent of alphabet size and is an order of magnitude faster than the two other algorithms previously proposed for this problem. Second, we describe a new fast algorithm PSY2 for computing all complete supernonextendible repeats in x that also executes in Θ(n) time independent of alphabet size, thus asymptotically faster than methods previously proposed. Both algorithms require 9n bytes of storage, including preprocessing (with a minor caveat for PSY1). We conclude with a brief discussion of applications to bioinformatics and data compression.
Exercise 23.16 in [6].
2000
2: Array initialization Section III.8.1 of [15] contains a description of how a bitvector can be initialized in worst-case constant time.
Literature Notes on Homeworks and the Take-home Exam
2000
Algorithm Exercise 2 of homework 4 in [13]. Prim's algorithm for computing a minimum spanning tree is, e.g., described in Section 24.2 of [6]. 5: The Stable Marriage Problem Exercise 1 of homework 6 in [13]. Gale and Shapley were the first to investigate the stable marriage problem and gave an O(n²) solution in [8]. 1 Algorithms January 17, 2000 Homework 3 1: Minimum Spanning Trees A linear time algorithm for finding a minimum spanning tree for planar graphs was first given in [5]. The O(m log n) time algorithm for finding a minimum spanning tree in a general graph was described in [7], the paper introducing Fibonacci heaps. The current best deterministic minimum spanning tree algorithms use time O(m α(m, n)), where α is an inverse of Ackermann's function [4, 17]. A randomized