Algorithms on Compressed Strings and Arrays
 In Proc. 26th Ann. Conf. on Current Trends in Theory and Practice of Informatics
, 1999
Cited by 18 (0 self)
We survey the complexity issues related to several algorithmic problems for compressed one- and two-dimensional texts without explicit decompression: pattern matching, equality testing, computation of regularities, subsegment extraction, language membership, and solvability of word equations. Our basic problem is one- and two-dimensional pattern matching together with its variations. For some types of compression the pattern-matching problems are infeasible (NP-hard); for other types they are solvable in polynomial time, and we discuss how to reduce the degree of the corresponding polynomials.

1 Introduction

In the last decade a new stream of research related to data compression has emerged: algorithms on compressed objects. It has been caused by the increase in the volume of data and the need to store and transmit masses of information in compressed form. The compressed information has to be quickly accessed and processed without explicit decompression. In this paper we consider severa...
On the Determinization of Weighted Finite Automata
 SIAM J. Comput
, 1998
Cited by 16 (0 self)
We study determinization of weighted finite-state automata (WFAs), which has important applications in automatic speech recognition (ASR). We provide the first polynomial-time algorithm to test for the twins property, which determines if a WFA admits a deterministic equivalent. We also provide a rigorous analysis of a determinization algorithm of Mohri, with tight bounds for acyclic WFAs. Given that WFAs can expand exponentially when determinized, we explore why those used in ASR tend to shrink. The folklore explanation is that ASR WFAs have an acyclic, multipartite structure. We show, however, that there exist such WFAs that always incur exponential expansion when determinized. We then introduce a class of WFAs, also with this structure, whose expansion depends on the weights: some weightings cause them to shrink, while others, including random weightings, cause them to expand exponentially. We provide experimental evidence that ASR WFAs exhibit this weight dependence. ...
Efficiency of Fast Parallel Pattern-Searching in Highly Compressed Texts
Cited by 2 (0 self)
We consider the efficiency of NC algorithms for pattern searching in highly compressed one- and two-dimensional texts. "Highly compressed" means that the text can be exponentially large with respect to its compressed version, and "fast" means "in polylogarithmic time". Given an uncompressed pattern P and a compressed version of a text T, the compressed matching problem is to test whether P occurs in T. Two closely related types of compressed representations of 1-dimensional texts are considered: the Lempel-Ziv encodings (LZ, in short) and restricted LZ encodings (RLZ, in short). For highly compressed texts there is only a small difference between them: in extreme situations both compress the text exponentially, e.g. Fibonacci words of size N have compressed versions of size O(log N) for both LZ and restricted LZ encodings. An efficient sequential algorithm for LZ-compressed matching was given in [7]; we show that this algorithm is inherently sequential. Despite similarities we prove that LZ-compressed m...
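
The exponential gap mentioned for Fibonacci words can be illustrated with a minimal Python sketch. It does not use the LZ or RLZ encodings from the abstract; as a stand-in it assumes the recursive description F_1 = "b", F_2 = "a", F_k = F_{k-1} F_{k-2}, which needs only n rules for a word whose length grows exponentially in n. The function names (`fib_word_lengths`, `char_at`, `expand`) are hypothetical and chosen for this illustration only.

```python
def fib_word_lengths(n):
    """Lengths of F_1..F_n, where F_1 = 'b', F_2 = 'a', F_k = F_{k-1} F_{k-2}.

    Index 0 is an unused placeholder so that lengths[k] is the length of F_k.
    """
    lengths = [0, 1, 1]
    for k in range(3, n + 1):
        lengths.append(lengths[k - 1] + lengths[k - 2])
    return lengths


def char_at(n, i, lengths):
    """Return character i (0-based) of F_n without building the word.

    Walks down the rules F_k = F_{k-1} F_{k-2}, so it takes at most n steps
    even though F_n itself is exponentially long in n.
    """
    while n > 2:
        if i < lengths[n - 1]:
            n -= 1                 # position falls in the left part F_{n-1}
        else:
            i -= lengths[n - 1]    # position falls in the right part F_{n-2}
            n -= 2
    return "a" if n == 2 else "b"


def expand(n):
    """Explicitly build F_n (exponential size); used only to check char_at."""
    prev, cur = "b", "a"           # F_1, F_2
    if n == 1:
        return prev
    for _ in range(n - 2):
        prev, cur = cur, cur + prev
    return cur


if __name__ == "__main__":
    lengths = fib_word_lengths(80)
    # F_80 is far too long to store, but a single character is cheap to read.
    print(len(str(lengths[80])), "decimal digits in |F_80|")
    print(char_at(80, lengths[80] // 2, lengths))
```

Here the description of F_n has size proportional to n, which is O(log N) for a word of length N, and `char_at` reads a position of the text without explicit decompression. Actual compressed pattern matching on LZ encodings requires considerably more machinery than this sketch shows.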