Word problems and membership problems on compressed words
 SIAM J. Comput., 35(5):1210
Cited by 11 (7 self)
Abstract. We consider a compressed form of the word problem for finitely presented monoids, where the input consists of two compressed representations of words over the generators of a monoid M, and we ask whether these two words represent the same monoid element of M. Words are compressed using straight-line programs, i.e., context-free grammars that generate exactly one word. For several classes of finitely presented monoids we obtain completeness results for complexity classes in the range from P to EXPSPACE. As a by-product of our results on compressed word problems we obtain a fixed deterministic context-free language with a PSPACE-complete compressed membership problem. The existence of such a language was open so far. Finally, we will investigate the complexity of the compressed membership problem for various circuit complexity classes.
Key words. grammar-based compression, word problems for monoids, context-free languages, complexity
AMS subject classifications. 20F10, 68Q17, 68Q42
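To make the notion of a straight-line program concrete, here is a minimal Python sketch. It assumes a hypothetical encoding that is not taken from the paper: each nonterminal has exactly one rule, which is either a terminal string or a pair of nonterminals. The names `expand`, `length`, and `power_slp` are illustrative only.

```python
# A straight-line program (SLP): a context-free grammar in which every
# nonterminal has exactly one rule and there are no cycles, so the grammar
# generates exactly one word. The dict encoding below is an assumption
# made for illustration.

def expand(rules, symbol):
    """Decompress: return the unique word derived from `symbol`."""
    rhs = rules[symbol]
    if isinstance(rhs, str):          # terminal rule, e.g. X -> "a"
        return rhs
    left, right = rhs                 # binary rule, e.g. X -> Y Z
    return expand(rules, left) + expand(rules, right)

def length(rules, symbol):
    """Length of the derived word, computed without decompressing it."""
    cache = {}

    def go(sym):
        if sym not in cache:
            rhs = rules[sym]
            if isinstance(rhs, str):
                cache[sym] = len(rhs)
            else:
                cache[sym] = go(rhs[0]) + go(rhs[1])
        return cache[sym]

    return go(symbol)

def power_slp(k):
    """SLP with k + 1 rules whose start symbol A<k> derives a^(2^k)."""
    rules = {"A0": "a"}
    for i in range(1, k + 1):
        rules[f"A{i}"] = (f"A{i - 1}", f"A{i - 1}")
    return rules

if __name__ == "__main__":
    small = power_slp(3)
    print(expand(small, "A3"))            # the word a^8
    print(length(power_slp(50), "A50"))   # 2**50, never materialized
```

The second call illustrates why compressed problems can be harder than their uncompressed versions: an SLP with 51 rules describes a word of length 2^50, so decompressing first is not a polynomial-time strategy.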
Structural selectivity estimation for XML documents
 In ICDE
, 2007
Cited by 9 (3 self)
Estimating the selectivity of queries is a crucial problem in database systems. Virtually all database systems rely on the use of selectivity estimates to choose amongst the many possible execution plans for a particular query. In terms of XML databases, the problem of selectivity estimation of queries presents new challenges: many evaluation operators are possible, such as simple navigation, structural joins, or twig joins, and many different indexes are possible, ranging from traditional B-trees to complicated XML-specific graph indexes. A new synopsis for XML documents is introduced which can be effectively used to estimate the selectivity of complex path queries. The synopsis is based on a lossy compression of the document tree that underlies the XML document, and can be computed in one pass from the document. It has several advantages over existing approaches: (1) it allows one to estimate the selectivity of queries containing all XPath axes, including the order-sensitive ones, (2) the estimator returns a range within which the actual selectivity is guaranteed to lie, with the size of this range implicitly providing a confidence measure of the estimate, and (3) the synopsis can be incrementally updated to reflect changes in the XML database.
Random Access to Grammar-Compressed Strings
, 2011
Cited by 9 (0 self)
Let S be a string of length N compressed into a context-free grammar 𝒮 of size n. We present two representations of 𝒮 achieving O(log N) random access time, and either O(n · α_k(n)) construction time and space on the pointer machine model, or O(n) construction time and space on the RAM. Here, α_k(n) is the inverse of the k-th row of Ackermann’s function. Our representations also efficiently support decompression of any substring in S: we can decompress any substring of length m in the same complexity as a single random access query and additional O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern P with at most k errors in time O(n(min{|P|k, k^4 + |P|} + log N) + occ), where occ is the number of occurrences of P in S. Finally, we are able to generalize our results to navigation and other operations on grammar-compressed trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two “biased” weighted ancestor data structures, and a compact representation of heavy paths in grammars.
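For orientation, the following Python sketch shows the naive baseline for random access on a grammar-compressed string: store the expansion length of every nonterminal and walk down from the start symbol. This is not the paper's data structure. Its query time is proportional to the height of the grammar, which can be far larger than the O(log N) bound stated above. The rule encoding and the names `expansion_lengths`, `access`, and `fibonacci_slp` are assumptions made for illustration.

```python
def expansion_lengths(rules, order):
    """Length of the word derived from each nonterminal.

    `order` lists the nonterminals so that every rule refers only to
    symbols that appear earlier (a bottom-up order). Terminal rules are
    assumed to hold a single character.
    """
    sizes = {}
    for sym in order:
        rhs = rules[sym]
        if isinstance(rhs, str):
            sizes[sym] = 1
        else:
            sizes[sym] = sizes[rhs[0]] + sizes[rhs[1]]
    return sizes

def access(rules, sizes, start, i):
    """Return character i (0-based) of the word derived from `start`.

    Baseline only: one step per grammar level, so the cost is the height
    of the derivation tree, not O(log N).
    """
    if not 0 <= i < sizes[start]:
        raise IndexError(i)
    sym = start
    while True:
        rhs = rules[sym]
        if isinstance(rhs, str):
            return rhs
        left, right = rhs
        if i < sizes[left]:
            sym = left                # position lies in the left part
        else:
            i -= sizes[left]          # skip the left part
            sym = right

def fibonacci_slp(n):
    """Hypothetical example grammar: F1 -> b, F2 -> a, Fk -> F(k-1) F(k-2)."""
    rules = {"F1": "b", "F2": "a"}
    order = ["F1", "F2"]
    for k in range(3, n + 1):
        rules[f"F{k}"] = (f"F{k - 1}", f"F{k - 2}")
        order.append(f"F{k}")
    return rules, order

if __name__ == "__main__":
    rules, order = fibonacci_slp(40)
    sizes = expansion_lengths(rules, order)
    print(sizes["F40"])                            # length N of the string
    print(access(rules, sizes, "F40", sizes["F40"] - 1))
```

The Fibonacci grammar is a useful stress case because its height grows linearly with the grammar size, which is exactly the situation where a height-dependent walk is slow and a balanced or heavy-path-based representation pays off.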
Monadic second-order unification is NP-complete
 In RTA’04, volume 3091 of LNCS
, 2004
Cited by 7 (5 self)
Abstract. Bounded Second-Order Unification is the problem of deciding, for a given second-order equation t ≟ u and a positive integer m, whether there exists a unifier σ such that, for every second-order variable F, the terms instantiated for F have at most m occurrences of every bound variable. It is already known that Bounded Second-Order Unification is decidable and NP-hard, whereas general Second-Order Unification is undecidable. We prove that Bounded Second-Order Unification is NP-complete, provided that m is given in unary encoding, by proving that a size-minimal solution can be represented in polynomial space, and then applying a generalization of Plandowski’s polynomial algorithm that compares compacted terms in polynomial time.
Stratified context unification is NP-complete
 In Proc. of the 3rd International Joint Conference on Automated Reasoning, IJCAR’06
, 2006
Cited by 7 (2 self)
Abstract. Context Unification is the problem of deciding, for a given set of second-order equations E where all second-order variables are unary, whether there exists a unifier such that, for every second-order variable X, the abstraction λx.r instantiated for X has exactly one occurrence of the bound variable x in r. Stratified Context Unification is a specialization where the nesting of second-order variables in E is restricted. It is already known that Stratified Context Unification is decidable, NP-hard, and in PSPACE, whereas the decidability and the complexity of Context Unification are unknown. We prove that Stratified Context Unification is in NP by proving that a size-minimal solution can be represented in a singleton tree grammar of polynomial size, and then applying a generalization of Plandowski’s polynomial algorithm that compares compacted terms in polynomial time. This also demonstrates the high potential of singleton tree grammars for optimizing programs maintaining large terms. A corollary of our result is that solvability of rewrite constraints is NP-complete.
XQueC: A query-conscious compressed XML database
 ACM Trans. Internet Tech
Cited by 6 (1 self)
XML compression has gained prominence recently because it counters the disadvantage of the “verbose” representation XML gives to data. In many applications, such as data exchange and data archiving, entirely compressing and decompressing a document is acceptable. In other applications, where queries must be run over compressed documents, compression may not be beneficial since the performance penalty in running the query processor over compressed data outweighs the data compression benefits. While balancing the interests of compression and query processing has received significant attention in the domain of relational databases, these results do not immediately translate to XML data. In this paper, we address the problem of embedding compression into XML databases without degrading query performance. Since the setting is rather different from relational databases, the choice of compression granularity and compression algorithms must be revisited. Query execution in the compressed domain must also be rethought in the framework of XML query processing, due to the richer structure of XML data. Indeed, a proper storage design for the compressed data plays a crucial role here. The XQueC system (standing for XQuery Processor and Compressor) covers a wide set of
XML tree structure compression
 In: XANTEC
Cited by 5 (0 self)
In an XML document a considerable fraction consists of markup, that is, begin- and end-element tags describing the document’s tree structure. XML compression tools such as XMill separate the tree structure from the data content and compress each separately. The main focus in these compression tools is how to group similar data content together prior to performing standard data compression such as gzip, bzip2, or ppm. In contrast, the focus of this paper is on compressing the tree structure part of an XML document. We use a known algorithm to derive a grammar representation of the tree structure which factors out the repetition of tree patterns. We then investigate several succinct binary encodings of these grammars. Our experiments show that we can be consistently smaller than the tree structure compression carried out by XMill, using the same backend compressors as XMill on our encodings. However, the most surprising result is that our own Huffman-like encoding of the grammars (without any backend compressor whatsoever) consistently outperforms XMill with gzip backend. This is of particular interest because our Huffman-like encoding can be queried without prior decompression. To the best of our knowledge this offers the smallest queriable XML tree structure representation currently available.
Tree automata and XPath on compressed trees
 Proceedings of the Tenth International Conference on Implementation and Application of Automata (CIAA 2005), Sophia Antipolis (France), number 3845 in Lecture Notes in Computer Science
, 2006
Pattern Matching of Compressed Terms and Contexts and Polynomial Rewriting
, 2011
Cited by 3 (1 self)
Abstract. A generalization of the compressed string pattern match that applies to terms with variables is investigated: given terms s and t compressed by singleton tree grammars, the task is to find an instance of s that occurs as a subterm in t. We show that this problem is in NP and that the task can be performed in time O(n^{c|Var(s)|}), including the construction of the compressed substitution and a representation of all occurrences. We show that the special case where s is uncompressed can be performed in polynomial time. As a nice application we show that for an equational deduction of t to t′ by an equality axiom l = r (a rewrite), a single step can be performed in time polynomial in the size of the compression of t and of l, r, if the number of variables in l is fixed. We also show that n rewriting steps can be performed in polynomial time if the equational axioms are compressed and assumed to be constant for the rewriting sequence. Another potential application is querying mechanisms on compressed XML databases.
First-Order Unification on Compressed Terms
, 2011
Cited by 2 (1 self)
Singleton Tree Grammars (STGs) have recently drawn considerable attention. They generalize the sharing of subtrees known from DAGs to sharing of connected subgraphs. This allows one to obtain smaller in-memory representations of trees than with DAGs. In the past years some important tree algorithms were proved to perform efficiently (without decompression) over STGs, e.g., type checking, equivalence checking, and unification. We present a tool that implements an extension of the unification algorithm for STGs. This algorithm makes extensive use of equivalence checking. For the latter we implemented two variants, the classical exact one and a recent randomized one. Our experiments show that the randomized algorithm performs better. The running times are also compared to those of unification over uncompressed trees. Digital Object Identifier 10.4230/LIPIcs.RTA.2011.51
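As a point of reference for the DAG sharing that STGs generalize, the Python sketch below builds the minimal DAG of a tree by hash-consing, so that each distinct subtree is stored once. It illustrates only the DAG baseline, not singleton tree grammars, which can additionally share connected subgraphs (contexts). The tree encoding `(label, [children])` and the names `build_dag`, `tree_size`, and `full_binary_tree` are assumptions made for illustration.

```python
def build_dag(tree):
    """Share identical subtrees of `tree`, given as (label, [children]).

    Returns (root_id, nodes), where nodes[i] == (label, child_ids) and
    every distinct subtree is stored exactly once.
    """
    ids = {}       # (label, child ids) -> node id
    nodes = []     # node id -> (label, child ids)

    def visit(node):
        label, children = node
        key = (label, tuple(visit(child) for child in children))
        if key not in ids:            # first time this subtree is seen
            ids[key] = len(nodes)
            nodes.append(key)
        return ids[key]

    root = visit(tree)
    return root, nodes

def tree_size(tree):
    """Number of nodes in the uncompressed tree."""
    return 1 + sum(tree_size(child) for child in tree[1])

def full_binary_tree(depth):
    """Full binary tree with inner label 'f' and leaf label 'a'."""
    if depth == 0:
        return ("a", [])
    return ("f", [full_binary_tree(depth - 1), full_binary_tree(depth - 1)])

if __name__ == "__main__":
    t = full_binary_tree(10)
    root, nodes = build_dag(t)
    print(tree_size(t), len(nodes))   # tree nodes vs. DAG nodes
```

On a full binary tree the DAG needs only one node per level. Trees whose repetition sits in the middle rather than at the leaves, such as long chains with varying leaves, gain little from DAG sharing, which is the case the sharing of connected subgraphs is meant to address.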