Results 1 – 10 of 11
The SBC-Tree: An Index for Run-Length Compressed Sequences
, 2005
Building a Complete Inverted File for a Set of Text Files in Linear Time (all correspondence to D. Haussler)
Abstract
Given a finite set of texts S = {w1, ..., wk} over some fixed finite alphabet Σ, a complete inverted file for S is an abstract data type that provides the functions find(w), which returns the longest prefix of w which occurs in S; freq(w), which returns the number of times w occurs in S
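The find/freq contract described above can be sketched naively (this is only an illustration of the interface, not the linear-time index construction the paper is about; the class name and the sample texts are invented):

```python
# Minimal sketch of the inverted-file interface. This naive version
# scans the texts directly; the paper's contribution is an index that
# supports these queries efficiently, built in linear time.

class NaiveInvertedFile:
    def __init__(self, texts):
        self.texts = texts

    def find(self, w):
        """Longest prefix of w that occurs (as a substring) in some text."""
        for k in range(len(w), -1, -1):
            if any(w[:k] in t for t in self.texts):
                return w[:k]
        return ""

    def freq(self, w):
        """Number of (non-overlapping) occurrences of w across all texts."""
        return sum(t.count(w) for t in self.texts)

inv = NaiveInvertedFile(["abracadabra", "cadabra"])
print(inv.find("abrax"))  # abra
print(inv.freq("abra"))   # 3
```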
Efficient Algorithms for Lempel-Ziv Encoding
Abstract
We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when the original (uncompressed) texts are of exponential size. The growing importance of massively stored information requires new approaches to algorithms for compressed texts without decompressing. Denote by LZ(w) the version of a string w produced by the Lempel-Ziv encoding algorithm. For given compressed strings LZ(T), LZ(P) we give the first known deterministic polynomial-time algorithms
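As an illustration of what LZ(w) might look like, here is a sketch of LZ78-style factorization (an assumption for illustration; the paper may target a different Lempel-Ziv variant):

```python
def lz78(w):
    """LZ78-style factorization: each phrase is the longest previously
    seen phrase plus one new character, emitted as (phrase_index, char)."""
    dictionary = {"": 0}   # phrase -> index; index 0 is the empty phrase
    out, phrase = [], ""
    for ch in w:
        if phrase + ch in dictionary:
            phrase += ch       # extend the current phrase
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:                 # flush a trailing, already-known phrase
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out

print(lz78("aaabbb"))  # [(0, 'a'), (1, 'a'), (0, 'b'), (3, 'b')]
```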
Algorithms for Exploring an Unknown Graph
, 1991
Abstract

Cited by 6 (5 self)
We consider the problem of exploring an unknown strongly connected directed graph. We use the exploration model introduced by Deng and Papadimitriou [DP90]. An explorer follows the edges of an unknown graph until she has seen all the edges and vertices of the graph. The explorer does not know how many vertices and edges the graph has, or how the vertices are connected. At each vertex the explorer can see how many edges leave the vertex, but she does not know where they lead. She chooses one such edge and explores it by traversing it. Deng and Papadimitriou [DP90] have shown that the graph exploration problem for graphs that are very similar to Eulerian graphs can be solved efficiently. They introduce the notion of deficiency for such graphs to measure the "distance" from being Eulerian and give algorithms that solve the exploration problem for deficiency-one and bounded-deficiency graphs. We review
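A naive version of this exploration model can be sketched as follows (a greedy baseline, not the deficiency-bounded algorithms the abstract refers to; the adjacency-dict representation is an assumption):

```python
from collections import deque

def explore(adj, start):
    """Greedy online exploration of an unknown, strongly connected digraph.
    adj maps each vertex to its list of out-neighbours, but the explorer
    only learns where an edge leads by traversing it. Returns the total
    number of edge traversals (the cost the paper's algorithms bound in
    terms of the graph's deficiency)."""
    total = sum(len(outs) for outs in adj.values())
    seen = set()          # traversed edges, as (tail, out-edge index)
    cur, cost = start, 0
    while len(seen) < total:
        fresh = [i for i in range(len(adj[cur])) if (cur, i) not in seen]
        if fresh:
            # traverse an unexplored out-edge of the current vertex
            seen.add((cur, fresh[0]))
            cur = adj[cur][fresh[0]]
            cost += 1
        else:
            # relocate along already-seen edges (BFS) to the nearest
            # vertex that still has an unexplored out-edge; strong
            # connectivity guarantees one is reachable
            parent, q, goal = {cur: None}, deque([cur]), None
            while q:
                u = q.popleft()
                if any((u, i) not in seen for i in range(len(adj[u]))):
                    goal = u
                    break
                for i in range(len(adj[u])):
                    if (u, i) in seen and adj[u][i] not in parent:
                        parent[adj[u][i]] = u
                        q.append(adj[u][i])
            target, steps = goal, 0
            while goal != cur:
                goal, steps = parent[goal], steps + 1
            cost += steps
            cur = target
    return cost

print(explore({0: [1], 1: [2], 2: [0]}, 0))  # a 3-cycle: 3 traversals
```

The greedy explorer can pay far more than the number of edges on unlucky choices; the paper's point is how the deficiency controls that overhead.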
Automatic Acquisition of TwoLevel Morphological Rules
 Proceedings of the Fifth Conference on Applied Natural Language Processing
, 1997
Abstract

Cited by 1 (0 self)
We describe and experimentally evaluate a complete method for the automatic acquisition of two-level rules for morphological analyzers/generators. The input to the system is sets of source-target word pairs, where the target is an inflected form of the source. There are two phases in the acquisition process: (1) segmentation of the target into morphemes and (2) determination of the optimal two-level rule set with minimal discerning contexts. In phase one, a minimal acyclic finite state automaton (AFSA) is constructed from string edit sequences of the input pairs. Segmentation of the words into morphemes is achieved through viewing the AFSA as a directed acyclic graph (DAG) and applying heuristics using properties of the DAG as well as the elementary edit operations. For phase two, the determination of the optimal rule set is made possible with a novel representation of rule contexts, with morpheme boundaries added, in a new DAG. We introduce the notion of a delimiter edge. Delimiter edges are used to select the correct two-level rule type as well as to extract minimal discerning rule contexts from the DAG. Results are presented for English adjectives, Xhosa noun locatives and Afrikaans noun plurals.
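The elementary edit operations mentioned in phase one can be computed with standard dynamic programming; a sketch (only the alignment step, not the AFSA construction; the function name and operation labels are invented):

```python
def edit_sequence(source, target):
    """Elementary edit operations aligning source to target
    (match / sub / ins / del), via the standard Levenshtein DP table
    followed by a backtrace."""
    n, m = len(source), len(target)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # delete
                          d[i][j - 1] + 1,        # insert
                          d[i - 1][j - 1] + cost) # match / substitute
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (
                0 if source[i - 1] == target[j - 1] else 1):
            kind = "match" if source[i - 1] == target[j - 1] else "sub"
            ops.append((kind, source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", source[i - 1], ""))
            i -= 1
        else:
            ops.append(("ins", "", target[j - 1]))
            j -= 1
    return ops[::-1]

print(edit_sequence("stop", "stopped"))
```

A pair like ("stop", "stopped") yields matches plus three insertions; such sequences are what the paper's minimal AFSA is built from.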
A Hierarchical Approach to Wrapper Induction
Abstract
With the tremendous amount of information that becomes available on the Web on a daily basis, the ability to quickly develop information agents has become a crucial problem. A vital component of any Web-based information agent is a set of wrappers that can extract the relevant data from semistructured information sources. Our novel approach to wrapper induction is based on the idea of hierarchical information extraction, which turns the hard problem of extracting data from an arbitrarily complex document into a series of easier extraction tasks. We introduce an inductive algorithm, STALKER, that generates high-accuracy extraction rules based on user-labeled training examples. Labeling the training data represents the major bottleneck in using wrapper induction techniques, and our experimental results show that STALKER does significantly better than other approaches: on the one hand, STALKER requires up to two orders of magnitude fewer examples than other algorithms, while on the other hand it can handle information sources that could not be wrapped by existing techniques.
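Landmark-based extraction rules in this spirit can be sketched as follows (the rule format and function names here are invented for illustration; they are not STALKER's actual syntax):

```python
def skip_to(document, landmarks):
    """Consume landmarks left to right; return the index just past the
    last landmark, or None if any landmark is missing."""
    pos = 0
    for lm in landmarks:
        hit = document.find(lm, pos)
        if hit == -1:
            return None
        pos = hit + len(lm)
    return pos

def extract(document, start_rule, end_marker):
    """Extract the text between the position reached by the start rule
    and the next occurrence of end_marker."""
    start = skip_to(document, start_rule)
    if start is None:
        return None
    end = document.find(end_marker, start)
    return document[start:end].strip() if end != -1 else None

page = "<b>Name:</b> Joe's Pizza <br> <b>Phone:</b> (206) 555-0100 <br>"
print(extract(page, ["Phone:", "</b>"], "<br>"))  # (206) 555-0100
```

Hierarchical extraction then applies such rules recursively: first isolate a list region, then each record, then each field inside a record.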
Parallel Algorithms for Evaluating Sequences of SetManipulation Operations
Abstract
Given an off-line sequence S of n set-manipulation operations, we investigate the parallel complexity of evaluating S (i.e., finding the response to every operation in S and returning the resulting set). We show that the problem of evaluating S is in NC for various combinations of common set-manipulation operations. Once we establish membership in NC (or, if membership in NC is obvious), we develop techniques for improving the time and/or processor
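A sequential baseline for evaluating such a sequence is straightforward (the paper's question is how to do this in parallel, within NC; the operation names below are assumptions for illustration):

```python
def evaluate(ops):
    """Sequentially evaluate an off-line sequence of set operations,
    returning the response to each operation and the resulting set."""
    s, responses = set(), []
    for op, x in ops:
        if op == "insert":
            s.add(x)
            responses.append(None)
        elif op == "delete":
            s.discard(x)
            responses.append(None)
        elif op == "member":
            responses.append(x in s)
    return responses, s

ops = [("insert", 3), ("insert", 5), ("member", 3), ("delete", 3), ("member", 3)]
print(evaluate(ops))  # ([None, None, True, None, False], {5})
```

The sequential loop is inherently order-dependent, which is exactly what makes the parallel (NC) formulation non-trivial.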
FOR THE COMMER: Ove Agl c 4
, 1990
Abstract
The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either
View Creation System
Abstract
The View Creation System (VCS) is an expert system that engages a user in a dialogue about the information requirements for some application, develops an Entity-Relationship model for the user’s database view, and then converts the ER model to a set of Fourth Normal Form relations. This paper describes the knowledge base of VCS. That is, it presents a formal methodology, capable of mechanization as a computer program, for accepting requirements from a user, identifying and resolving inconsistencies, redundancies, and ambiguities, and ultimately producing a normalized relational representation. Key aspects of the methodology are illustrated by applying VCS’s knowledge base to an actual database design task.
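The ER-to-relations step can be illustrated with a toy conversion (a stand-in for VCS's actual methodology; all names and the schema format are invented, and no normalization analysis is performed):

```python
def er_to_relations(entities, relationships):
    """Toy conversion of an ER model to relation schemas: one relation
    per entity (its key plus attributes) and one per binary relationship
    (the two participating keys)."""
    relations = {}
    for name, (key, attrs) in entities.items():
        relations[name] = [key] + attrs
    for name, (left, right) in relationships.items():
        relations[name] = [entities[left][0], entities[right][0]]
    return relations

entities = {
    "Student": ("student_id", ["name", "major"]),
    "Course": ("course_id", ["title"]),
}
relationships = {"Enrolls": ("Student", "Course")}
print(er_to_relations(entities, relationships))
```

VCS's contribution is everything this sketch omits: eliciting the model through dialogue and resolving inconsistencies before normalizing.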