Results 1  10
of
4,818
Multiparty Communication Complexity
, 1989
"... A given Boolean function has its input distributed among many parties. The aim is to determine which parties to tMk to and what information to exchange with each of them in order to evaluate the function while minimizing the total communication. This paper shows that it is possible to obtain the Boo ..."
Abstract

Cited by 760 (22 self)
 Add to MetaCart
A given Boolean function has its input distributed among many parties. The aim is to determine which parties to tMk to and what information to exchange with each of them in order to evaluate the function while minimizing the total communication. This paper shows that it is possible to obtain the Boolean answer deterministically with only a polynomial increase in communication with respect to the information lower bound given by the nondeterministic communication complexity of the function.
A Guided Tour to Approximate String Matching
 ACM COMPUTING SURVEYS
, 1999
"... We survey the current techniques to cope with the problem of string matching allowing errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining t ..."
Abstract

Cited by 598 (36 self)
 Add to MetaCart
We survey the current techniques to cope with the problem of string matching allowing errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms and their complexities. We present a number of experiments to compare the performance of the different algorithms and show which are the best choices according to each case. We conclude with some future work directions and open problems.
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
, 1997
"... In semistructured databases there is no schema fixed in advance. To provide the benefits of a schema in such environments, we introduce DataGuides: concise and accurate structural summaries of semistructured databases. DataGuides serve as dynamic schemas, generated from the database; they are ..."
Abstract

Cited by 572 (13 self)
 Add to MetaCart
(Show Context)
In semistructured databases there is no schema fixed in advance. To provide the benefits of a schema in such environments, we introduce DataGuides: concise and accurate structural summaries of semistructured databases. DataGuides serve as dynamic schemas, generated from the database; they are useful for browsing database structure, formulating queries, storing information such as statistics and sample values, and enabling query optimization. This paper presents the theoretical foundations of DataGuides along with an algorithm for their creation and an overview of incremental maintenance. We provide performance results based on our implementation of DataGuides in the Lore DBMS for semistructured data. We also describe the use of DataGuides in Lore, both in the user interface to enable structure browsing and query formulation, and as a means of guiding the query processor and optimizing query execution.
Convolution Kernels on Discrete Structures
, 1999
"... We introduce a new method of constructing kernels on sets whose elements are discrete structures like strings, trees and graphs. The method can be applied iteratively to build a kernel on an infinite set from kernels involving generators of the set. The family of kernels generated generalizes the fa ..."
Abstract

Cited by 506 (0 self)
 Add to MetaCart
(Show Context)
We introduce a new method of constructing kernels on sets whose elements are discrete structures like strings, trees and graphs. The method can be applied iteratively to build a kernel on an infinite set from kernels involving generators of the set. The family of kernels generated generalizes the family of radial basis kernels. It can also be used to define kernels in the form of joint Gibbs probability distributions. Kernels can be built from hidden Markov random elds, generalized regular expressions, pairHMMs, or ANOVA decompositions. Uses of the method lead to open problems involving the theory of infinitely divisible positive definite functions. Fundamentals of this theory and the theory of reproducing kernel Hilbert spaces are reviewed and applied in establishing the validity of the method.
Parsing English with a Link Grammar
, 1991
"... We define a new formal grammatical system called a link grammar . A sequence of words is in the language of a link grammar if there is a way to draw links between words in such a way that (1) the local requirements of each word are satisfied, (2) the links do not cross, and (3) the words form a conn ..."
Abstract

Cited by 437 (4 self)
 Add to MetaCart
(Show Context)
We define a new formal grammatical system called a link grammar . A sequence of words is in the language of a link grammar if there is a way to draw links between words in such a way that (1) the local requirements of each word are satisfied, (2) the links do not cross, and (3) the words form a connected graph. We have encoded English grammar into such a system, and written a program (based on new algorithms) for efficiently parsing with a link grammar. The formalism is lexical and makes no explicit use of constituents and categories. The breadth of English phenomena that our system handles is quite large. A number of sophisticated and new techniques were used to allow efficient parsing of this very complex grammar. Our program is written in C, and the entire system may be obtained via anonymous ftp. Several other researchers have begun to use link grammars in their own research. 1 Introduction Most sentences of most natural languages have the property that if arcs are drawn connecti...
Combating web spam with trustrank
 In VLDB
, 2004
"... Web spam pages use various techniques to achieve higherthandeserved rankings in a search engine’s results. While human experts can identify spam, it is too expensive to manually evaluate a large number of pages. Instead, we propose techniques to semiautomatically separate reputable, good pages fr ..."
Abstract

Cited by 413 (3 self)
 Add to MetaCart
(Show Context)
Web spam pages use various techniques to achieve higherthandeserved rankings in a search engine’s results. While human experts can identify spam, it is too expensive to manually evaluate a large number of pages. Instead, we propose techniques to semiautomatically separate reputable, good pages from spam. We first select a small set of seed pages to be evaluated by an expert. Once we manually identify the reputable seed pages, we use the link structure of the web to discover other pages that are likely to be good. In this paper we discuss possible ways to implement the seed selection and the discovery of good pages. We present results of experiments run on the World Wide Web indexed by AltaVista and evaluate the performance of our techniques. Our results show that we can effectively filter out spam from a significant fraction of the web, based on a good seed set of less than 200 sites. 1
Regular models of phonological rule systems
, 1994
"... This paper presents a set of mathematical and computational tools for manipulating and reasoning about regular languages and regular relations and argues that they provide a solid basis for computational phonology. It shows in detail how this framework applies to ordered sets of contextsensitive re ..."
Abstract

Cited by 382 (6 self)
 Add to MetaCart
This paper presents a set of mathematical and computational tools for manipulating and reasoning about regular languages and regular relations and argues that they provide a solid basis for computational phonology. It shows in detail how this framework applies to ordered sets of contextsensitive rewriting rules and also to grammars in Koskenniemi's twolevel formalism. This analysis provides a common representation of phonological constraints that supports efficient generation and recognition by a single simple interpreter.
Principles and methods of Testing Finite State Machines  a survey
 PROCEEDINGS OF IEEE
, 1996
"... With advanced computer technology, systems are getting larger to fulfill more complicated tasks, however, they are also becoming less reliable. Consequently, testing is an indispensable part of system design and implementation; yet it has proved to be a formidable task for complex systems. This moti ..."
Abstract

Cited by 345 (16 self)
 Add to MetaCart
(Show Context)
With advanced computer technology, systems are getting larger to fulfill more complicated tasks, however, they are also becoming less reliable. Consequently, testing is an indispensable part of system design and implementation; yet it has proved to be a formidable task for complex systems. This motivates the study of testing finite state machines to ensure the correct functioning of systems and to discover aspects of their behavior. A finite state machine contains a finite number of states and produces outputs on state transitions after receiving inputs. Finite state machines are widely used to model systems in diverse areas, including sequential circuits, certain types of programs, and, more recently, communication protocols. In a testing problem we have a machine about which we lack some information; we would like to deduce this information by providing a sequence of inputs to the machine and observing the outputs produced. Because of its practical importance and theoretical interest, the problem of testing finite state machines has been studied in different areas and at various times. The earliest published literature on this topic dates back to the 50’s. Activities in the 60’s and early 70’s were motivated mainly by automata theory and sequential circuit testing. The area seemed to have mostly died down until a few years ago when the testing problem was resurrected and is now being studied anew due to its applications to conformance testing of communication protocols. While some old problems which had been open for decades were resolved recently, new concepts and more intriguing problems from new applications emerge. We review the fundamental problems in testing finite state machines and techniques for solving these problems, tracing progress in the area from its inception to the present and the state of the art. In addition, we discuss extensions of finite state machines and some other topics related to testing.
The even more irresistible SROIQ
 In KR
, 2006
"... We describe an extension of the description logic underlying OWLDL, SHOIN, with a number of expressive means that we believe will make it more useful in practise. Roughly speaking, we extend SHOIN with all expressive means that were suggested to us by ontology developers as useful additions to OWL ..."
Abstract

Cited by 342 (50 self)
 Add to MetaCart
We describe an extension of the description logic underlying OWLDL, SHOIN, with a number of expressive means that we believe will make it more useful in practise. Roughly speaking, we extend SHOIN with all expressive means that were suggested to us by ontology developers as useful additions to OWLDL, and which, additionally, do not affect its decidability. We consider complex role inclusion axioms of the form R ◦ S ˙ ⊑ R or S ◦ R ˙ ⊑ R to express propagation of one property along another one, which have proven useful in medical terminologies. Furthermore, we extend SHOIN with reflexive, symmetric, transitive, and irreflexive roles, disjoint roles, a universal role, and constructs ∃R.Self, allowing, for instance, the definition of concepts such as a “narcist”. Finally, we consider negated role assertions in Aboxes and qualified number restrictions. The resulting logic is called SROIQ. We present a rather elegant tableaubased reasoning algorithm: it combines the use of automata to keep track of universal value restrictions with the techniques developed for SHOIQ. We believe that SROIQ could serve as a logical basis for possible future extensions of OWLDL.