Results 1 - 10
of
84
Complexity and Expressive Power of Logic Programming
, 1997
"... This paper surveys various complexity results on different forms of logic programming. The main focus is on decidable forms of logic programming, in particular, propositional logic programming and datalog, but we also mention general logic programming with function symbols. Next to classical results ..."
Abstract
-
Cited by 240 (51 self)
- Add to MetaCart
This paper surveys various complexity results on different forms of logic programming. The main focus is on decidable forms of logic programming, in particular, propositional logic programming and datalog, but we also mention general logic programming with function symbols. Next to classical results on plain logic programming (pure Horn clause programs), more recent results on various important extensions of logic programming are surveyed. These include logic programming with different forms of negation, disjunctive logic programming, logic programming with equality, and constraint logic programming. The complexity of the unification problem is also addressed.
Semistructured data
, 1997
"... In semistructured data, the information that is normally as-sociated with a schema is contained within the data, which is sometimes called “self-describing”. In some forms of semi-structured data there is no separate schema, in others it exists but only places loose constraints on the data. Semi-str ..."
Abstract
-
Cited by 230 (0 self)
- Add to MetaCart
In semistructured data, the information that is normally as-sociated with a schema is contained within the data, which is sometimes called “self-describing”. In some forms of semi-structured data there is no separate schema, in others it exists but only places loose constraints on the data. Semi-structured data has recently emerged as an important topic of study for a variety of reasons. First, there are data sources such as the Web, which we would like to treat as databases but which cannot be constrained by a schema. Second, it may be desirable to have an extremely flexible format for data exchange between disparate databases. Third, even when dealing with structured data, it may be helpful to view it. as semistructured for the purposes of browsing. This tu-torial will cover a number of issues surrounding such data: finding a concise formulation, building a sufficiently expres-sive language for querying and transformation, and opti-mizat,ion problems. 1 The motivation The topic of semistructured data (also called unstructured data) is relatively recent, and a tutorial on the topic may well be premature. It represents, if anything, the conver-gence of a number of lines of thinking about new ways to represent and query data that do not completely fit with conventional data models. The purpose of this tutorial is to to describe this motivation and to suggest areas in which further research may be fruitful. For a similar exposition, the reader is referred to Serge Abiteboul’s recent survey pa-per PI. The slides for this tutorial will be made available from a section of the Penn database home page
Processing XML Streams with deterministic automata
, 2003
"... Abstract. We consider the problem of evaluating a large number of XPath expressions on an XML stream. Our main contribution consists in showing that Deterministic Finite Automata (DFA) can be used effectively for this problem: in our experiments we achieve a throughput of about 5.4MB/s, independent ..."
Abstract
-
Cited by 107 (3 self)
- Add to MetaCart
Abstract. We consider the problem of evaluating a large number of XPath expressions on an XML stream. Our main contribution consists in showing that Deterministic Finite Automata (DFA) can be used effectively for this problem: in our experiments we achieve a throughput of about 5.4MB/s, independent of the number of XPath expressions (up to 1,000,000 in our tests). The major problem we face is that of the size of the DFA. Since the number of states grows exponentially with the number of XPath expressions, it was previously believed that DFAs cannot be used to process large sets of expressions. We make a theoretical analysis of the number of states in the DFA resulting from XPath expressions, and consider both the case when it is constructed eagerly, and when it is constructed lazily. Our analysis indicates that, when the automaton is constructed lazily, and under certain assumptions about the structure of the input XML data, the number of states in the lazy DFA is manageable. We also validate experimentally our findings, on both synthetic and real XML data sets. 1
Data Integration by Bi-Directional Schema Transformation Rules
, 2003
"... In this paper we describe a new approach to data integration which subsumes the previous approaches of local as view (LAV) and global as view (GAV). Our method, which we term both as view (BAV), is based on the use of reversible schema transformation sequences. We show how LAV and GAV view definitio ..."
Abstract
-
Cited by 72 (9 self)
- Add to MetaCart
In this paper we describe a new approach to data integration which subsumes the previous approaches of local as view (LAV) and global as view (GAV). Our method, which we term both as view (BAV), is based on the use of reversible schema transformation sequences. We show how LAV and GAV view definitions can be fully derived from BA V schema transformation sequences, and how BA V transformation sequences may be partially derived from LAV or GAV view definitions. We also show how BAV supports the evolution of both global and local schemas, and we discuss ongoing implementation of the BA V approach within the AutoMed project.
BioKleisli: A Digital Library for Biomedical Researchers
, 1996
"... Data of interest to biomedical researchers associated with the Human Genome Project (HGP) is stored all over the world in a number of different electronic data formats and accessible through a varietyof interfaces and retrieval languages. These data sources include conventional relational databases ..."
Abstract
-
Cited by 70 (15 self)
- Add to MetaCart
Data of interest to biomedical researchers associated with the Human Genome Project (HGP) is stored all over the world in a number of different electronic data formats and accessible through a varietyof interfaces and retrieval languages. These data sources include conventional relational databases with SQL interfaces, formatted text files on top of which indexing is provided for efficient retrieval (ASN.1-Entrez), and binary files that can be interpreted textually or graphically via special purpose interfaces (ACeDB). Researchers within the HGP wanttocombine data from these different data sources, add value through sophisticated data analysis techniques (such as the biosequence comparison software BLAST and FASTA), and view it using special purpose scientific visualization tools. However, currently there are no commercial tools for enabling such an integrated digital library, and a fundamental barrier to developing such tools appears to be one of language design and optimization: The data f...
Links: web programming without tiers
- In FMCO 2006, volume 4709 of LNCS
, 2007
"... Abstract. Links is a programming language for web applications that generates code for all three tiers of a web application from a single source, compiling into JavaScript to run on the client and into SQL to run on the database. Links supports rich clients running in what has been dubbed ‘Ajax ’ st ..."
Abstract
-
Cited by 48 (3 self)
- Add to MetaCart
Abstract. Links is a programming language for web applications that generates code for all three tiers of a web application from a single source, compiling into JavaScript to run on the client and into SQL to run on the database. Links supports rich clients running in what has been dubbed ‘Ajax ’ style, and supports concurrent processes with statically-typed message passing. Links is scalable in the sense that session state is preserved in the client rather than the server, in contrast to other approaches such as Java Servlets or PLT Scheme. Client-side concurrency in JavaScript and transfer of computation between client and server are both supported by translation into continuation-passing style. 1
Toward Routine Automatic Pathway Discovery from On-line Scientific Text Abstracts
- Genome Informatics
, 1999
"... s See-Kiong Ng 1 Marie Wong 2 skng@krdl.org.sg marie@bic.nus.edu.sg 1 Kent Ridge Digital Labs, 21 Heng Mui Keng Terrace, Singapore 119613 2 NUS Bioinformatics Centre, National University of Singapore, Singapore 119260 Abstract We are entering a new era of research where the latest scienti ..."
Abstract
-
Cited by 48 (3 self)
- Add to MetaCart
s See-Kiong Ng 1 Marie Wong 2 skng@krdl.org.sg marie@bic.nus.edu.sg 1 Kent Ridge Digital Labs, 21 Heng Mui Keng Terrace, Singapore 119613 2 NUS Bioinformatics Centre, National University of Singapore, Singapore 119260 Abstract We are entering a new era of research where the latest scientific discoveries are often first reported online and are readily accessible by scientists worldwide. This rapid electronic dissemination of research breakthroughs has greatly accelerated the current pace in genomics and proteomics research. The race to the discovery of a gene or a drug has now become increasingly dependent on how quickly a scientist can scan through voluminous amount of information available online to construct the relevant picture (such as protein-protein interaction pathways) as it takes shape amongst the rapidly expanding pool of globally accessible biological data (e.g. GENBANK) and scientific literature (e.g. MEDLINE). We describe a prototype system for automatic...
Curated databases
- PODS'08
, 2008
"... Curated databases are databases that are populated and updated with a great deal of human effort. Most reference works that one traditionally found on the reference shelves of libraries – dictionaries, encyclopedias, gazetteers etc. – are now curated databases. Since it is now easy to publish databa ..."
Abstract
-
Cited by 43 (6 self)
- Add to MetaCart
Curated databases are databases that are populated and updated with a great deal of human effort. Most reference works that one traditionally found on the reference shelves of libraries – dictionaries, encyclopedias, gazetteers etc. – are now curated databases. Since it is now easy to publish databases on the web, there has been an explosion in the number of new curated databases used in scientific research. The value of curated databases lies in the organization and the quality of the data they contain. Like the paper reference works they have replaced, they usually represent the efforts of a dedicated group of people to produce a definitive description of some subject area. Curated databases present a number of challenges for database research. The topics of annotation, provenance, and citation are central, because curated databases are heavily cross-referenced with, and include data from, other databases, and much of the work of a curator is annotating existing data. Evolution of structure is important because these databases often evolve from semistructured representations, and because they have to accommodate new scientific discoveries. Much of the work in these areas is in its infancy, but it is beginning to provide suggest new research for both theory and practice. We discuss some of this research and emphasize the need to find appropriate models of the processes associated with curated databases.
Monads and Effects
- IN INTERNATIONAL SUMMER SCHOOL ON APPLIED SEMANTICS APPSEM’2000
, 2000
"... A tension in language design has been between simple semantics on the one hand, and rich possibilities for side-effects, exception handling and so on on the other. The introduction of monads has made a large step towards reconciling these alternatives. First proposed by Moggi as a way of structu ..."
Abstract
-
Cited by 39 (6 self)
- Add to MetaCart
A tension in language design has been between simple semantics on the one hand, and rich possibilities for side-effects, exception handling and so on on the other. The introduction of monads has made a large step towards reconciling these alternatives. First proposed by Moggi as a way of structuring semantic descriptions, they were adopted by Wadler to structure Haskell programs, and now offer a general technique for delimiting the scope of effects, thus reconciling referential transparency and imperative operations within one programming language. Monads have been used to solve long-standing problems such as adding pointers and assignment, inter-language working, and exception handling to Haskell, without compromising its purely functional semantics. The course will introduce monads, effects and related notions, and exemplify their applications in programming (Haskell) and in compilation (MLj). The course will present typed metalanguages for monads and related categorica...
On the Complexity of Nonrecursive XQuery and Functional Query Languages on Complex Values
- In Proc. PODS’05
"... This article studies the complexity of evaluating functional query languages for complex values such as monad algebra and the recursion-free fragment of XQuery. We show that monad algebra with equality restricted to atomic values is complete for the class TA[2O(n) , O(n)] of problems solvable in lin ..."
Abstract
-
Cited by 33 (1 self)
- Add to MetaCart
This article studies the complexity of evaluating functional query languages for complex values such as monad algebra and the recursion-free fragment of XQuery. We show that monad algebra with equality restricted to atomic values is complete for the class TA[2O(n) , O(n)] of problems solvable in linear exponential time with a linear number of alternations. The monotone fragment of monad algebra with atomic value equality but without negation is complete for nondeterministic exponential time. For monad algebra with deep equality, we establish TA[2O(n) , O(n)] lower and exponential-space upper bounds. We also study a fragment of XQuery, Core XQuery, that seems to incorporate all the features of a query language on complex values that are traditionally deemed essential. A close connection between monad algebra on lists and Core XQuery (with “child ” as the only axis) is exhibited, and it is shown that these languages are expressively equivalent up to representation issues. We show that Core XQuery is just as hard as monad algebra w.r.t. query and combined complexity, and that it is in TC0 if the query is assumed fixed. As Core XQuery is NEXPTIME-hard, it is commonly believed that any algorithm for evaluating Core XQuery has to require exponential amounts of working memory and doubly exponential time in the worst case. We present a property of queries – the lack of a certain form of composition – that virtually all real-world XQueries have and that allows for query evaluation in singly exponential time and polynomial space. Still, we are able to show for an important special case – Core XQuery with equality testing restricted to atomic values – that the composition-free language is just as expressive as the language with composition. Thus, under widely-held complexitytheoretic assumptions, the composition-free language is an exponentially less succinct version of the language with composition.

