XSEarch: A Semantic Search Engine for XML
In VLDB, 2003
Cited by 98 (5 self)
XSEarch, a semantic search engine for XML, is presented. XSEarch has a simple query language, suitable for a naive user. It returns semantically related document fragments that satisfy the user's query. Query answers are ranked using extended information-retrieval techniques and are generated in an order similar to the ranking. Advanced indexing techniques were developed to facilitate efficient implementation of XSEarch. The performance of the different techniques as well as the recall and the precision were measured experimentally.
Interconnection semantics for keyword search in xml
In CIKM, 2005
Cited by 20 (0 self)
A framework for describing semantic relationships among nodes in XML documents is presented. In contrast to earlier work, the XML documents may have ID references (i.e., they correspond to graphs and not just trees). A specific interconnection semantics in this framework can be defined explicitly or derived automatically. The main advantage of interconnection semantics is the ability to pose queries on XML data in the style of keyword search. Several methods for automatically deriving interconnection semantics are presented. The complexity of the evaluation and the satisfiability problems under the derived semantics is analyzed. For many important cases, the complexity is tractable and hence, the proposed interconnection semantics can be efficiently applied to real-world XML documents.
Efficiently enumerating results of keyword search
In Proc. of DBPL Conference, 2005
Cited by 12 (3 self)
Various approaches for keyword search have been explored in different settings, including databases, XML and the Web. It is shown that in many cases, systems that incorporate keyword search actually solve similar problems. This paper describes, for this type of problem, the first algorithms that are provably efficient, that is, run with polynomial delay. Specifically, algorithms for enumerating K-fragments are given, where a K-fragment is a subtree T of the given data graph, such that T contains all the keywords of K and no proper subtree of T has this property. Three types of K-fragments are considered: rooted, undirected and strong. For all three types, there are algorithms that enumerate all K-fragments with polynomial delay. For rooted K-fragments and acyclic data graphs, there is an algorithm that enumerates with polynomial delay in the order of increasing weight, assuming that K is of a fixed size.
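The minimality condition in the K-fragment definition can be illustrated with a small check. The sketch below is not the paper's enumeration algorithm; it only tests whether a given candidate is an undirected K-fragment, under the assumption that keywords are modelled as nodes of the data graph. The function name `is_undirected_k_fragment` and its parameters are hypothetical. It relies on the observation that a tree containing all keywords has no smaller subtree with that property exactly when every leaf is a keyword node, since a non-keyword leaf could simply be pruned.

```python
from collections import defaultdict


def is_undirected_k_fragment(nodes, edges, keywords):
    """Check whether the undirected graph (nodes, edges) is a tree that
    contains every keyword node and has no non-keyword leaf.

    Illustrative assumption: keywords are given as node identifiers.
    """
    nodes, keywords = set(nodes), set(keywords)
    if not nodes or not keywords <= nodes:
        return False
    # A tree on n nodes has exactly n - 1 edges and is connected.
    if len(edges) != len(nodes) - 1:
        return False
    adjacency = defaultdict(set)
    for u, v in edges:
        if u not in nodes or v not in nodes or u == v:
            return False
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Depth-first traversal to confirm connectivity.
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    if seen != nodes:
        return False
    # Minimality: every leaf (degree at most 1) must be a keyword node.
    return all(n in keywords for n in nodes if len(adjacency[n]) <= 1)
```

For example, the path a-b-c is a K-fragment for K = {a, c}, but not for K = {a, b}, because the leaf c could be removed while keeping both keywords.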
Interconnection semantics for XML
2004
Cited by 2 (2 self)
A framework for defining and automatically discovering semantic relationships among nodes in XML documents is presented. A specific interconnection semantics in this framework consists of a set of patterns. Interconnection semantics can be specified explicitly or derived automatically. Several methods to automatically derive interconnection semantics are presented. The complexity of determining when nodes are interconnected under these semantics is analyzed. For many important cases, the complexity is tractable and hence, the proposed interconnection semantics can be efficiently applied to real-world documents. In particular, for acyclically-labeled documents, determining interconnection for a bounded-size set of nodes is polynomial for most of these semantics. The inverse problem of constructing a document from a given set of objects and the interconnections that hold among those objects is also considered. It is shown that under a natural condition of unambiguity, a document that satisfies exactly the specified interconnections can be constructed efficiently, if such a document exists. If not, the set of new interconnections that are introduced by the construction is minimal.
An Unsupervised Approach for Acquiring Ontologies and RDF Data from Online Life Science Databases
Cited by 1 (0 self)
In the Linked Open Data cloud, one of the largest data sets, comprising 2.5 billion triples, is derived from the Life Science domain. Yet this represents a small fraction of the total number of publicly available data sources on the Web. We briefly describe past attempts to transform specific Life Science sources from a plethora of open as well as proprietary formats into RDF data. In particular, we identify and tackle two bottlenecks in current practice: acquiring ontologies to formally describe these data and creating “RDFizer” programs to convert data from legacy formats into RDF. We propose an unsupervised method, based on transformation rules, for performing these two key tasks, which makes use of our previous work on unsupervised wrapper induction for extracting labelled data from complete Life Science Web sites. We apply our approach to 13 real-world online Life Science databases. The learned ontologies are evaluated by domain experts as well as against gold standard ontologies. Furthermore, we compare the learned ontologies against ontologies that are “lifted” directly from the underlying relational schema using an existing unsupervised approach. Finally, we apply our approach to three online databases to extract RDF data. Our results indicate that this approach can be used to bootstrap and speed up the migration of life science data into the Linked Open Data cloud.

