XIRQL: A Query Language for Information Retrieval in XML Documents
, 2001
Cited by 140 (6 self)
Based on the document-centric view of XML, we present the query language XIRQL. Current proposals for XML query languages lack most IR-related features, namely weighting and ranking, relevance-oriented search, datatypes with vague predicates, and semantic relativism. XIRQL integrates these features by using ideas from logic-based probabilistic IR models, in combination with concepts from the database area. For processing XIRQL queries, a path algebra is presented that also serves as a starting point for query optimization.
Tree Pattern Relaxation
, 2002
Cited by 45 (5 self)
Tree patterns are fundamental to querying tree-structured data like XML. Because of the heterogeneity of XML data, it is often more appropriate to permit approximate query matching and return ranked answers, in the spirit of Information Retrieval, than to return only exact answers. In this paper, we study the problem of approximate XML query matching, based on tree pattern relaxations, and devise efficient algorithms to evaluate relaxed tree patterns. We consider weighted tree patterns, where exact and relaxed weights, associated with nodes and edges of the tree pattern, are used to compute the scores of query answers. We are
On the Integration of Structure Indexes and Inverted Lists
- In SIGMOD
, 2004
Cited by 44 (0 self)
Recently, there has been a great deal of interest in the development of techniques to evaluate path expressions over collections of XML documents. In general, these path expressions contain both structural and keyword components. Several methods have been proposed for processing path expressions over graph/tree-structured XML data. These methods can be classified into two broad classes. The first involves graph traversal where the input query is evaluated by traversing the data graph or some compressed representation. The other class involves information-retrieval style processing using inverted lists. In this framework, structure indexes have been proposed to be used as a substitute for graph traversal. These structure indexes are proven to be very effective when applied to queries that examine the “coarse” structure of documents. For example, for many
A model of multimedia information retrieval
- Journal of the ACM
, 2001
Cited by 41 (12 self)
Research on multimedia information retrieval (MIR) has recently witnessed a booming interest. A prominent feature of this research trend is its simultaneous but independent materialization within several fields of computer science. The resulting richness of paradigms, methods and systems may, in the long run, result in a fragmentation of efforts and slow down progress. The primary goal of this study is to promote an integration of methods and techniques for MIR by contributing a conceptual model that encompasses in a unified and coherent perspective the many efforts that are being produced under the label of MIR. The model offers a retrieval capability that spans two media, text and images, but also several dimensions: form, content and structure. In this way, it reconciles similarity-based methods with semantics-based ones, providing the guidelines for the design of systems that are able to provide a generalized multimedia retrieval service, in which the existing forms of retrieval not only coexist, but can be combined in any desired manner. The model is formulated in terms of a fuzzy description logic, which plays a twofold role: (1) it directly models semantics-based retrieval, and (2) it offers an ideal framework for the integration of the multimedia and multidimensional aspects of retrieval mentioned above. The model also accounts for relevance feedback in both text and image retrieval, integrating known techniques for taking into account user judgments. The implementation of
XIRQL: An XML Query Language Based on Information Retrieval Concepts
, 2001
Cited by 31 (2 self)
Most proposals for XML query languages are based on the data-centric view of XML and do not support uncertainty and vagueness, making them unsuitable for information retrieval (IR) of XML documents. Based on the document-centric view, we present the query language XIRQL, which implements IR-related features such as weighting and ranking, relevance-oriented search, datatypes with vague predicates, and structural relativism. XIRQL integrates these features by using ideas from logic-based probabilistic IR models, in combination with concepts from the database area. For processing XIRQL queries, a path algebra is presented which also serves as a starting point for query optimization.
Complete Answer Aggregates for Tree-like Databases: A Novel Approach to Combine Querying and Navigation
- ACM Transactions on Information Systems
, 2001
Cited by 22 (3 self)
The use of markup languages like SGML, HTML, or XML for encoding the structure of documents or linguistic data has led . . .
Indexing Documents for Queries on Structure, Content and Attributes
- Proc. of International Symposium on Digital Media Information Base (DMIB)
, 1997
Cited by 16 (2 self)
Indexing and retrieval techniques for large text databases are well developed, but most of the techniques developed to date assume that the text to be indexed has little or no structure. With the growth in the use of sophisticated markup languages for text, a database system for structured documents should use not just document content but also structural information and attributes, and should support queries on content, structure and attributes. In this paper we review and compare two recent approaches for accessing document collections. For one of the approaches, position-based indexing, queries are resolved by manipulating ranges of word offsets, while for the other, based on a path model, the position of a word is represented in terms of the structural components that enclose it. The former allows slightly smaller indexes; the latter allows more efficient query evaluation.
Improving index structures for structured document retrieval
- In IRSG'99, 21st Annual Colloquium on IR Research
, 1999
Cited by 16 (7 self)
Structured document retrieval has established itself as a new research area in the overlap between Database Systems and Information Retrieval. This work proposes a filtering technique that can be added to the existing index structures of many structured document retrieval systems. This new technique takes the contextual structure information of the query and the document database into account and drastically reduces the occurrence sets returned by the original index structure. This improves the performance of query evaluation. A measure is introduced that quantifies the added value of the proposed index structure. Based on this measure, a heuristic is presented that includes only valuable context information in the index structure.
XQL and Proximal Nodes
, 2000
Cited by 15 (1 self)
We consider the recently proposed XQL language, which is designed to query XML documents by content and structure. We show that an already existing model, namely "Proximal Nodes", is the only one that addresses all the complex querying operations defined by XQL and that suggests an efficient implementation for them.

