Evaluation of Signature Files as Set Access Facilities in OODBs
- In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data
, 1993
Cited by 55 (3 self)
Object-oriented database systems (OODBs) need efficient support for manipulation of complex objects. In particular, support of queries involving evaluations of set predicates is often required in handling complex objects. In this paper, we propose a scheme to apply signature file techniques, which were originally invented for text retrieval, to the support of set value accesses, and quantitatively evaluate their potential capabilities. Two signature file organizations, the sequential signature file and the bitsliced signature file, are considered and their performance is compared with that of the nested index for queries involving the set inclusion operator (⊆). We develop a detailed cost model and present analytical results clarifying their retrieval, storage, and update costs. Our analysis shows that the bitsliced signature file is a very promising set access facility in OODBs. 1 INTRODUCTION Advanced database application areas, such as computer aided design, office automation, and...
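The bit-sliced organization described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the signature width, the number of bits set per element, the hashing scheme, and all names (`BitSlicedSignatureFile`, `superset_query`) are assumptions chosen for illustration. Each stored set gets a superimposed signature, the signatures are stored column-wise as bit slices, and a query "target set contains query set" reads only the slices where the query signature has a 1, then checks the surviving candidates against the real sets to discard false drops.

```python
import hashlib

SIG_BITS = 64       # signature width (assumed value, not from the paper)
BITS_PER_ELEM = 3   # bits set per element (assumed value)


def element_signature(elem):
    """Hash one element to a signature with up to BITS_PER_ELEM bits set."""
    sig = 0
    for i in range(BITS_PER_ELEM):
        digest = hashlib.sha256(f"{i}:{elem}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def set_signature(items):
    """Superimpose (bitwise OR) the element signatures of a set."""
    sig = 0
    for elem in items:
        sig |= element_signature(elem)
    return sig


class BitSlicedSignatureFile:
    def __init__(self):
        self.sets = []                # stored target sets, indexed by object id
        self.slices = [0] * SIG_BITS  # slices[j] bit i is 1 iff object i has bit j set

    def insert(self, items):
        oid = len(self.sets)
        self.sets.append(frozenset(items))
        sig = set_signature(items)
        for j in range(SIG_BITS):
            if (sig >> j) & 1:
                self.slices[j] |= 1 << oid
        return oid

    def superset_query(self, query):
        """Return ids of stored sets that contain every element of `query`."""
        qsig = set_signature(query)
        candidates = (1 << len(self.sets)) - 1  # start with every object
        for j in range(SIG_BITS):
            if (qsig >> j) & 1:
                candidates &= self.slices[j]    # only query-set slices are read
        q = frozenset(query)
        # False-drop resolution: verify each candidate against the actual set.
        return [oid for oid in range(len(self.sets))
                if (candidates >> oid) & 1 and q <= self.sets[oid]]
```

The final verification step is what makes the result exact; the slices only narrow the candidate list, which is why the number of slices read depends on the query signature rather than on the number of stored objects.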
Performance of inverted indices in shared-nothing distributed text document information retrieval systems
- In Proceedings of the Second International Conference on Parallel and Distributed Information Systems
, 1993
Cited by 54 (6 self)
The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted index. This paper compares the performance impact on query processing of various physical organizations for inverted lists. We present a new probabilistic model of the database and queries. Simulation experiments determine which variables most strongly influence response time and throughput. This leads to a set of design trade-offs over a range of hardware configurations and new parallel query processing strategies.
Integrating Structured Data and Text: A relational approach
- Journal of the American Society for Information Science
, 1997
Cited by 50 (27 self)
We integrate structured data and text using the unchanged, standard relational model. We started with the premise that a relational system could be used to implement an Information Retrieval (IR) system. After implementing a prototype to verify that premise, we then began to investigate the performance of a parallel relational database system for this application. We also tested the effect of query reduction on accuracy and found that queries can be reduced prior to their implementation without incurring a significant loss in precision/recall. This reduction also serves to improve run-time performance. After comparing our results to a special purpose IR system, we conclude that the relational model offers scalable performance and includes the ability to integrate structured data and text in a portable fashion. 1 Introduction Increasingly, applications integrate structured and unstructured data, responding to requests such as "Find articles containing vehicle and sales published in jou...
Tree Pattern Relaxation
, 2002
Cited by 45 (5 self)
Tree patterns are fundamental to querying tree-structured data like XML. Because of the heterogeneity of XML data, it is often more appropriate to permit approximate query matching and return ranked answers, in the spirit of Information Retrieval, than to return only exact answers. In this paper, we study the problem of approximate XML query matching, based on tree pattern relaxations, and devise efficient algorithms to evaluate relaxed tree patterns. We consider weighted tree patterns, where exact and relaxed weights, associated with nodes and edges of the tree pattern, are used to compute the scores of query answers. We are
Certification Reports: Supporting Transactions in Wireless Systems
, 1997
Cited by 32 (1 self)
The emergence of small portable computers and the advances in wireless networking have made mobile computing today a reality. Information systems and databases are among the applications that make mobile computing attractive. While the topic of querying data in wireless and mobile systems has received a lot of attention, techniques to efficiently update data in these systems while providing transaction semantics are not fully developed. In this paper, we present a novel protocol that uses the broadcast facility to help mobile units do some of the work of verifying if the transactions being run by them need to be aborted. Only when the mobile unit cannot detect any conflict is the server involved in completing the verification. Of course, if the transaction can commit, the server will install the values in the central database and notify the mobile units (again, using the broadcast channel). The protocol uses a modified version of optimistic control. We study the performance of the protocol by means of a detailed simulation.
Index Structures for Information Filtering Under the Vector Space Model
- In Proc. International Conference on Data Engineering
, 1993
Cited by 31 (4 self)
With the ever increasing volumes of electronic information generation, users of information systems are facing an information overload. It is desirable to support information filtering as a complement to traditional retrieval mechanism. The number of users, and thus profiles (representing users' long-term interests), handled by an information filtering system is potentially huge, and the system has to process a constant stream of incoming information in a timely fashion. The efficiency of the filtering process is thus an important issue. In this paper, we study what data structures and algorithms can be used to efficiently perform large-scale information filtering under the vector space model, a retrieval model established as being effective. We apply the idea of the standard inverted index to index user profiles. We devise an alternative to the standard inverted index, in which we, instead of indexing every term in a profile, select only the significant ones to index. We evaluate thei...
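The idea of applying a standard inverted index to user profiles, as this abstract describes, can be sketched as follows. This is an illustrative assumption-laden sketch, not the paper's data structure: the class name `ProfileIndex`, the per-profile score threshold, and the use of a plain dot product as the vector-space similarity are all hypothetical choices. Every profile term gets a posting list of (profile id, weight) pairs, and each incoming document walks only the posting lists of its own terms to accumulate scores.

```python
from collections import defaultdict


class ProfileIndex:
    """Inverted index over user profiles (hypothetical sketch)."""

    def __init__(self):
        self.postings = defaultdict(list)  # term -> [(profile_id, weight)]
        self.thresholds = {}               # profile_id -> minimum score (assumed)

    def add_profile(self, profile_id, weights, threshold):
        """Index every term of a profile given as {term: weight}."""
        self.thresholds[profile_id] = threshold
        for term, weight in weights.items():
            self.postings[term].append((profile_id, weight))

    def match(self, doc_weights):
        """Return ids of profiles whose dot product with the document
        vector {term: weight} reaches the profile's threshold."""
        scores = defaultdict(float)
        for term, doc_weight in doc_weights.items():
            for profile_id, profile_weight in self.postings.get(term, ()):
                scores[profile_id] += doc_weight * profile_weight
        return sorted(pid for pid, score in scores.items()
                      if score >= self.thresholds[pid])
```

The alternative mentioned in the abstract, indexing only the significant terms of each profile, would shorten the posting lists at the cost of a more involved scoring step; that variant is not shown here.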
Predicate Rewriting for Translating Boolean Queries in a Heterogeneous Information System
, 1996
Cited by 29 (6 self)
Usually referred to as fielded search, a predicate specifies a pattern to be matched against the content of a field (Figure 2, Construct 2). Typically, for each searchable field, IR systems build indexes [Salton 1989; Frakes and Baeza-Yates 1992; Faloutsos 1985] to direct the search engine to find documents with some given term, such as the word cat or phrase "Joe Doe". The indexing schemes of a field restrict how it can be queried. Generally, there are two ways of indexing.
Supporting Full-Text Information Retrieval with a Persistent Object Store
- In 4th Intl. Conf. on Extending Database Technology
, 1994
Cited by 27 (4 self)
Full-text information retrieval systems have unusual and challenging data management requirements. Attempts have been made to satisfy these requirements using traditional (e.g., relational) database management systems. Those attempts, however, have produced rather discouraging results. Instead, information retrieval systems typically use custom data management facilities that require significant development effort and usually do not provide all of the services available from a standard database management system. Advanced data management systems, such as object-oriented database management systems and persistent object stores, offer a reasonable alternative to the two previous approaches. We have taken an existing information retrieval system (INQUERY) and substituted a persistent object store (Mneme) for the portion of the custom data management system that manages an inverted file index. The result is an improvement in performance and significant opportunities for the inform...
Index Structures for Structured Documents
- In Proceedings of the 1st ACM International Conference on Digital Libraries
, 1996
Cited by 26 (2 self)
Much research has been carried out in order to manage structured documents such as SGML documents and to provide powerful query facilities which exploit document structures as well as document contents. In order to perform structure queries efficiently in a structured document management system, an index structure which supports fast document element access must be provided. However, there has been little research on the index structures for structured documents. In this paper, we propose various kinds of new inverted indexing schemes and signature file schemes for efficient structure query processing. We evaluate the storage requirements and disk access times of our schemes and present the analytical and experimental results. 1 Introduction Since the Standard Generalized Markup Language (SGML) [13] [15] was standardized, many structured document management systems have been built to manage structured documents including [1] [2] [3] [4] [5] [6] [17] [18] [20] [21] [23]. In those syste...
Join Queries with External Text Sources: Execution and Optimization Techniques
- In Proceedings of the ACM SIGMOD International Conference on Management of Data
, 1995
Cited by 26 (2 self)
Text is a pervasive information type, and many applications require querying over text sources in addition to structured data. This paper studies the problem of query processing in a system that loosely integrates an extensible database system and a text retrieval system. We focus on a class of conjunctive queries that include joins between text and structured data, in addition to selections over these two types of data. We adapt techniques from distributed query processing and introduce a novel class of join methods based on probing that is especially useful for joins with text systems, and we present a cost model for the various alternative query processing methods. Experimental results confirm the utility of these methods. The space of query plans is extended due to the additional techniques, and we describe an optimization algorithm for searching this extended space. The techniques we describe in this paper are applicable to other types of external data managers loosely integrated ...

