An Object-Oriented Architecture for Text Retrieval
- In Conference Proceedings of RIAO'91, Intelligent Text and Image Handling, 1991
- Cited by 35 (10 self)
For almost all aspects of information access systems, the optimal composition and functionality are still hotly debated. Moreover, different application scenarios put different demands on individual components. It is therefore essential to be able to quickly build systems that permit exploration of different designs and implementation strategies. This paper presents a software implementation architecture for text retrieval systems that facilitates (a) functional modularization, (b) mix-and-match combination of module implementations, and (c) definition of inter-module protocols. We show how an object-oriented approach easily accommodates this type of architecture. The design principles are exemplified by code examples in Common Lisp. Taken together, these code examples constitute an operational retrieval system. The design principles and protocols implemented have also been instantiated in a large-scale retrieval prototype in our research laboratory.
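The three properties named in this abstract (functional modules, interchangeable implementations, explicit inter-module protocols) can be illustrated with a minimal sketch. The paper's own examples are in Common Lisp; this one uses Python, and every class and method name in it (`Tokenizer`, `Scorer`, `Retriever`, and so on) is a hypothetical choice made for illustration, not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Protocol, Tuple


class Tokenizer(Protocol):
    """Hypothetical protocol: turns raw text into index terms."""
    def tokenize(self, text: str) -> List[str]: ...


class Scorer(Protocol):
    """Hypothetical protocol: scores one document against query terms."""
    def score(self, query_terms: List[str], doc_terms: Counter) -> float: ...


class WhitespaceTokenizer:
    def tokenize(self, text: str) -> List[str]:
        return text.lower().split()


class OverlapScorer:
    """Counts how many distinct query terms occur in the document."""
    def score(self, query_terms: List[str], doc_terms: Counter) -> float:
        return float(sum(1 for t in set(query_terms) if t in doc_terms))


class TermFrequencyScorer:
    """Sums the in-document frequency of each distinct query term."""
    def score(self, query_terms: List[str], doc_terms: Counter) -> float:
        return float(sum(doc_terms[t] for t in set(query_terms)))


class Retriever:
    """Depends only on the two protocols, so modules can be swapped freely."""
    def __init__(self, tokenizer: Tokenizer, scorer: Scorer) -> None:
        self.tokenizer = tokenizer
        self.scorer = scorer
        self.docs: Dict[str, Counter] = {}

    def add(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = Counter(self.tokenizer.tokenize(text))

    def search(self, query: str) -> List[Tuple[str, float]]:
        terms = self.tokenizer.tokenize(query)
        scored = [(d, self.scorer.score(terms, c)) for d, c in self.docs.items()]
        # Keep matching documents only, best score first, ties broken by id.
        return sorted((s for s in scored if s[1] > 0), key=lambda p: (-p[1], p[0]))
```

Replacing `OverlapScorer()` with `TermFrequencyScorer()` when constructing a `Retriever` changes the ranking behaviour without touching any other module, which is the mix-and-match property the abstract describes.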
Towards a probabilistic modal logic for semantic-based information retrieval
- In Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval, 1992
- Cited by 25 (0 self)
Semantic-based approaches to Information Retrieval make query evaluation similar to an inference process based on semantic relations. Semantic-based approaches uncover hidden semantic relationships between a document and a query, but quantitative estimation of the correspondence between them is often empirical. On the other hand, probabilistic approaches usually consider only statistical relationships between terms. It is expected that improvement may be brought by integrating these two approaches. This paper demonstrates, using some particular probabilistic models which are strongly related to modal logic, that such an integration is feasible and natural. A new model is developed on the basis of an extended modal logic. It has the advantages of (1) augmenting a semantic-based approach with a probabilistic measurement, and (2) augmenting a probabilistic approach with finer semantic relations than just statistical ones. It is shown that this model satisfies most of the conditions for an absolute probability function.
Soft Information retrieval: applications of fuzzy set theory and neural networks
- Neuro-fuzzy Techniques for Intelligent Information Systems, 1999
- Cited by 13 (3 self)
This paper presents a short survey of fuzzy and neural approaches to Information Retrieval. The goal of such approaches is to define flexible Information Retrieval Systems able to deal with the inherent vagueness and uncertainty of the retrieval process. In this survey we address if and how some approaches met their goal.
Hierarchical text categorization using fuzzy relational thesaurus
- Cited by 5 (5 self)
Text categorization is the classification task of assigning a text document to an appropriate category in a predefined set of categories. We present a new approach to text categorization by means of a Fuzzy Relational Thesaurus (FRT). FRT is a multilevel category
Comparing Boolean and Probabilistic Information Retrieval Systems Across Queries and Disciplines
- Journal of the American Society for Information Science, 1998
- Cited by 4 (0 self)
Whether using Boolean queries or ranking documents using document and term weights will result in better retrieval performance has been the subject of considerable discussion among document retrieval system users and researchers. We suggest a method that allows one to analytically compare the two approaches to retrieval and examine their relative merits. The performance of information retrieval systems may be determined either by using experimental simulation, or through the application of analytic techniques that directly estimate the retrieval performance, given values for query and database characteristics. Using these performance-predicting techniques, sample performance figures are provided for queries using the Boolean AND and OR operators, as well as for probabilistic systems assuming statistical term independence or term dependence. The variation of performance across sublanguages (used in different academic disciplines) and queries is examined. The performance of models fail...
A hierarchical text categorization approach and its application to FRT expansion
- 2003
- Cited by 3 (1 self)
Text categorization is the classification task of assigning a text document to an appropriate category in a predefined set of categories. This paper focuses on the special case when categories are organized in a hierarchy. We present a new approach to this recently emerged subfield of text categorization. The algorithm applies an iterative learning module that allows a classifier to be built gradually by a trial-and-error-like method. Experimental results on three document corpora (including the well-known Reuters-21578 and 20 newsgroups data sets) with several topic hierarchies show that our approach outperforms existing ones by up to 10%. We also indicate another application of the method in the field of fuzzy relational thesauri (FRT): the expansion of the knowledge base can be supported in a cost-effective way.
A logical formulation of the Boolean model and of weighted Boolean models
- Cited by 3 (1 self)
In this paper the role of logic as a formal basis to exploit the query evaluation process of the Boolean model and of weighted Boolean models is analysed. The proposed approach is based on the expression of the constraint imposed by a query term on a document representation by means of the implication connective (by a fuzzy implication in the case of weighted terms). A logical formula corresponds to a query evaluation structure, and the degree of relevance of a document to a user query is obtained as the truth value of the formula expressing the evaluation structure of the considered query under the interpretation corresponding to a document and the query itself. 1 Introduction A recent approach to model information retrieval is the logical approach; the main motivation advocated in the literature to model IR in the logical framework is the need for a more general formal discipline, such as logic, to reason about the foundational principles of IR [11, 17]. A common basis of ...
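The evaluation scheme outlined in this abstract can be sketched roughly as follows. The abstract does not say which fuzzy implication or connectives the paper uses, so this sketch assumes the Kleene-Dienes implication `max(1 - w, d)` with `min`/`max` for AND/OR; the nested-tuple query encoding and the names `implies` and `evaluate` are likewise hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical query encoding:
#   ("term", name, weight) | ("and", q1, q2, ...) | ("or", q1, q2, ...) | ("not", q)
Query = Tuple


def implies(weight: float, degree: float) -> float:
    """Kleene-Dienes fuzzy implication (an assumed choice): weight -> degree."""
    return max(1.0 - weight, degree)


def evaluate(query: Query, doc: Dict[str, float]) -> float:
    """Truth value of the query formula under the interpretation given by doc.

    doc maps each index term to its degree of significance in [0, 1];
    terms that are absent from doc count as degree 0.
    """
    op = query[0]
    if op == "term":
        _, name, weight = query
        # The query term constrains the document through an implication.
        return implies(weight, doc.get(name, 0.0))
    if op == "and":
        return min(evaluate(q, doc) for q in query[1:])
    if op == "or":
        return max(evaluate(q, doc) for q in query[1:])
    if op == "not":
        return 1.0 - evaluate(query[1], doc)
    raise ValueError(f"unknown operator: {op!r}")
```

With weight 1 and binary document degrees, `implies` reduces to classical term matching, so the plain Boolean model is the special case of this sketch in which all weights and degrees are 0 or 1.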
The Fuzzy Set Model Based on N-ary Positively Compensatory Operators
In this paper we propose n-ary positively compensatory operators by extending the binary forms, which alleviate the aforementioned problem. We show through performance evaluation that the n-ary operators provide better retrieval effectiveness than the binary.
Uncertainty Modelling for Adaptive Information Management
The management of complex systems strongly depends on the ability to handle huge amounts of information. The experience accumulated on a problem represents knowledge we would like to capitalise on in the future, and information retrieval (IR) systems offer valuable support. Uncertainty is a fundamental component of the description of a piece of data and its explicit modelling is the purpose of our work. In a standard IR context, uncertainty permeates the behaviour of both the system and the users, and we investigate the effects of its explicit modelling on classical IR parameters like precision and recall. We present a keyword-based model that, capitalising on the flexibility of fuzzy sets, extends the traditional two-dimensional vector approach to data abstraction, evolving it into a paradigm where relevance is tightly coupled with uncertainty and the view the system has on data evolves dynamically through an adaptivity process. A prototype system (DUNE) has been derived from the gener...
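For reference, the two classical IR parameters mentioned in this abstract can be computed as in the short sketch below. It uses the standard crisp set-based definitions only; the paper's uncertainty-aware model is not reproduced here, and the function name `precision_recall` is a hypothetical choice.

```python
from typing import Hashable, Set, Tuple


def precision_recall(retrieved: Set[Hashable], relevant: Set[Hashable]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = |retrieved & relevant| / |retrieved|
    recall    = |retrieved & relevant| / |relevant|
    An empty denominator yields 0.0 by convention in this sketch.
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a system retrieves documents {1, 2, 3, 4} and the relevant set is {2, 4, 5}, two of the four retrieved documents are relevant (precision 0.5) and two of the three relevant documents were found (recall 2/3).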

