## Soft Information retrieval: applications of fuzzy set theory and neural networks (1999)

Venue: | Neuro-fuzzy Techniques for Intelligent Information Systems |

Citations: | 17 - 3 self |

### BibTeX

@INPROCEEDINGS{Crestani99softinformation,

author = {Fabio Crestani and Gabriella Pasi},

title = {Soft Information retrieval: applications of fuzzy set theory and neural networks},

booktitle = {Neuro-fuzzy Techniques for Intelligent Information Systems},

year = {1999},

pages = {287--315},

publisher = {Physica Verlag (Springer Verlag)}

}

### Abstract

This paper presents a short survey of fuzzy and neural approaches to Information Retrieval. The goal of such approaches is to define flexible Information Retrieval Systems able to deal with the inherent vagueness and uncertainty of the retrieval process. In this survey we address if and how some approaches met their goal.

### Citations

3651 |
Fuzzy sets
- Zadeh
- 1965
Citation Context ...al networks) theory. The use of fuzzy set or connectionist techniques in IR has been recently referred to as Soft Information Retrieval in analogy with the area called Soft Computing. Fuzzy set theory =-=[64]-=- is a formal framework well suited to model vagueness: in IR it has been successfully employed at several levels [29, 60], in particular for the definition of a superstructure of the Boolean model, wit... |

3392 |
Introduction to Modern Information Retrieval
- Salton, McGill
- 1983
Citation Context ...of index terms and logical connectives (e.g. AND, OR, NOT). A document is considered relevant and retrieved by the IRS if it satisfies the logical formula representing the query. The Vector Space model =-=[50]-=- is based on a spatial interpretation of both documents and queries. Here an improvement of the document representation over the Boolean model is obtained by associating with each index term a numeri... |

1907 |
Introduction to the theory of neural computation
- Hertz, Krogh, et al.
- 1991
Citation Context ...tant paradigms of learning used in the NN field: supervised learning and unsupervised learning. We refer to the NN literature for a detailed explanation of these two learning paradigms (see for example =-=[23, 55]-=-). 4.1 Supervised Learning Techniques A supervised learning procedure is a process which incorporates an "external teacher". This means that the teacher specifies the desired output of the NN. During t... |

909 | The concept of a linguistic variable and its application to approximate reasoning
- Zadeh
- 1975
Citation Context ...nt, etc. To this aim Bordogna and Pasi [8] have defined a fuzzy retrieval model in which the linguistic descriptors are formalised within the framework of fuzzy set theory through linguistic variables =-=[65]-=-. A $\langle t, l \rangle$ pair identifies a qualitative selection criterion, where t is a term and l is a value belonging to the term set of the linguistic variable Importance, which has a base variable ranging over th... |

564 |
On ordered weighted averaging aggregation operators in multicriteria decision making
- Yager
- 1988
Citation Context ...inguistic quantifier indicates the number of documents' sections in which a term must be present to be considered fully significant. Linguistic quantifiers have been formalised by means of OWA operators =-=[61]-=-. This fuzzy representation of structured documents has been implemented and evaluated, showing that it improves the effectiveness of a system with respect to the use of the traditional fuzzy represent... |

231 |
Adaptive Pattern Recognition and Neural Networks
- Pao
- 1989
Citation Context ...different NN architectures and learning algorithms, has been created to demonstrate potential applications of NN in the field of IR. The pattern clustering algorithm developed by M.S. Klassen and Y. Pao =-=[45]-=- is incorporated into Mnemosine. By encoding lexicon terms into sparse numeric matrices which become the inputs to the clustering module, it is possible to produce clusters of lexically related terms ... |

204 |
Extended Boolean Information Retrieval
- Salton, Fox, et al.
- 1983
Citation Context ...erned the definition of softer aggregation operators. To this aim new definitions of aggregation operators have been proposed; for example, Salton, Fox and Wu proposed a model based on a p-norm operator =-=[49]-=-. Hayashi [22], Sanchez [51] and Paice [44] consider "soft" Boolean operators weighted between the AND and the OR as a compromise. However, in these approaches different soft interpretations of the Boo... |

188 |
A Self-Organising Semantic Map for Information Retrieval
- Lin, Soergel, et al.
- 1991
Citation Context ...effectiveness to hierarchic (sequential) clustering algorithms. Another clustering effort based on a fully distributed data representation scheme was performed by X. Lin, D. Soergel, and G. Marchionini =-=[32]-=-. They used a Kohonen feature map for clustering documents of artificial intelligence literature: 140 titles of documents were used and 25 index terms were extracted. Documents were represented by vect... |

129 | Application of spreading activation techniques in information retrieval
- Crestani
- 1997
Citation Context ...lications of the size of "real world" applications. Besides, many research papers claim to be about applications of NN to IR, while they are often applications of spreading activation techniques (see =-=[19]-=- for a few examples). In this section, we summarise some of the most innovative research in the application of NN and connectionist models to IR. We are aware that a number of interesting and recent wor... |

94 |
Fuzzy sets in Information Retrieval and Cluster Analysis
- Miyamoto
- 1990
Citation Context ...and precision. These approaches are not described here, but the interested reader can read, among others, the contributions in [33, 5, 6] relating to the former applications, and the contributions in =-=[14, 38]-=- relating to the latter applications. 3.1 Extended Boolean models: fuzzy document representations A first natural extension of the Boolean model is to represent a document as a fuzzy set of terms, thus m... |

84 |
An Extended Fuzzy Linguistic Approach to Generalize Boolean Information Retrieval," accepted by
- Pasi
- 1994
Citation Context ...evels [29, 60], in particular for the definition of a superstructure of the Boolean model, with the appealing consequence that existing Boolean IRSs can be improved without redesigning them completely =-=[10, 11, 13]-=-. Through these extensions the gradual nature of relevance of documents to user queries can be modelled. A different approach is based on the application of the connectionist theory [48] to IR. Neural ... |

80 |
Adaptive Information Retrieval: Using a Connectionist Representation to Retrieve and Learn about Documents
- Belew
- 1989
Citation Context ...d with those produced by conventional IRS. Another not dissimilar approach is that of a document retrieval system implemented on a Connection Machine (CM) by C. Stanfill and B. Kahle [57]. R. K. Belew =-=[2, 3]-=- investigated in depth the use of various NN techniques in an IRS called AIR. AIR's structure is made of three layers (see Figure 4.2). Nodes on the first layer represent descriptors; nodes in the secon... |

66 |
the PDP Research Group. Parallel distributed processing. Vol I
- Rumelhart, McClelland
- 1986
Citation Context ...pletely [10, 11, 13]. Through these extensions the gradual nature of relevance of documents to user queries can be modelled. A different approach is based on the application of the connectionist theory =-=[48]-=- to IR. Neural networks have been used in this context to design and implement IRSs that are able to adapt to the characteristics of the IR environment, and in particular to the user's interpretation ... |

64 | "Is this document relevant? ... probably": A survey of probabilistic models in information retrieval
- Crestani, Lalmas, et al.
- 1998
Citation Context ...n and uncertainty independently of the application domain. The longest-standing set of approaches belonging to this class goes under the name of Probabilistic IR =-=[20]-=-. The aim of Probabilistic IR is to develop ad hoc models able to cope with the uncertainty of the retrieval process. However, there is another set of approaches receiving increasing interest that aims a... |

61 |
Progress in the application of Natural Language Processing to Information Retrieval tasks. The Computer Journal
- Smeaton
- 1992
Citation Context ...h in IR has aimed at modelling the vagueness and uncertainty which invariably characterise the management of information. A first class of approaches is based on methods of analysis of natural language =-=[56]-=-. The main limitation of these methods is the depth of the analysis of the language, and their consequent range of applicability: a satisfying interpretation of the documents' meaning need... |

58 |
Parallel free-text search on the connection machine system
- Stanfill, Kahle
- 1986
Citation Context ...t could be compared with those produced by conventional IRS. Another not dissimilar approach is that of a document retrieval system implemented on a Connection Machine (CM) by C. Stanfill and B. Kahle =-=[57]-=-. R. K. Belew [2, 3] investigated in depth the use of various NN techniques in an IRS called AIR. AIR's structure is made of three layers (see Figure 4.2). Nodes on the first layer represent descriptors... |

40 |
A neural network for probabilistic information retrieval
- Kwok
- 1989
Citation Context ...word frequency in the initial indexing and the opinions of the users. K. L. Kwok attempted to use the NN paradigm to reformulate the probabilistic model of IR with single terms as document components =-=[30, 31]-=-. The model proposed by Kwok is represented in Figure 4.3. It is a three layer... |

40 |
A mathematical model of a weighted Boolean retrieval system
- Waller, Kraft
- 1979
Citation Context ...rmation Retrieval in analogy with the area called Soft Computing. Fuzzy set theory [64] is a formal framework well suited to model vagueness: in IR it has been successfully employed at several levels =-=[29, 60]-=-, in particular for the definition of a superstructure of the Boolean model, with the appealing consequence that existing Boolean IRSs can be improved without redesigning them completely [10, 11, 13]. T... |

36 |
A note on weighted queries in information retrieval systems, Journal of the American Society for Information Science 38
- Yager
- 1987
Citation Context ... weights implies a different definition of function g. Some authors, among them Radecki, Bookstein, and Yager, have interpreted query weights as indicators of the relative importance among terms in a query =-=[7, 46, 62]-=-. The problem with this semantics however is related to its dependence on the type of the aggregation operator which connects pairs of selection criteria. When using an AND, for example, a very smal... |

32 |
Fuzzy Set Theoretical Approach to Document Retrieval
- Radecki
- 1979
Citation Context ...cument representations A first natural extension of the Boolean model is to represent a document as a fuzzy set of terms, thus making the description of the document's information contents more accurate =-=[46]-=-. For each term associated with a document a numeric weight is specified (the membership degree), which expresses the level of concern of the term with respect to the information contained in the docum... |

27 |
Fuzzy Set and Generalized Boolean Retrieval Systems
- Kraft, Buell
- 1993
Citation Context ...rmation Retrieval in analogy with the area called Soft Computing. Fuzzy set theory [64] is a formal framework well suited to model vagueness: in IR it has been successfully employed at several levels =-=[29, 60]-=-, in particular for the definition of a superstructure of the Boolean model, with the appealing consequence that existing Boolean IRSs can be improved without redesigning them completely [10, 11, 13]. T... |

26 |
Query term weights as constraints in fuzzy information retrieval
- Bordogna, Carrara, et al.
- 1991
Citation Context ...evels [29, 60], in particular for the definition of a superstructure of the Boolean model, with the appealing consequence that existing Boolean IRSs can be improved without redesigning them completely =-=[10, 11, 13]-=-. Through these extensions the gradual nature of relevance of documents to user queries can be modelled. A different approach is based on the application of the connectionist theory [48] to IR. Neural ... |

26 | A connectionist view on document classification
- Merkl
- 1995
Citation Context ...to the one of a real information retrieval system. Moreover, there is no evaluation of the implemented system, not only for the quality of clustering but also for the retrieval effectiveness. D. Merkl =-=[36, 37]-=- used the self-organising map for clustering software library documents of the National Institute of Health (NIH, a US government organisation) class library (NIHCL). NIHCL is a collection of classes ... |

22 |
Neural networks in natural language processing and information retrieval. The Netherlands: North-Holland
- Scholtes
- 1993
Citation Context ...uzzy cognitive maps with competitive differential Hebbian learning and ART might also be employed to support hypertexts. J.C. Scholtes made a very extensive survey of the application of NN. Moreover, =-=[52]-=- contains a chapter that presents an implemented NN method for free-text search. A specific interest (a query) is taught to a Kohonen feature map. Subsequently, large amounts of unstructured documents ... |

21 |
Adaptive Information Retrieval: Machine Learning in Associative Networks
- Belew
- 1986
Citation Context ...d with those produced by conventional IRS. Another not dissimilar approach is that of a document retrieval system implemented on a Connection Machine (CM) by C. Stanfill and B. Kahle [57]. R. K. Belew =-=[2, 3]-=- investigated in depth the use of various NN techniques in an IRS called AIR. AIR's structure is made of three layers (see Figure 4.2). Nodes on the first layer represent descriptors; nodes in the secon... |

21 |
A model for a weighted retrieval system
- Buell, Kraft
- 1981
Citation Context ...evels [29, 60], in particular for the definition of a superstructure of the Boolean model, with the appealing consequence that existing Boolean IRSs can be improved without redesigning them completely =-=[10, 11, 13]-=-. Through these extensions the gradual nature of relevance of documents to user queries can be modelled. A different approach is based on the application of the connectionist theory [48] to IR. Neural ... |

21 |
Inductive Information Retrieval Using Parallel Distributed Computation
- Mozer
- 1984
Citation Context ...ans that the teacher specifies the desired output of the NN. During the learning phase the NN adapts the values of the weights on the connections in order to obtain the desired output [54]. M. C. Mozer =-=[41]-=- was one of the first researchers to start working on the application of NN to IR. Some of his ideas are still research ground for many researchers. The dynamic of this model was based on McClelland and ... |

18 |
Linguistic aggregation operators of selection criteria in fuzzy information retrieval
- Bordogna, Pasi
- 1995
Citation Context ...ation of some level of importance associated with the terms in a query, linguistic weights have been formalised, such as important, very important, fairly important, etc. To this aim Bordogna and Pasi =-=[8]-=- have defined a fuzzy retrieval model in which the linguistic descriptors are formalised within the framework of fuzzy set theory through linguistic variables [65]. A $\langle t, l \rangle$ pair identifies a qualitativ... |

17 |
A neural algorithm for document clustering
- Macleod, Robertson
- 1991
Citation Context ...classes of applications we will review. We will only report some of the most significant approaches and for a more complete review we refer the interested reader to [53]. K.J. MacLeod and W. Robertson =-=[34]-=- were among the first to examine in depth the suitability of current NN models for performing document clustering. The Adaptive Resonance Theory as well as Backpropagation models were examined. The cit... |

14 |
87b: Knowledge-Assisted Document Retrieval: II. The Retrieval Process
- Biswas, Marques, et al.
- 1987
Citation Context ...zzy measures for evaluating the effectiveness of IRSs, in terms of recall and precision. These approaches are not described here, but the interested reader can read, among others, the contributions in =-=[33, 5, 6]-=- relating to the former applications, and the contributions in [14, 38] relating to the latter applications. 3.1 Extended Boolean models: fuzzy document representations A first natural extension of the ... |

13 |
Importance in knowledge systems
- Sanchez
- 1989
Citation Context ...r aggregation operators. To this aim new definitions of aggregation operators have been proposed; for example, Salton, Fox and Wu proposed a model based on a p-norm operator [49]. Hayashi [22], Sanchez =-=[51]-=- and Paice [44] consider "soft" Boolean operators weighted between the AND and the OR as a compromise. However, in these approaches different soft interpretations of the Boolean connectives in the same... |

11 |
Soft evaluation of Boolean search queries in Information Retrieval systems
- Paice
- 1984
Citation Context ...perators. To this aim new definitions of aggregation operators have been proposed; for example, Salton, Fox and Wu proposed a model based on a p-norm operator [49]. Hayashi [22], Sanchez [51] and Paice =-=[44]-=- consider "soft" Boolean operators weighted between the AND and the OR as a compromise. However, in these approaches different soft interpretations of the Boolean connectives in the same query are not ... |

11 |
Higher structures in multi-criteria decision making
- Yager
- 1992
Citation Context ...e following query can be formulated: all(expert, systems) and possibly at least 1(fuzzy, ANN). The "and possibly" operator has been defined as a non-monotonic intersection =-=[63]-=- and provides a further level of softening of the retrieval mechanism, not discarding documents which satisfy only the essential criteria. 3.3 Fuzzy Thesauri of terms Associative mechanisms are defined... |

8 |
Performance measurement in a fuzzy retrieval environment
- Buell, Kraft
- 1981
Citation Context ...and precision. These approaches are not described here, but the interested reader can read, among others, the contributions in [33, 5, 6] relating to the former applications, and the contributions in =-=[14, 38]-=- relating to the latter applications. 3.1 Extended Boolean models: fuzzy document representations A first natural extension of the Boolean model is to represent a document as a fuzzy set of terms, thus m... |

8 | A Model for Adaptive Information Retrieval
- Crestani, Rijsbergen
- 1997
Citation Context ...ve relevant documents that the original or the expanded one are not able to retrieve. An integration of the proposed prototype system into a more general network model for adaptive IR is presented in =-=[18]-=-. 4.2 Unsupervised Learning Techniques In unsupervised learning procedures the NN does not receive any teaching or learning feedback, but it is left to learn by itself. This procedure is also often re... |

7 |
A user-adaptive neural network supporting a rule-based relevance feedback
- Bordogna, Pasi
- 1996
Citation Context ...ern vector feed into the Kohonen self-organising map. The paper does not report effectiveness results, but reports an example of the actual subject-related document clusters produced by the system. In =-=[12]-=- G. Bordogna and G. Pasi have proposed a neural relevance feedback model based on the definition of an associative neural network, and on a rule-based mechanism to expand the query evaluation with the ... |

7 |
Comparing Neural and Probabilistic Relevance Feedback in an Interactive Information Retrieval System
- Crestani
- 1994
Citation Context ...e feedback because it uses a non-linear discriminating function. Query adaptation produced by this technique, in fact, gives performance similar to that provided by probabilistic relevance feedback =-=[16]-=-. The interesting thing is that the adapted query is most of the time quite different from its original formulation and from the expanded query produced by relevance feedback. Accordingly, the sets of d... |

7 |
FIRST: Fuzzy information retrieval system
- Lucarella, Morara
- 1991
Citation Context ...zzy measures for evaluating the effectiveness of IRSs, in terms of recall and precision. These approaches are not described here, but the interested reader can read, among others, the contributions in =-=[33, 5, 6]-=- relating to the former applications, and the contributions in [14, 38] relating to the latter applications. 3.1 Extended Boolean models: fuzzy document representations A first natural extension of the ... |

7 |
Fuzzy Information Retrieval Based on a Fuzzy Pseudothesaurus
- Miyamoto, Nakayama
- 1986
Citation Context ...rse relation; the relation related term (RT) is defined to exploit synonyms or near-synonyms. Fuzzy thesauri have been defined in order to express the strength in the association between pairs of terms =-=[38, 39, 42]-=-. The first works on fuzzy thesauri introduced the notion of fuzzy relations to represent associations between terms. Kohout, Keravanou, and Bandler [27] consider a synonym link to be a fuzzy binary rel... |

6 |
Application of the Interactive Activation Model to Document Retrieval
- Bein, Smolensky
- 1988
Citation Context ... expresses the level of concern of the term with respect to the information contained in the document; formally, the function defining the relation between documents and terms is defined as: $F : D \times T \to [0, 1]$. A document is then represented as a fuzzy set of terms, $\{\mu(t)/t\}$, in which $\mu(t) = F(d, t)$. The fuzzy document representation is then based on the definition of a weighted indexing function, which for... |

6 |
Application of Neural Networks to Information Retrieval
- Kwok
- 1990
Citation Context ...word frequency in the initial indexing and the opinions of the users. K. L. Kwok attempted to use the NN paradigm to reformulate the probabilistic model of IR with single terms as document components =-=[30, 31]-=-. The model proposed by Kwok is represented in Figure 4.3. It is a three layer... |

5 |
A fuzzy representation of HTML documents for information retrieval system
- Molinari, Pasi
- 1996
Citation Context ...ri and Pasi have proposed an approach to index documents written in HTML (HyperText Markup Language), in which another kind of structure is exploited, based on the syntactic structure of the language =-=[40]-=-. The basic assumption is that, when writing a document in HTML, one associates a different importance with different documents' subparts, by delimiting them by means of appropriate tags. For example, i... |

4 |
Controlling retrieval through a user-adaptive representation of documents
- Bordogna, Pasi
- 1995
Citation Context ...bstract and introduction should be first analysed. To address this problem Bordogna and Pasi have proposed a fuzzy representation of structured documents, which can be biased by user's interpretation =-=[9]-=-. The significance of a term t in a given document d is computed by first evaluating the significance of t in each of the n sections; this is done by means of the application of a function Fci which has t... |

4 |
A Proposal of Fuzzy Connectives with Learning Function Using Steepest Method
- Hayashi, Wakami
- 1991
Citation Context ...ition of softer aggregation operators. To this aim new definitions of aggregation operators have been proposed; for example, Salton, Fox and Wu proposed a model based on a p-norm operator [49]. Hayashi =-=[22]-=-, Sanchez [51] and Paice [44] consider "soft" Boolean operators weighted between the AND and the OR as a compromise. However, in these approaches different soft interpretations of the Boolean connectiv... |

4 |
Fuzzy Query Processing Using Clustering Techniques
- Kamel, Hadfield, et al.
- 1990
Citation Context ... fuzzy clustering is the fact that the degree of fuzziness is controlled. Several researchers have worked on fuzzy clustering for retrieval, including Kamel et al. =-=[26]-=-, Miyamoto [38], and De Mantaras et al. [21]. 4. Application of Neural Networks to Information Retrieval The application of connectionist models to Information Retrieval (IR) is not a recent pheno... |

3 |
Transitive closures of fuzzy thesauri for information retrieval systems,” Int
- Bezdek, Huang
- 1986
Citation Context ...han term $t_1$) by means of fuzzy implication. Miyamoto and Nakayama [39] have introduced the concept of fuzzy pseudo-thesauri and fuzzy associations based on a citation index. Bezdek, Biswas, and Huang =-=[4]-=- generate a thesaurus based on the max-star transitive closure for linguistic completion of a thesaurus generated initially by an expert linking terms. Miyamoto has introduced the following definition o... |

3 | A natural language information retrieval system with extensions towards fuzzy reasoning
- Bolc, Kowalski, et al.
- 1985
Citation Context ...zzy measures for evaluating the effectiveness of IRSs, in terms of recall and precision. These approaches are not described here, but the interested reader can read, among others, the contributions in =-=[33, 5, 6]-=- relating to the former applications, and the contributions in [14, 38] relating to the latter applications. 3.1 Extended Boolean models: fuzzy document representations A first natural extension of the ... |

3 |
Domain Knowledge Acquisition for Information Retrieval Using Neural Networks
- Crestani
- 1994
Citation Context ...ion - specialisation relationship) the strength of the connection between a pair of terms is different according to the direction under consideration. A different approach is followed by F. Crestani. In =-=[17]-=- an approach to using NN as a black box to acquire domain knowledge from an IR application is investigated with a series of experiments. Crestani developed a prototype adaptive IR system. The use of t... |

3 |
Document retrieval using a neural network
- Hingston, Wilkinson
- 1990
Citation Context ...t are not part of the user specified query and among documents which are not directly associated with the descriptors in the user query. Also P. Hingston and R. Wilkinson =-=[24]-=- continued in the refinement of Mozer's ideas. The architecture of their model is, in fact, almost the same as the one proposed initially by Mozer. The major contribution of their work is in the propos... |

3 |
Connectionist learning in constructing thesaurus-like knowledge structure
- Jung, Raghavan
- 1990
Citation Context ...arning process makes this approach impracticable for real size collections. Other attempts to combine sound classical IR techniques with NN can be found in the work of G. S. Jung and V. V. Raghavan =-=[25]-=-. They attempted to marry the Vector Space model with learning paradigms of the connectionist model. The main contribution of their work concerns the construction of a thesaurus-like knowledge represe... |