Results 1 - 3 of 3
Exploiting the similarity of non-matching terms at retrieval time
- Journal of Information Retrieval, 2000
Cited by 17 (7 self)
Abstract. In classic Information Retrieval systems a relevant document will not be retrieved in response to a query if the document and query representations do not share at least one term. This problem, known as “term mismatch”, has been recognised for a long time by the Information Retrieval community and a number of possible solutions have been proposed. Here I present a preliminary investigation into a new class of retrieval models that attempt to solve the term mismatch problem by exploiting complete or partial knowledge of term similarity in the term space. Using term similarity makes it possible to enhance classic retrieval models by taking non-matching terms into account. The theoretical advantages and drawbacks of these models are presented and compared with those of other models tackling the same problem. A preliminary experimental investigation into the performance gain achieved by exploiting term similarity with the proposed models is presented and discussed.
Mixing and Merging for Spoken Document Retrieval
- in Proceedings of SIGIR, 1998
Cited by 12 (3 self)
This paper describes a number of experiments that explored the issues surrounding the retrieval of spoken documents. Two such issues were examined: first, how best to use speech recogniser output to achieve the highest retrieval effectiveness; second, the potential problems of retrieving from a so-called "mixed collection", i.e. one that contains documents from both a speech recognition system (producing many errors) and hand transcription (producing presumably near-perfect documents). The first part of the work found that merging the transcripts of multiple recognisers showed the most promise. The second part showed that the term weighting scheme used in a retrieval system was important in determining whether the system was detrimentally affected when retrieving from a mixed collection.

1 Introduction
Over the past few years the field of Information Retrieval (IR) has directed increasing interest towards the retri...
Retrieval of Spoken Documents: First Experiences
- 1997
Cited by 5 (2 self)
We report on our first experiences in dealing with the retrieval of spoken documents. Lacking the tools and the know-how for performing speech recognition on the spoken documents, we tried to make the best possible use of our knowledge of probabilistic indexing and retrieval of textual documents. The techniques we used and the results we obtained are encouraging, and motivate our involvement in further experimentation in this new area of research.

Supported by a "Marie Curie" Research Fellowship from the European Community.

Contents
1 Introduction
2 Probabilistic Information Retrieval
2.1 The binary independence retrieval model
2.2 Term weighting schemas
3 The SIRE Information Retrieval system
4 The Abbot speech recognition system
5 Spoken document retrieval at TREC-6
6 The SDR TREC-6 data set
7 Experimenting probabilistic retrieval of spoken documents
7.1 The PFT wei...

