Building a reusable test collection for question answering
- JASIST
- Cited by 8 (0 self)
In contrast to traditional information retrieval systems, which return ranked lists of documents that users must manually browse through, a question answering system attempts to directly answer natural language questions posed by the user. Although such systems possess language processing capabilities, they still rely on traditional document retrieval techniques to generate an initial candidate set of documents. In this paper, we argue that document retrieval for question answering represents a different task than retrieving documents in response to more general retrospective information needs. Thus, to guide future system development, specialized question answering test collections must be constructed. We have shown that the current evaluation resources have major shortcomings, and to remedy the situation, we have manually created a small, reusable question answering test collection for research purposes. This article describes our methodology for building this test collection and discusses issues we encountered along the way regarding the notion of “answer correctness.”
Entropy-biased Models for Query Representation On The Click Graph
, 2009
- Cited by 7 (0 self)
Query log analysis has received substantial attention in recent years, and the click graph is an important technique for describing the relationship between queries and URLs. State-of-the-art approaches that model the click graph from raw click frequencies, however, do not eliminate noise, nor do they handle heterogeneous query-URL pairs well. In this paper, we investigate and develop a novel entropy-biased framework for modeling click graphs. The intuition behind this model is that query-URL pairs should be treated differently, i.e., common clicks on less frequent but more specific URLs are of greater value than common clicks on frequent and general URLs. Based on this intuition, we utilize the entropy information of the URLs and introduce a new concept, namely the inverse query frequency (IQF), to weigh the importance (discriminative ability) of a click on a certain URL. The IQF weighting scheme has never been explicitly explored or statistically examined for any bipartite graphs in the information retrieval literature. We not only formally define and quantify this scheme, but also combine it with the click frequency and user frequency information on the click graph for an effective query representation. To illustrate our methodology, we conduct experiments with the AOL query log data for query similarity analysis and query suggestion tasks. Experimental results demonstrate that considerable improvements in performance are obtained with our entropy-biased models.
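The intuition in this abstract can be illustrated with a minimal sketch. The abstract does not give the IQF formula, so the code assumes, by analogy with IDF, that IQF(u) = log(|Q| / n(u)), where |Q| is the number of distinct queries and n(u) is the number of distinct queries with a click on URL u. The function names and the toy click log are hypothetical, and the user-frequency component mentioned in the abstract is left out.

```python
import math
from collections import defaultdict

# Hypothetical toy click log: (query, url) -> click count.
clicks = {
    ("jaguar car", "portal.example"): 5,
    ("jaguar car", "jaguar-cars.example"): 3,
    ("jaguar price", "portal.example"): 4,
    ("jaguar price", "jaguar-cars.example"): 2,
    ("jaguar cat", "portal.example"): 6,
    ("jaguar cat", "bigcats.example"): 1,
}

def iqf_weights(clicks):
    """Assumed IQF: log(number of queries / number of queries clicking the URL)."""
    queries = {q for q, _ in clicks}
    queries_per_url = defaultdict(set)
    for q, u in clicks:
        queries_per_url[u].add(q)
    return {u: math.log(len(queries) / len(qs)) for u, qs in queries_per_url.items()}

def query_vectors(clicks):
    """Represent each query as a sparse vector of click frequency * IQF."""
    iqf = iqf_weights(clicks)
    vectors = defaultdict(dict)
    for (q, u), count in clicks.items():
        vectors[q][u] = count * iqf[u]
    return dict(vectors)

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(u, 0.0) for u, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

vectors = query_vectors(clicks)
print(cosine(vectors["jaguar car"], vectors["jaguar price"]))
print(cosine(vectors["jaguar car"], vectors["jaguar cat"]))
```

In this toy log every query clicks the general portal URL, so under the assumed formula its IQF is zero and it no longer makes "jaguar car" and "jaguar cat" look similar, whereas raw click counts would.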
S.: Term-weighting for summarization of multi-party spoken dialogues
- In: Proc. of MLMI 2007
, 2007
- Cited by 6 (5 self)
This paper explores the issue of term-weighting in the genre of spontaneous, multi-party spoken dialogues, with the intent of using such term-weights in the creation of extractive meeting summaries. The field of text information retrieval has yielded many term-weighting techniques to import for our purposes; this paper implements and compares several of these, namely tf.idf, Residual IDF and Gain. We propose that term-weighting for multi-party dialogues can exploit patterns in word usage among participant speakers, and introduce the su.idf metric as one attempt to do so. Results for all metrics are reported on both manual and automatic speech recognition (ASR) transcripts, and on both the ICSI and AMI meeting corpora.
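Two of the metrics named in this abstract can be sketched as follows. This is a minimal illustration using the common textbook definitions (tf.idf as term frequency times -log2(df/N), and Residual IDF as observed IDF minus the IDF predicted by a Poisson model with rate cf/N); the paper's exact variants may differ. The toy "documents" are hypothetical stand-ins for tokenized meeting segments. Gain and su.idf are not shown because the abstract does not define them.

```python
import math
from collections import Counter

# Hypothetical tokenized meeting segments.
documents = [
    ["budget", "budget", "budget", "meeting"],
    ["meeting", "remote"],
    ["meeting", "design"],
    ["design", "remote", "meeting"],
]

def term_statistics(documents):
    """Return (document frequency, collection frequency) counters."""
    df, cf = Counter(), Counter()
    for tokens in documents:
        cf.update(tokens)       # every occurrence counts
        df.update(set(tokens))  # each document counts once per term
    return df, cf

def idf(term, df, n_docs):
    """Inverse document frequency in bits."""
    return -math.log2(df[term] / n_docs)

def residual_idf(term, df, cf, n_docs):
    """Observed IDF minus the IDF expected under a Poisson model."""
    rate = cf[term] / n_docs  # average occurrences per document
    expected_idf = -math.log2(1.0 - math.exp(-rate))
    return idf(term, df, n_docs) - expected_idf

def tf_idf(tokens, df, n_docs):
    """tf.idf weight for each term of one document."""
    tf = Counter(tokens)
    return {t: tf[t] * idf(t, df, n_docs) for t in tf}

df, cf = term_statistics(documents)
n = len(documents)
print(tf_idf(documents[0], df, n))
print(residual_idf("budget", df, cf, n), residual_idf("remote", df, cf, n))
```

In this toy data "budget" is bursty (three occurrences in a single segment), so its Residual IDF is higher than that of "remote", which is spread evenly over two segments.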
An Exploration of the Principles Underlying Redundancy-Based Factoid Question Answering
, 2007
- Cited by 5 (1 self)
The so-called “redundancy-based” approach to question answering represents a successful strategy for mining answers to factoid questions such as “Who shot Abraham Lincoln?” from the World Wide Web. Through contrastive and ablation experiments with Aranea, a system that has performed well in several TREC QA evaluations, this work examines the underlying assumptions and principles behind redundancy-based techniques. Specifically, we develop two theses: that stable characteristics of data redundancy allow factoid systems to rely on external “black box” components, and that despite embodying a data-driven approach, redundancy-based methods encode a substantial amount of knowledge in the form of heuristics. Overall, this work attempts to address the broader question of “what really matters” and to provide guidance for future researchers.
Relevance information: A loss of entropy but a gain for idf
- In Proceedings of SIGIR
, 2005
- Cited by 5 (0 self)
When investigating alternative estimates for term discriminativeness, we discovered that relevance information and idf are much more closely related than formulated in classical literature. Therefore, we revisited the justification of idf as it follows from the binary independent retrieval (BIR) model. The main result is a formal framework uncovering the close relationship of a generalised idf and the BIR model. The framework makes explicit how to incorporate relevance information into any retrieval function that involves an idf-component. In addition to the idf-based formulation of the BIR model, we propose Poisson-based estimates as an alternative to the classical estimates, this being motivated by the superiority of Poisson-based estimates for the within-document term frequencies. The main experimental finding is that a Poisson-based idf is superior to the classical idf, where the superiority is particularly evident for long queries.
Context-Based Matching and Ranking of Web Services for Composition
- Cited by 4 (1 self)
In this work we propose a two-step, context-based semantic approach to the problem of matching and ranking web services for possible service composition. We present an analysis of different methods for classifying Web services for possible composition and supply a context-based semantic matching method for ranking these possibilities. Semantic understanding of Web services may provide added value by identifying new possibilities for compositions of services. The semantic matching ranking approach is unique since it provides the Web service designer with an explicit numeric estimation of the extent to which a possible composition “makes sense.” First, we analyze two common methods for text processing, TF/IDF and context analysis, and two types of service description, free text and WSDL. Second, we present a method for evaluating the proximity of services for possible compositions. Each Web service WSDL context descriptor is evaluated according to its proximity to other services’ free text context descriptors. The methods were tested on a large repository of real-world Web services. The experimental results indicate that context analysis is more useful than TF/IDF. Furthermore, the method evaluating the proximity of the WSDL description to the textual description of other services provides high recall and precision results.
A comparison of document, sentence, and term event spaces
- In: ACL (2006)
- Cited by 3 (0 self)
The trend in information retrieval systems is from document to sub-document retrieval, such as sentences in a summarization system and words or phrases in a question-answering system. Despite this trend, systems continue to model language at a document level using the inverse document frequency (IDF). In this paper, we compare and contrast IDF with inverse sentence frequency (ISF) and inverse term frequency (ITF). A direct comparison reveals that all language models are highly correlated; however, the average ISF and ITF values are 5.5 and 10.4 higher than IDF. All language models appeared to follow a power law distribution with a slope coefficient of 1.6 for documents and 1.7 for sentences and terms. We conclude with an analysis of IDF stability with respect to random, journal, and section partitions of the 100,830 full-text scientific articles in our experimental corpus.
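The three event spaces compared in this abstract can be illustrated with a small sketch. The abstract does not state the formulas, so the code assumes each measure is log(number of events / number of events containing the term), where an event is a document (IDF), a sentence (ISF), or a single token (ITF). The toy corpus and the function names are hypothetical.

```python
import math

# Hypothetical corpus: documents -> sentences -> tokens.
corpus = [
    [["retrieval", "models", "rank", "documents"], ["retrieval", "is", "hard"]],
    [["summaries", "select", "sentences"], ["models", "score", "sentences"]],
]

def inverse_frequency(n_events, n_matching):
    """Assumed form: log of total events over events containing the term."""
    return math.log(n_events / n_matching)

def event_space_weights(corpus, term):
    """Compute IDF, ISF, and ITF for one term over the same corpus."""
    sentences = [s for doc in corpus for s in doc]
    tokens = [t for s in sentences for t in s]
    docs_with = sum(any(term in s for s in doc) for doc in corpus)
    sents_with = sum(term in s for s in sentences)
    occurrences = tokens.count(term)
    if occurrences == 0:
        raise ValueError(f"term {term!r} does not occur in the corpus")
    return {
        "idf": inverse_frequency(len(corpus), docs_with),
        "isf": inverse_frequency(len(sentences), sents_with),
        "itf": inverse_frequency(len(tokens), occurrences),
    }

print(event_space_weights(corpus, "models"))
print(event_space_weights(corpus, "retrieval"))
```

Under these assumed definitions, finer-grained event spaces yield larger values for the same term, since the number of events grows faster than the number of events containing the term. In the toy corpus "models" has an IDF of zero but positive ISF and ITF.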
IDF term weighting and IR research lessons
- Journal of Documentation
, 2004
- Cited by 3 (0 self)
Robertson comments on the theoretical status of IDF term weighting. Its history illustrates how ideas develop in a specific research context, in theory/experiment interaction, and in operational practice. It is an honour to have the small proposal for term weighting that I published more than 30 years ago the subject of Stephen Robertson's (2004) paper. I would like to comment on some points that I see as suggesting lessons for information retrieval research. First, the context that prompted the proposal. The proposal came from trying to explain why earlier ideas about how to do automatic indexing did not work. They were plausible in themselves, but had quite different objectives. My previous research had concentrated on automatic methods for constructing term classifications intended, by analogy with manual thesauri, as recall-promoting devices. Classes were based on term cooccurrences in documents, following the generic statistical approach to retrieval initially suggested by Luhn (see Schultz, 1968), and applied within the coordination level matching framework. But these classifications, on Cleverdon's (1967) Cranfield data and using the test and evaluation methods that he and Salton (1968) had
Who’s Who in the World Wide Web: Approaches to Name Disambiguation
, 2007
- Cited by 3 (0 self)
Personal names are fundamental in our civilization. Names serve to refer to individuals, but in contrast to e.g. social assurance numbers that are unique for each citizen, names are not treated that strictly. Names do not identify persons in a non-ambiguous way. There are people that share the same name. Furthermore, the fact that names are not treated as carefully as numbers often leads to misspellings and confusion.
The increasing importance of the Web in our lives means that people are more frequently confronted with names of things, places, and persons. On the one hand, the Web provides access to more information sources and thereby to more information; on the other hand, the search for relevant information is getting more difficult. A particular problem occurs in (digital) libraries. They are expected to catalog publications in a convenient way that facilitates and supports literature research. It is necessary to distinguish authors that are referred to by their names. The problem of homonymous authors arises, i.e., there may be several authors sharing a name. Conversely, authors may publish under different names or name variations, deliberately or unintentionally. Digital libraries spend much effort on the disambiguation of author names.
This work reports the results of a literature survey focusing on disambiguation of homonymous authors and proposes a different perception of the data that is processed during name disambiguation. Different pieces of information are integrated into a graph-based approach. An implementation of some ideas of the presented approach serves as a proof of concept and gives further insights into the nature of name disambiguation.
Deriving TF-IDF as a Fisher kernel
- In 12th International Conference on String Processing and Information Retrieval (SPIRE)
, 2005
- Cited by 2 (0 self)
The Dirichlet compound multinomial (DCM) distribution has recently been shown to be a good model for documents because it captures the phenomenon of word burstiness, unlike standard models such as the multinomial distribution. This paper investigates the DCM Fisher kernel, a function for comparing documents derived from the DCM. We show that the DCM Fisher kernel has components that are similar to the term frequency (TF) and inverse document frequency (IDF) factors of the standard TF-IDF method for representing documents. Experiments show that the DCM Fisher kernel performs better than alternative kernels for nearest-neighbor document classification, but that the TF-IDF representation still performs best.

