A word clustering approach for language model-based sentence retrieval in question answering systems (2009)

by S Momtazi, D Klakow
Venue:In Proceedings of CIKM

Results 1 - 7 of 7

Yahoo! Answers for Sentence Retrieval in Question Answering

by Saeedeh Momtazi, Dietrich Klakow
Question answering systems, which automatically search for a user’s information need, are considered a separate issue from community-generated question answering, in which users’ questions are answered by human respondents. Although the two kinds of answering system have different applications, both aim to present a correct answer to the user’s question, and consequently they can feed each other to improve their performance and efficiency. In this paper, we propose a new idea: using the information derived from a community question answering forum in an automatic question answering system. To this end, two different frameworks, namely the class-based model and the trained trigger model, have been used in a language model-based sentence retrieval system. Both models try to capture word relationships from the question-answer sentence pairs of a community forum. Using a standard TREC question answering dataset, we evaluate our proposed models on the subtask of sentence retrieval, while training the models on the Yahoo! Answers corpus. Results show that both methods trained on Yahoo! Answers logs significantly outperform the unigram model: the class-based model achieved a 4.72% relative improvement in mean average precision, and the trained trigger model achieved an 18.10% relative improvement in the same metric. The combination of both proposed models improved the system’s mean average precision by 19.29%.

Effective Term Weighting for Sentence Retrieval

by Saeedeh Momtazi, Matthew Lease, Dietrich Klakow
A well-known challenge of information retrieval is how to infer a user’s underlying information need when the input query consists of only a few keywords. Question Answering (QA) systems face an equally important but opposite challenge: given a verbose question, how can the system infer the relative importance of terms in order to differentiate the core information need from supporting context? We investigate three simple term-weighting schemes for such estimation within the language modeling retrieval paradigm [6]. While the three schemes described are ad hoc, they address a principled estimation problem underlying the standard word unigram model. We also show that these schemes enable better estimation of a state-of-the-art class model based on term clustering [5]. Using a TREC QA dataset, we evaluate the three weighting schemes for both word and class models on the QA subtask of sentence retrieval. Our inverse sentence frequency weighting scheme achieves over 5% absolute improvement in mean average precision for the standard word model and nearly 2% absolute improvement for the class model.
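
The abstract above names an inverse sentence frequency weighting but does not give its formula. The following is a minimal sketch, assuming the weight takes the familiar IDF-like form log(N / n_t) computed over sentences and is applied as a multiplier on each query term's smoothed log-probability. The function names, the Jelinek-Mercer smoothing, and the toy sentences are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def isf_weights(sentences):
    """Inverse sentence frequency log(N / n_t) per term (assumed form)."""
    n_sentences = len(sentences)
    sentence_freq = Counter()
    for tokens in sentences:
        sentence_freq.update(set(tokens))  # count each term once per sentence
    return {t: math.log(n_sentences / n_t) for t, n_t in sentence_freq.items()}


def weighted_score(query, sentence, coll_counts, coll_len, weights, lam=0.5):
    """ISF-weighted query log-likelihood with Jelinek-Mercer smoothing."""
    sent_counts = Counter(sentence)
    total = 0.0
    for term in query:
        p_sent = sent_counts[term] / len(sentence)
        p_coll = coll_counts[term] / coll_len
        p = lam * p_sent + (1.0 - lam) * p_coll
        if p > 0.0:  # skip terms never seen in the collection
            total += weights.get(term, 0.0) * math.log(p)
    return total


# Toy collection (hypothetical data, for illustration only).
sentences = [["the", "cat", "sat"], ["the", "dog", "ran"], ["a", "cat", "ran"]]
weights = isf_weights(sentences)
coll_counts = Counter(t for s in sentences for t in s)
coll_len = sum(coll_counts.values())

query = ["the", "dog"]
ranked = sorted(
    sentences,
    key=lambda s: weighted_score(query, s, coll_counts, coll_len, weights),
    reverse=True,
)
print(ranked[0])
```

Under this assumed form, a frequent term such as "the" receives a small weight, so rarer query terms such as "dog" dominate the ranking.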

and Speech Processing

by Saeedeh Momtazi, Sanjeev Khudanpur, Dietrich Klakow
dietrich.klakow

Integrating history-length interpolation and classes in language modeling

by Hinrich Schütze
Building on earlier work that integrates different factors in language modeling, we view (i) backing off to a shorter history and (ii) class-based generalization as two complementary mechanisms of using a larger equivalence class for prediction when the default equivalence class is too small for reliable estimation. This view entails that the classes in a language model should be learned from rare events only and should be preferably applied to rare events. We construct such a model and show that both training on rare events and preferable application to rare events improve perplexity when compared to a simple direct interpolation of class-based with standard language models.
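
For orientation, the baseline that this abstract compares against, a simple direct interpolation of a class-based with a standard language model, can be sketched as below. This is not the rare-event model the paper proposes; it is only the textbook baseline, written for bigrams with unsmoothed maximum-likelihood estimates. The class name `InterpolatedBigram`, the fixed weight `lam`, and the toy word-to-class map are assumptions made for illustration.

```python
from collections import Counter


class InterpolatedBigram:
    """lam * P(w | v) + (1 - lam) * P(c(w) | c(v)) * P(w | c(w))."""

    def __init__(self, tokens, word2class, lam=0.5):
        classes = [word2class[w] for w in tokens]
        self.word2class = word2class
        self.lam = lam
        self.word_bigrams = Counter(zip(tokens, tokens[1:]))
        self.word_hist = Counter(tokens[:-1])  # counts of words as histories
        self.class_bigrams = Counter(zip(classes, classes[1:]))
        self.class_hist = Counter(classes[:-1])  # counts of classes as histories
        self.word_counts = Counter(tokens)
        self.class_counts = Counter(classes)

    def p_word(self, w, v):
        """Maximum-likelihood word bigram probability P(w | v)."""
        h = self.word_hist[v]
        return self.word_bigrams[(v, w)] / h if h else 0.0

    def p_class(self, w, v):
        """Class bigram probability P(c(w) | c(v)) * P(w | c(w))."""
        cw, cv = self.word2class[w], self.word2class[v]
        h = self.class_hist[cv]
        if not h:
            return 0.0
        transition = self.class_bigrams[(cv, cw)] / h
        emission = self.word_counts[w] / self.class_counts[cw]
        return transition * emission

    def prob(self, w, v):
        """Linear interpolation of the word and class estimates."""
        return self.lam * self.p_word(w, v) + (1.0 - self.lam) * self.p_class(w, v)


# Toy corpus and hand-assigned classes (hypothetical data).
tokens = "the cat sat the dog sat the cat ran".split()
word2class = {"the": "DET", "cat": "NOUN", "dog": "NOUN", "sat": "VERB", "ran": "VERB"}
model = InterpolatedBigram(tokens, word2class, lam=0.5)

# "dog ran" never occurs, so the word model gives 0, but the class model
# generalizes from NOUN -> VERB transitions and assigns it some mass.
print(model.p_word("ran", "dog"), model.prob("ran", "dog"))
```

The example shows why class-based generalization helps with unseen events: the word bigram alone assigns zero probability to "dog ran", while the interpolated estimate does not.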

Spoken

by Grzegorz Chrupała
Word classes automatically induced from distributional evidence have proved useful for many NLP tasks, including Named Entity Recognition, parsing and sentence retrieval. The Brown hard clustering algorithm is commonly used in this scenario. Here we propose to use Latent Dirichlet Allocation in order to induce soft, probabilistic word classes. We compare our approach against Brown in terms of efficiency. We also compare the usefulness of the induced Brown and LDA word classes for the semi-supervised learning of three NLP tasks: fine-grained Named Entity Recognition, Morphological Analysis and semantic Relation Classification. We show that using LDA for word class induction scales better with the number of classes than the Brown algorithm and that the resulting classes outperform Brown on the three tasks.

Trained Trigger Language Model for Sentence Retrieval in QA: Bridging the Vocabulary Gap

by Saeedeh Momtazi, Dietrich Klakow
We propose a novel language model for sentence retrieval in Question Answering (QA) systems called the trained trigger language model. This model addresses the word mismatch problem in information retrieval. The proposed model captures pairs of trigger and target words while training on a large corpus. The word pairs are extracted using both unsupervised and supervised approaches, with different notions of triggering. In addition, we study the impact of corpus size and domain for a supervised model. All notions of the trained trigger model are finally used in a language model-based sentence retrieval framework. Our experiments on the TREC QA collection verify that the proposed model significantly improves sentence retrieval performance compared to the state-of-the-art translation model and class model, which address the same problem.

Rel-grams: A Probabilistic Model of Relations in Text

by Niranjan Balasubramanian, Stephen Soderl, Oren Etzioni
(bomb; explode near; ?) (?; claim; responsibility) We introduce the Rel-grams language model, which is analogous to an n-gram model, but is computed over relations rather than over words. The model encodes the conditional probability of observing a relational tuple R, given that R′ was observed in a window of prior relational tuples. We build a database of Rel-grams co-occurrence statistics from ReVerb extractions over 1.8M newswire documents and show that a graphical model based on these statistics is useful for automatically discovering event templates. We make this database freely available and hope it will prove a useful resource for a wide variety of NLP tasks.
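
The conditional probability described in this abstract can be illustrated with plain co-occurrence counts. The sketch below assumes an unsmoothed estimate, namely the fraction of occurrences of R′ that are followed by R within the next `window` tuples of the same document. The abstract does not specify the paper's actual estimator, and the function names and example tuples here are hypothetical.

```python
from collections import Counter


def relgram_counts(docs, window):
    """Count tuples and ordered tuple pairs that co-occur within a window."""
    single = Counter()
    pair = Counter()
    for tuples in docs:
        for i, prev in enumerate(tuples):
            single[prev] += 1
            # Distinct tuples seen in the next `window` positions after prev.
            for nxt in set(tuples[i + 1 : i + 1 + window]):
                pair[(prev, nxt)] += 1
    return single, pair


def cond_prob(r, r_prev, single, pair):
    """Estimate P(R | R') as count(R' then R in window) / count(R')."""
    return pair[(r_prev, r)] / single[r_prev] if single[r_prev] else 0.0


# Hypothetical relational tuples (arg1, relation, arg2), one list per document.
A = ("bomb", "explode near", "market")
B = ("group", "claim", "responsibility")
C = ("police", "arrest", "suspect")
docs = [[A, B, C], [A, C], [A, B]]

single, pair = relgram_counts(docs, window=1)
print(cond_prob(B, A, single, pair))
```

Widening the window lets the statistics link tuples that are related but not adjacent, at the cost of admitting more incidental co-occurrences.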