## Using Markov Chains to Exploit Word Relationships in Information Retrieval

Citations: 7 (0 self)

### BibTeX

@MISC{Cao_usingmarkov,
    author = {Guihong Cao and Jian-yun Nie and Jing Bai},
    title = {Using Markov Chains to Exploit Word Relationships in Information Retrieval},
    year = {}
}

### Abstract

Document expansion and query expansion aim to add related terms to document and query representations in order to make them more complete. However, most previous studies are limited in two respects: they use either query expansion or document expansion, but not both; and expansion has been limited to directly related words. In this paper, we propose a more general approach: both document and query representations are expanded, and the expansion process also exploits indirect term relationships. The whole process is implemented through Markov chains. Our experiments show that each of these extensions brings additional improvements.
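The idea of reaching indirectly related terms can be illustrated with a minimal sketch. It assumes a hypothetical three-term vocabulary and made-up transition probabilities, none of which come from the paper; the function names `step` and `expand` are likewise illustrative and are not the authors' implementation. Each step of the chain pushes probability mass from a term to its direct neighbours, so a term that is only indirectly related to the query receives mass after two steps.

```python
def step(dist, transitions):
    """One Markov-chain step: move probability mass along term-to-term transitions."""
    out = {}
    for term, p in dist.items():
        # Terms without an outgoing row keep their mass (assumed self-loop).
        for nxt, q in transitions.get(term, {term: 1.0}).items():
            out[nxt] = out.get(nxt, 0.0) + p * q
    return out


def expand(initial, transitions, steps):
    """Apply `steps` transitions to an initial term distribution."""
    dist = dict(initial)
    for _ in range(steps):
        dist = step(dist, transitions)
    return dist


# Hypothetical transition probabilities (each row sums to 1).
transitions = {
    "java": {"java": 0.5, "programming": 0.5},
    "programming": {"programming": 0.5, "software": 0.5},
    "software": {"software": 1.0},
}
query = {"java": 1.0}  # assumed initial query model

one = expand(query, transitions, 1)
two = expand(query, transitions, 2)
print(one)  # {'java': 0.5, 'programming': 0.5}
print(two)  # {'java': 0.25, 'programming': 0.5, 'software': 0.25}
```

With one step, only the directly related term `programming` gains mass; `software`, which is related to the query only through `programming`, appears after the second step.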

### Citations

8094 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ... w|Q) = $\lambda P_{ml}(w \mid Q) + (1 - \lambda) P(w \mid F)$ (7) where $P(w \mid F)$ is the probability of w in F and λ is the coefficient of the original query model (set to 0.5). This feedback model can be estimated with the EM algorithm (Dempster et al., 1977) by maximizing the likelihood of feedback documents given the query model, as in (Zhai and Lafferty, 2001b). Transition Probability. To estimate the transition probability $P(w_i \mid w_j, Q)$, a first appro...

3699 | Artificial Intelligence, A Modern Approach (second edition)
- Russell, Norvig
- 2003
Citation Context ...ansion. However, this also limits the inference power and a possible connection between a document and a query can remain hidden. At this point, one can draw an analogy with the search problem in AI (Russell and Norvig, 2003). One-direction search can be limited to some steps, and this can make a possible connection between data and goal unseen. In comparison, if search is conducted in both directions: from data to goal ...

3252 | The anatomy of a large-scale hypertextual web search engine
- Brin, Page
- 1998
Citation Context ...t only inference using direct term relationships is allowed. In this paper, we further extend inference by using indirect term relationships. This is implemented using multi-stage Markov Chains (MC) (Brin and Page, 1998; Toutanova et al., 2004). Our experiments on TREC collections show that each of the above extensions will lead to consistent improvements in retrieval effectiveness, and several of them are s...

701 | A study of smoothing methods for language models applied to information retrieval
- Zhai, Lafferty
Citation Context ...ents. 1. INTRODUCTION Statistical language modeling (LM) has been widely used in information retrieval (IR) in recent years (Berger and Lafferty, 1999; Lafferty and Zhai, 2001; Ponte and Croft, 1998; Zhai and Lafferty, 2001b). One typical approach is to construct two language models, one for the query (query model) and another for the document (document model). Then the document is ranked according to the negative KL di...

320 | Relevance-based language models - Lavrenko, Croft - 2001 |

318 | Markov chains, Gibbs fields, Monte Carlo simulation, and queues
- Brémaud
- 1999
Citation Context ...ted. Conference RIAO2007, Pittsburgh PA, U.S.A. May 30-June 1, 2007 - Copyright C.I.D. Paris, France. 4.1 MC Model for Query Expansion. A Markov Chain (MC) is a stochastic process having the Markov property (Brémaud, 1999). Basically, an MC is defined by two probabilities: the initial probability to select a state, and the transition probability from one state to another. The final probability of a state is determined ...

304 | Document language models, query models, and risk minimization for information retrieval
- Lafferty, Zhai
Citation Context ...h of these extensions brings additional improvements. 1. INTRODUCTION Statistical language modeling (LM) has been widely used in information retrieval (IR) in recent years (Berger and Lafferty, 1999; Lafferty and Zhai, 2001; Ponte and Croft, 1998; Zhai and Lafferty, 2001b). One typical approach is to construct two language models, one for the query (query model) and another for the document (document model). Then the do...

271 | Information retrieval as statistical translation
- Berger, Lafferty
- 1999
Citation Context ...r experiments show that each of these extensions brings additional improvements. 1. INTRODUCTION Statistical language modeling (LM) has been widely used in information retrieval (IR) in recent years (Berger and Lafferty, 1999; Lafferty and Zhai, 2001; Ponte and Croft, 1998; Zhai and Lafferty, 2001b). One typical approach is to construct two language models, one for the query (query model) and another for the document (doc...

189 | A hidden markov model information retrieval system - Miller, Leek, et al. - 1999 |

174 | Model-based feedback in the language modeling approach to information retrieval
- Zhai, Lafferty
- 2001
Citation Context ...ade from two sources: the original query and feedback documents. Let F be the set of top N feedback documents of query Q. Then the initial state distribution can be estimated as in the mixture model (Zhai and Lafferty, 2001b): $P_0(w \mid Q) = \lambda P_{ml}(w \mid Q) + (1 - \lambda) P(w \mid F)$ (7) where $P(w \mid F)$ is the probability of w in F and λ is the coefficient of the original query model (set to 0.5). This feedback model can be estimated with EM algo...

123 | Automatic Keyword Classification for Information Retrieval - Sparck Jones, K - 1971

98 | The limitations of term co-occurrence data for query expansion in document retrieval systems
- Peat, Willett
Citation Context ... 6. Related work. Query expansion has been studied for a long time in IR (Sparck Jones, 1971). With classical IR models (e.g. vector space model), it produced variable results (Peat and Willett, 1991). Recently, several models for both document and query expansions have been proposed within the LM framework (Zhai and Lafferty, 2001b; Lavrenko and Croft, 2001; Cao et al., 2005; Bai et al., 2005). ...

73 | Dependence language model for information retrieval - Gao, Nie, et al. - 2004 |

55 | Query expansion using random walk models
- Collins-Thompson, Callan
- 2005
Citation Context ...nsion and query expansion simultaneously. Each of the above extensions has resulted in some improvements in retrieval effectiveness. MC has also been used in some previous studies in IR. For example, (Collins-Thompson and Callan, 2005) used it for query expansion. However, our method is different from theirs in several ways: first, we do not use many heuristics as in their work. We followed a more principled development and the pa...

51 | Integrating word relationships into language models
- Cao, Nie, et al.
- 2005
Citation Context ... Several studies have been conducted to relax the independence assumption (Bai et al., 2005; Berger and Lafferty, 1999; Cao et al., 2005; Lafferty and Zhai, 2001): The relationships between query terms and document terms are used to relate a document to a query, even though they contain different (but related) terms. From a broader po...

51 | Linear discriminant model for information retrieval
- Gao, Qi, et al.
- 2005
Citation Context ... generative methods to maximize the likelihood of queries (or relevant documents) (Cao et al., 2005; Zhai and Lafferty, 2001b) and discriminative methods to optimize the mean average precision (MAP) (Gao et al., 2005) on some training data. Here we try to optimize MAP. We follow the discriminative training method used in (Toutanova et al., 2004), which defines an objective function to be optimized from the coeffi...

49 | Learning Random Walk Models for Inducing Word Dependency Distributions
- Toutanova, Manning, et al.
- 2004
Citation Context ...irect term relationships is allowed. In this paper, we further extend inference by using indirect term relationships. This is implemented using multi-stage Markov Chains (MC) (Brin and Page, 1998; Toutanova et al., 2004). Our experiments on TREC collections show that each of the above extensions will lead to consistent improvements in retrieval effectiveness. This allows us to conclude that a higher inference capabi...

44 | Contextual search and name disambiguation in email using graphs
- Minkov, Cohen, et al.
- 2006
Citation Context ...r multi-step inference. A Markov Chain (MC) is a suitable mechanism to implement multi-step inferences. MC has been widely used in several previous studies (Brin and Page, 1998; Toutanova et al., 2004; Minkov et al., 2006). In the LM framework, (Lafferty and Zhai, 2001) also uses MC for query expansion. In that paper, transitions between terms are made via documents: a transition from a term to some documents, then from t...

41 | Query expansion using term relationships in language models for information retrieval
- Bai, Song, et al.
- 2005
Citation Context ...e the document and the query. Several studies have been conducted to relax the independence assumption (Bai et al., 2005; Berger and Lafferty, 1999; Cao et al., 2005; Lafferty and Zhai, 2001): The relationships between query terms and document terms are used to relate a document to a query, even though they contain dif...

35 | Towards context-sensitive information inference
- Song, Bruza
- 2003
Citation Context ...n” approach. A similar approach can also be used for query expansion. For example, (Bai et al., 2005) used co-occurrence relationships, as well as inference relationships induced by information flow (Song and Bruza, 2003), to expand the query model. Despite the fact that the above models are able to infer new terms according to term relationships, inference has been limited to one step, i.e. only directly related ter...

20 | An outline of a general model for information retrieval systems - Nie - 1988 |

7 | Context Dependent Term Relations for Information Retrieval
- Bai, Nie, et al.
- 2006
Citation Context ...nce feedback allowed us to restrict the expansion within the area of the query, a possible further improvement is to try to determine related terms to the whole query instead of to query terms as in (Bai et al. 2006). This means to extract more complex and context-dependent term relationships such as (Java, computer) → programming, instead of being limited to those between a pair of words such as Java → programming ...