## Applying Maximum Entropy to Known-Item Email Retrieval

Citations: 3 (0 self)

### BibTeX

```bibtex
@MISC{Yahyaei_applyingmaximum,
  author = {Sirvan Yahyaei and Christof Monz},
  title  = {Applying Maximum Entropy to Known-Item Email Retrieval},
  year   = {}
}
```

### Abstract

It is becoming increasingly common in information retrieval to combine evidence from multiple resources to compute the retrieval status value of documents. Although this has led to considerable improvements in several retrieval tasks, one of the outstanding issues is the estimation of the respective weights that should be associated with the different sources of evidence. In this paper we propose to use maximum entropy in combination with the limited-memory BFGS (L-BFGS) algorithm to estimate feature weights. Examining the effectiveness of our approach on the known-item finding task of the TREC enterprise track shows that it significantly outperforms a standard retrieval baseline and leads to competitive performance.
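As a hedged sketch of the approach the abstract describes — estimating feature weights with a conditional maximum entropy model trained by L-BFGS — the following uses toy per-document feature vectors and SciPy's L-BFGS-B optimizer. The feature names, data, and the small L2 prior are illustrative assumptions, not details from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Toy training data: for each query, feature vectors of candidate
# documents (rows) and the index of the single relevant ("known") item.
# Columns are illustrative features, e.g. body tf, subject tf, thread depth.
queries = [
    (np.array([[2.1, 0.0, 1.0],
               [0.3, 1.5, 0.0],
               [0.1, 0.2, 0.0]]), 0),
    (np.array([[0.2, 0.1, 1.0],
               [1.8, 2.0, 0.0]]), 1),
]

def neg_log_likelihood(w):
    """Negative conditional log-likelihood of the maxent model
    P(d | q) = exp(w . f(d, q)) / Z(q)."""
    nll = 0.05 * w @ w  # small L2 prior keeps weights finite (an assumption)
    for feats, rel in queries:
        scores = feats @ w
        m = scores.max()
        log_z = m + np.log(np.exp(scores - m).sum())  # stable log-partition
        nll -= scores[rel] - log_z
    return nll

# L-BFGS: the limited-memory quasi-Newton method due to Nocedal [10]
result = minimize(neg_log_likelihood, np.zeros(3), method="L-BFGS-B")
weights = result.x
print(weights)
```

Under the learned weights, each query's known item receives the highest score, which is exactly the ranking behavior the maxent formulation targets.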

### Citations

1078 | A Maximum Entropy Approach to Natural Language Processing
- Berger, Pietra, et al.
- 1996
Citation Context ...Maximum Entropy Modeling Statistical modeling is used to build a model to predict the behavior of a process. A labeled training set is employed to learn a model to predict future behavior of the process [1]. The first modeling task is feature selection and the second one is model selection. Firstly, a set of statistics is determined and then these statistics will be employed to construct an accurate mod...

216 | Updating quasi-Newton matrices with limited storage
- Nocedal
Citation Context ...ems, equation 1 cannot be solved analytically and numerical methods have to be used to find the optimal weights of the features. We decided to use Nocedal's limited-memory BFGS optimization algorithm [10] which is a very efficient and robust method to solve large scale optimization problems and significantly outperforms the other two optimization approaches we experimented with. 4 Maximum Entropy in I...

155 | Simple BM25 extension to multiple weighted fields
- Robertson, Zaragoza, et al.
- 2004
Citation Context ...r combination of scores. However, Ogilvie and Callan [11] have shown that their mixture language model approach outperforms various meta-search methods in almost all cases. Moreover, Robertson et al. [14] have discussed the dangers of linear combination of entire document similarity scores and criticized it in detail. To deal with the problem of combining evidence, we propose a method that addresses m...

102 | Exploring the similarity space
- Zobel, Moffat
- 1998
Citation Context ...ment (e.g., body and anchor text), different ways to compute within-document and collection term frequencies or the combination of different document similarity functions as a whole. Zobel and Moffat [16] show that it is very difficult to find a similarity measure which is best in all cases, but at the same time they show that there is still a lot of room for improvement by varying retrieval strategie...

85 | Combining document representations for known-item search
- Ogilvie, Callan
- 2003
Citation Context ...of documents d with k fields {f1, f2, ..., fk}. As mentioned, most of the work in this area is in the form of combining scores, particularly, linear combination of scores. However, Ogilvie and Callan [11] have shown that their mixture language model approach outperforms various meta-search methods in almost all cases. Moreover, Robertson et al. [14] have discussed the dangers of linear combination of ...

77 | Discriminative models for information retrieval
- Nallapati
- 2004
Citation Context ...and Ponte [4] showed that ranking formulas of the Binary Independence Model (BIR) and Combination Match Model (CMM) can be derived from the maximum entropy principle with suitable features. Nallapati [9] explored discriminative models for IR and applied maximum entropy and support vector machines to several ad-hoc retrieval test sets. However, because of the rather discouraging results in these tasks...

54 | Term proximity scoring for keyword-based retrieval systems
- Rasolofo, Savoy
- 2003
Citation Context ...ity, phrase match and first occurrence position features. Lastly, we use query independent features such as message depth in the thread. There are many different methods to calculate term proximities [13, 8]. Our term proximity feature computes the sum of minimum distances between term pairs. We chose this method of calculating proximity to avoid using features which carry similar information. For exampl...
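The proximity feature described in this context — the sum of minimum distances between query-term pairs — can be sketched as follows. The position representation and the handling of terms absent from the document are assumptions for illustration, not details from the paper.

```python
from itertools import combinations

def min_pair_distance(positions_a, positions_b):
    """Smallest absolute distance between any occurrence of two terms,
    given their token positions within a document."""
    return min(abs(a - b) for a in positions_a for b in positions_b)

def proximity_feature(query_terms, doc_positions):
    """Sum of minimum distances over all pairs of query terms that occur
    in the document; smaller sums mean the terms appear closer together."""
    present = [t for t in query_terms if doc_positions.get(t)]
    return sum(
        min_pair_distance(doc_positions[a], doc_positions[b])
        for a, b in combinations(present, 2)
    )

# Token positions of each query term in a toy document
doc = {"maximum": [3, 40], "entropy": [4], "email": [25]}
print(proximity_feature(["maximum", "entropy", "email"], doc))  # → 37
```

Here the pairwise minima are 1 (maximum/entropy), 15 (maximum/email), and 21 (entropy/email), summing to 37.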

33 | From document retrieval to question answering
- Monz
- 2003
Citation Context ...ity, phrase match and first occurrence position features. Lastly, we use query independent features such as message depth in the thread. There are many different methods to calculate term proximities [13, 8]. Our term proximity feature computes the sum of minimum distances between term pairs. We chose this method of calculating proximity to avoid using features which carry similar information. For exampl...

13 | Exploiting the maximum entropy principle to increase retrieval effectiveness
- Cooper
- 1983
Citation Context ...tion. On the other hand, there are a number of well-studied optimization algorithms such as IIS and L-BFGS for maximum entropy. There have been a few attempts to explore maximum entropy in IR. Cooper [2] applied maximum entropy to information retrieval. Kantor and Lee [5] explored the application of maximum entropy, but more recently ([6]) they reported low performance on large document collections. ...

12 | The Maximum Entropy approach and Probabilistic
- Greiff, Ponte
- 2000
Citation Context ...entropy to information retrieval. Kantor and Lee [5] explored the application of maximum entropy, but more recently ([6]) they reported low performance on large document collections. Greiff and Ponte [4] showed that ranking formulas of the Binary Independence Model (BIR) and Combination Match Model (CMM) can be derived from the maximum entropy principle with suitable features. Nallapati [9] explored ...

12 | The maximum entropy principle in information retrieval
- Kantor, Lee
- 1986
Citation Context ...on algorithms such as IIS and L-BFGS for maximum entropy. There have been a few attempts to explore maximum entropy in IR. Cooper [2] applied maximum entropy to information retrieval. Kantor and Lee [5] explored the application of maximum entropy, but more recently ([6]) they reported low performance on large document collections. Greiff and Ponte [4] showed that ranking formulas of the Binary Indep...

11 | Uniform representation of content and structure for structured document retrieval
- Lalmas
- 2000
Citation Context ...cantly outperforms a standard retrieval baseline and leads to competitive performance. 1 Introduction In several information retrieval tasks, such as web retrieval [15], structured document retrieval [7] and email retrieval [3], a number of approaches combine evidence from multiple resources to compute the retrieval status values. Typically the different sources of evidence include term frequencies w...

9 | Overview of the TREC-2005 enterprise track
- Craswell, de Vries, et al.
Citation Context ...ndard retrieval baseline and leads to competitive performance. 1 Introduction In several information retrieval tasks, such as web retrieval [15], structured document retrieval [7] and email retrieval [3], a number of approaches combine evidence from multiple resources to compute the retrieval status values. Typically the different sources of evidence include term frequencies within different fields o...

5 | Experiments with language models for known-item finding of e-mail messages
- Ogilvie, Callan
- 2005
Citation Context ...nto a maximum entropy approach. Another difference between maximum entropy and the above approaches is the estimation of the optimal weights or parameters of the ranking functions. Ogilvie and Callan [12] did not mention any optimization algorithms for finding the appropriate feature weights. Robertson et al. [14] used grid search for finding the parameters of their function. On the other hand, there ...