## The Maximum Entropy Approach and Probabilistic IR Models (1998)

Venue: | ACM TRANSACTIONS ON INFORMATION SYSTEMS |

Citations: | 12 - 0 self |

### BibTeX

@ARTICLE{Greiff98themaximum,

author = {Warren R. Greiff and Jay M. Ponte},

title = {The Maximum Entropy Approach and Probabilistic IR Models},

journal = {ACM TRANSACTIONS ON INFORMATION SYSTEMS},

year = {1998},

volume = {18},

pages = {246--287}

}

### Abstract

The Principle of Maximum Entropy is discussed and two classic probabilistic models of information retrieval, the Binary Independence Model of Robertson and Sparck Jones and the Combination Match Model of Croft and Harper, are derived using the maximum entropy approach. The assumptions on which the classical models are based are not made. In their place, the probability distribution of maximum entropy consistent with a set of constraints is determined. It is argued that this subjectivist approach is more philosophically coherent than the frequentist conceptualization of probability that is often assumed as the basis of probabilistic modeling and that this philosophical stance has important practical consequences with respect to the realization of information retrieval research.
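
To make "the probability distribution of maximum entropy consistent with a set of constraints" concrete, here is a minimal sketch based on the dice illustration quoted in the citation contexts on this page: only the average value of a die is known, and the least biased distribution over its faces is wanted. The function names (`maxent_die`, `entropy`), the bisection bounds, and the tolerance are assumptions made for illustration and are not taken from the paper. The sketch relies on the standard result that, under a single mean constraint, the maximum entropy solution has the exponential form p_i ∝ exp(λ·i), where λ is the Lagrange multiplier of the constraint.

```python
import math


def maxent_die(mean, faces=(1, 2, 3, 4, 5, 6), tol=1e-12):
    """Maximum entropy distribution over die faces with the given mean.

    The solution has the form p_i proportional to exp(lam * face_i);
    lam is the Lagrange multiplier of the mean constraint, found by bisection.
    """
    if not min(faces) < mean < max(faces):
        raise ValueError("mean must lie strictly between the smallest and largest face")

    def dist(lam):
        # Subtract the largest exponent before exponentiating to avoid overflow.
        exps = [lam * f for f in faces]
        top = max(exps)
        weights = [math.exp(e - top) for e in exps]
        total = sum(weights)
        return [w / total for w in weights]

    def mean_of(lam):
        return sum(f * p for f, p in zip(faces, dist(lam)))

    # The mean is increasing in lam, so bisection applies.
    # The bounds are an assumption; they cover means not extremely close to 1 or 6.
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_of(mid) < mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)


def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log p_i (natural logarithm)."""
    return -sum(x * math.log(x) for x in p if x > 0)


fair = maxent_die(3.5)    # an average of 3.5 yields the uniform distribution
skewed = maxent_die(4.5)  # a higher average shifts probability toward high faces
```

Ranking dice by the probability of a 4, as in the quoted illustration, would then amount to comparing `maxent_die(m)[3]` across the observed averages `m`. Any constraint beyond the mean would add further multipliers and call for a general-purpose optimizer instead of one-dimensional bisection.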

### Citations

7146 |
A mathematical theory of communication
- Shannon
- 1948
Citation Context ...m three intuitively appealing desiderata, Shannon developed a formal expression for a measure of "how much `choice' is involved in the selection of an event or of how uncertain we are of the outcome" [Sha48]. He showed that for a probability distribution, $p = (p_1, \ldots, p_k)$, over $k$ possible elementary events, the quantity: $H(p) = -\sum_{i=1}^{k} p_i \log p_i$ (2) is, within a constant factor, the unique quanti... |

751 |
Information theory and statistical mechanics
- Jaynes
- 1957
Citation Context ...Jaynes demonstrates that by viewing it as a problem of statistical inference, Statistical Mechanics can be derived without depending on "additional assumptions not contained in the laws of mechanics" [Jay57a]. His method of inference is based on what has come to be known as the Principle of Maximum Entropy. In his own words, this principle states that the maximum entropy estimate is: the least biased esti... |

658 |
Relevance weighting of search terms
- Robertson, Jones
- 1976
Citation Context ...est understood from this perspective. The main focus of the paper will be the ranking formulas corresponding to the Binary Independence Model (bim), presented originally by Robertson and Sparck Jones [RS77] and the Combination Match Model (cmm), developed shortly thereafter by Croft and Harper [CH79]. We will show how these same ranking formulas can result from a probabilistic methodology commonly known... |

409 | A statistical interpretation of term specificity and its application in retrieval
- Jones
- 1972
Citation Context ...ing retrieval performance. This formula suggests a probabilistic justification of the use of inverse document frequency for the weighting of terms, which was originally proposed by Karen Sparck Jones [Spa72]. 3 The bim-maxent Retrieval Model In this section, we derive a retrieval model based on the Principle of Maximum Entropy. The model, which we shall refer to as bim-maxent, will be constrained in such... |

256 |
The Probability Ranking Principle in IR
- Robertson
- 1977
Citation Context ... the dice. Each die has been thrown a large number of times, but the only knowledge we have of these experiments is the average value produced by each die. Following the Probability Ranking Principle [Rob77], we decide to rank the dice by the probability of their producing 4's. How are we to arrive at this probability? Of some things we feel sure. A die whose average is very close to either 1 or 6 should... |

174 |
Using probabilistic models of document retrieval without relevance information
- Croft, Harper
- 1979
Citation Context ...corresponding to the Binary Independence Model (bim), presented originally by Robertson and Sparck Jones [RS77] and the Combination Match Model (cmm), developed shortly thereafter by Croft and Harper [CH79]. We will show how these same ranking formulas can result from a probabilistic methodology commonly known as Maximum Entropy (maxent). In order to rank documents in response to a query, a probabilisti... |

153 |
Maximum Entropy Econometrics: Robust Estimation with Limited Data
- Golan, Judge, et al.
- 1996
Citation Context ...iple of maximum entropy has been applied to practical problems in diverse areas [ES88], including image reconstruction [GD78], spectral analysis [Bre88], reliability engineering [Tri69] and economics [GJM96]. In two papers in the early '80s, Cooper and Huizinga [CH82] and Cooper [Coo83], make a strong case for applying the maximum entropy approach to the problems of information retrieval. Cooper points o... |

151 |
Where do we stand on maximum entropy
- Jaynes
- 1979
Citation Context ...ed to [Fin73, Hac65] for more in depth discussions of these issues. The Principle of Maximum Entropy: At the end of the 19th century, primarily as a result of the work of Maxwell, Boltzmann and Gibbs [Jay79], the area of Statistical Mechanics was born. As a consequence, the entropy of a physical system became associated with a probability distribution of the phase space of possible atomic configurations.... |

132 | A theoretical basis for the use of co-occurrence data in information - Rijsbergen - 1977 |

124 | Logic of Statistical Inference - Hacking - 1965 |

118 | Probability and the Weighing of Evidence - GOOD - 1950 |

66 |
Image reconstruction from incomplete and noisy data
- Gull, Daniell
- 1978
Citation Context ...nd Probabilistic IR Modeling Since the publication of Jaynes' articles, the principle of maximum entropy has been applied to practical problems in diverse areas [ES88], including image reconstruction [GD78], spectral analysis [Bre88], reliability engineering [Tri69] and economics [GJM96]. In two papers in the early '80s, Cooper and Huizinga [CH82] and Cooper [Coo83], make a strong case for applying the ... |

65 |
Rational Descriptions, Decisions and Designs
- Tribus
- 1969
Citation Context ...s' articles, the principle of maximum entropy has been applied to practical problems in diverse areas [ES88], including image reconstruction [GD78], spectral analysis [Bre88], reliability engineering [Tri69] and economics [GJM96]. In two papers in the early '80s, Cooper and Huizinga [CH82] and Cooper [Coo83], make a strong case for applying the maximum entropy approach to the problems of information retr... |

61 | The retrieval effects of query expansion on a feedback document retrieval system - Smeaton, Rijsbergen - 1983 |

57 | Evaluation of feedback in document retrieval using co-occurrence data - Harper, Rijsbergen - 1978 |

50 | Theories of Probability: An Examination of Foundations - Fine - 1973 |

48 | Probability forecasting - Dawid - 1986 |

47 |
Fundamental Methods of Mathematical Economics
- Chiang
- 1984
Citation Context ...$p(R = 1) = \ldots$ iff $p(X_i = 1 \mid R = 1) = \ldots$ probability of an arbitrary event: To maximize the entropy subject to these constraints, we apply the Lagrange method of undetermined multipliers [Chi67]. Introducing the multipliers ..., the problem of maximizing H in conformance with the constraints, (18)--(20), is transformed into the maximization o... |

35 | The comparison and evaluation of forecasters - DeGroot, Fienberg - 1983 |

16 |
The maximum entropy principle and its application to the design of probabilistic retrieval systems
- Cooper, Huizinga
- 1982
Citation Context ... in diverse areas [ES88], including image reconstruction [GD78], spectral analysis [Bre88], reliability engineering [Tri69] and economics [GJM96]. In two papers in the early '80s, Cooper and Huizinga [CH82] and Cooper [Coo83], make a strong case for applying the maximum entropy approach to the problems of information retrieval. Cooper points out that, "A common criticism of most probabilistic approaches... |

16 | Corrigendum: Weight of evidence, corroboration, explanatory power, information and the utility of experiments - Good - 1968 |

15 |
Exploiting the maximum entropy principle to increase retrieval effectiveness
- Cooper
- 1983
Citation Context ...[ES88], including image reconstruction [GD78], spectral analysis [Bre88], reliability engineering [Tri69] and economics [GJM96]. In two papers in the early '80s, Cooper and Huizinga [CH82] and Cooper [Coo83], make a strong case for applying the maximum entropy approach to the problems of information retrieval. Cooper points out that, "A common criticism of most probabilistic approaches to information ret... |

14 |
Automatic indexing using term discrimination and term precision measurements
- SALTON, WONG, et al.
- 1976
Citation Context ...hese papers, firm first steps are taken in the direction of applying maximum entropy to information retrieval. The maximum entropy approach is used to incorporate the idea of term precision weighting [SWY76] in a probabilistic context. They show how probability-of-relevance computations based on maxent result in an expressive request language combining the capabilities of both Boolean and "weighted-reque... |

12 | The maximum entropy principle in information retrieval - Kantor, Lee - 1986 |

10 | Maximum entropy and the optimal design of automated information retrieval systems - Kantor - 1984 |

9 | Decision Making and Forecasting, with Emphasis on Model Building and Policy Analysis - Marshall, Oliver - 1995 |

8 |
Excerpts from Bayesian Spectrum Analysis and Parameter Estimation
- Bretthorst
- 1988
Citation Context ...ng Since the publication of Jaynes' articles, the principle of maximum entropy has been applied to practical problems in diverse areas [ES88], including image reconstruction [GD78], spectral analysis [Bre88], reliability engineering [Tri69] and economics [GJM96]. In two papers in the early '80s, Cooper and Huizinga [CH82] and Cooper [Coo83], make a strong case for applying the maximum entropy approach to... |

7 |
Thirty years of information theory
- Tribus
- 1979
Citation Context ...oltzmann's H-theorem. In 1957, Edwin Jaynes "converted Shannon's measure to a powerful instrument for the generation of statistical hypotheses and . . . applied it as a tool in statistical inference" [Tri79]. In a pair of seminal articles, [Jay57a, Jay57b], Jaynes demonstrates that by viewing it as a problem of statistical inference, Statistical Mechanics can be derived without depending on "additional a... |

2 |
Some inconsistencies and misnomers in probabilistic information retrieval
- Cooper
- 1991
Citation Context ...$\ldots, x_s) \in \{0, 1\}^s$: $\frac{p(x_1, \ldots, x_s \mid rel)}{p(x_1, \ldots, x_s \mid \overline{rel})} = \prod_{i=1}^{s} \frac{p(x_i \mid rel)}{p(x_i \mid \overline{rel})}$ (6) William Cooper later emphasized that equation (6) is all that really needs to be assumed [Coo91]. This "linked dependence assumption" is weaker than the pair of conditional independence assumptions, (4) and (5), and is a fairer statement of the properties that need be assumed to hold, in orde... |

1 | Probability theory: The logic of science. available via ftp://bayes.wustl.edu/pub/Jaynes/book.probability.theory - Jaynes - 1994 |

1 |
A study of probabilistic information retrieval in the case of inconsistent expert judgments
- Lee, Kantor
- 1991
Citation Context ...bilities of both Boolean and "weighted-request" retrieval systems. In [Kan84, KL86], Kantor and Lee extend the analysis of the Principle of Maximum Entropy in the context of information retrieval. In [LK91] they explore the use of maximum entropy to resolve user estimates of conditional relevance probabilities that may be inconsistent with available term occurrence data. Very recently, [KL98], they have... |