## A Bradley-Terry Artificial Neural Network Model for Individual Ratings in Group Competitions (2006)

Citations: 4 (0 self)

### BibTeX

@MISC{Menke06abradley-terry,
author = {Joshua E. Menke and Tony R. Martinez},
title = {A Bradley-Terry Artificial Neural Network Model for Individual Ratings in Group Competitions},
year = {2006}
}

### Abstract

A common statistical model for paired comparisons is the Bradley-Terry model. This research re-parameterizes the Bradley-Terry model as a single-layer artificial neural network (ANN) and shows how it can be fitted using the delta rule. The ANN model is appealing because it makes using and extending the Bradley-Terry model accessible to a broader community. It also leads to natural incremental and iterative updating methods. Several extensions are presented that allow the ANN model to learn to predict the outcome of complex, uneven two-team group competitions by rating individual players; no other published model currently does this. An incremental-learning Bradley-Terry ANN yields a probability estimate within 5% of the actual value when trained on 3,379 multiplayer online matches of a popular team- and objective-based first-person shooter.

Keywords: Bradley-Terry model, paired comparisons, neural networks, delta rule, probability estimates
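
The delta-rule fit described in the abstract can be illustrated with a minimal Python sketch. It covers only the basic two-subject case, not the team extensions, and it is not the authors' implementation. The function names, the learning rate, and the epoch count are assumptions chosen for illustration. Each subject gets one weight, the network input is +1 for the first subject and −1 for the second, and the update omits the sigmoid's derivative, as the paper's discussion of its equation (5) indicates.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function used as the single output unit."""
    return 1.0 / (1.0 + math.exp(-z))


def train_bradley_terry_ann(matches, n_players, lr=0.1, epochs=200):
    """Fit one rating weight per player with the delta rule.

    matches: list of (winner, loser) index pairs (hypothetical data layout).
    lr, epochs: illustrative hyperparameters, not values from the paper.
    """
    w = [0.0] * n_players
    for _ in range(epochs):
        for winner, loser in matches:
            # Output = Pr(winner defeats loser) = sigmoid(w_winner - w_loser)
            output = sigmoid(w[winner] - w[loser])
            # The target is 1 because the first subject actually won.
            error = 1.0 - output
            # Delta rule without the sigmoid derivative; inputs are +1 and -1.
            w[winner] += lr * error
            w[loser] -= lr * error
    return w


if __name__ == "__main__":
    # Player 0 beats player 1 in three of four games.
    games = [(0, 1), (0, 1), (0, 1), (1, 0)]
    weights = train_bradley_terry_ann(games, n_players=2, lr=0.05, epochs=500)
    print("ratings:", weights)
    print("Pr(0 defeats 1):", sigmoid(weights[0] - weights[1]))
```

Because the updates are applied one match at a time, the same loop can be run incrementally as new results arrive, which matches the incremental updating the abstract mentions.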

### Citations

339 | Connectionist Learning Procedures
- Hinton
Citation Context ...in (5) does not include the derivative of the sigmoid. The sigmoid’s derivative, namely Output(1 − Output), is also the inverse of the derivative of a different objective function called cross-entropy [13, 14, 15]. Therefore, this version of the delta rule is minimizing the cross-entropy instead of the squared-error. The cross-entropy of a model is also known as the negative log-likelihood. This is appealing b...

275 | Classification by pairwise coupling
- Hastie, Tibshirani
- 1996
Citation Context ...roduction The Bradley-Terry model is well known for its use in statistics for paired comparisons [1, 2, 3, 4]. It has also been applied in machine learning to obtain multi-class probability estimates [5, 6, 7]. The original model states that: $\Pr(A \text{ defeats } B) = \frac{\lambda_A}{\lambda_A + \lambda_B}$ (1), where $\lambda_A$ and $\lambda_B$ are both positive and subjects A and B compete in a paired-comparison. $\lambda_A$ and $\lambda_B$ represent the strengths of subje...

173 | The rank analysis of incomplete block designs, I: The method of paired comparisons
- Bradley, Terry
- 1952
Citation Context ...ords: Bradley-Terry model, paired comparisons, neural networks, delta rule, probability estimates. 1 Introduction. The Bradley-Terry model is well known for its use in statistics for paired comparisons [1, 2, 3, 4]. It has also been applied in machine learning to obtain multi-class probability estimates [5, 6, 7]. The original model states that: $\Pr(A \text{ defeats } B) = \frac{\lambda_A}{\lambda_A + \lambda_B}$ (1), where $\lambda_A$ and $\lambda_B$ are both posi...

95 | The method of paired comparisons
- David
- 1988
Citation Context ...ords: Bradley-Terry model, paired comparisons, neural networks, delta rule, probability estimates. 1 Introduction. The Bradley-Terry model is well known for its use in statistics for paired comparisons [1, 2, 3, 4]. It has also been applied in machine learning to obtain multi-class probability estimates [5, 6, 7]. The original model states that: $\Pr(A \text{ defeats } B) = \frac{\lambda_A}{\lambda_A + \lambda_B}$ (1), where $\lambda_A$ and $\lambda_B$ are both posi...

54 | Equivalence proofs for multi-layer perceptron classifiers and the Bayes discriminant function
- Hampshire, Perlmutter
- 1990
Citation Context ...in (5) does not include the derivative of the sigmoid. The sigmoid’s derivative, namely Output(1 − Output), is also the inverse of the derivative of a different objective function called cross-entropy [13, 14, 15]. Therefore, this version of the delta rule is minimizing the cross-entropy instead of the squared-error. The cross-entropy of a model is also known as the negative log-likelihood. This is appealing b...

48 | Parameter estimation in large dynamic paired comparison experiments
- Glickman
- 1999
Citation Context ...ndividual’s true rating is actually significantly higher or lower than average. Therefore, it would be appropriate to include the concept of uncertainty or variance in an individual’s rating. Glickman [8] derived both a likelihood-based method and a regular-updating method (like Elo’s) for modeling this type of uncertainty in Bradley-Terry models. However, his methods did not account for finding indiv...

48 | Improving the convergence of the backpropagation algorithm
- Ooyen, Nienhuis
- 1992
Citation Context ...in (5) does not include the derivative of the sigmoid. The sigmoid’s derivative, namely Output(1 − Output), is also the inverse of the derivative of a different objective function called cross-entropy [13, 14, 15]. Therefore, this version of the delta rule is minimizing the cross-entropy instead of the squared-error. The cross-entropy of a model is also known as the negative log-likelihood. This is appealing b...

29 | MM algorithms for generalized Bradley-Terry models
- Hunter
- 2004
Citation Context ...ords: Bradley-Terry model, paired comparisons, neural networks, delta rule, probability estimates. 1 Introduction. The Bradley-Terry model is well known for its use in statistics for paired comparisons [1, 2, 3, 4]. It has also been applied in machine learning to obtain multi-class probability estimates [5, 6, 7]. The original model states that: $\Pr(A \text{ defeats } B) = \frac{\lambda_A}{\lambda_A + \lambda_B}$ (1), where $\lambda_A$ and $\lambda_B$ are both posi...

28 | An extended transmission/disequilibrium test (TDT) for multi-allele marker loci. Ann Human Genet 59:323–336
- Sham, Curtis
- 1995
Citation Context ... estimates from binary classifiers in machine learning [5, 6, 7], market predictions for economics, biostatistics [1], bibliometrics or deciding which publications are more significant [10], genetics [11], taste-testing, military encounters, and any other prediction that requires the probability of one subject being more desirable than another. In addition to making the usage of the Bradley-Terry mode...

26 | Generalized Bradley-Terry models and multi-class probability estimates
- Huang, Weng, et al.
Citation Context ...roduction The Bradley-Terry model is well known for its use in statistics for paired comparisons [1, 2, 3, 4]. It has also been applied in machine learning to obtain multi-class probability estimates [5, 6, 7]. The original model states that: $\Pr(A \text{ defeats } B) = \frac{\lambda_A}{\lambda_A + \lambda_B}$ (1), where $\lambda_A$ and $\lambda_B$ are both positive and subjects A and B compete in a paired-comparison. $\lambda_A$ and $\lambda_B$ represent the strengths of subje...

17 | Reducing multiclass to binary by coupling probability estimates
- Zadrozny
- 2001
Citation Context ...roduction The Bradley-Terry model is well known for its use in statistics for paired comparisons [1, 2, 3, 4]. It has also been applied in machine learning to obtain multi-class probability estimates [5, 6, 7]. The original model states that: $\Pr(A \text{ defeats } B) = \frac{\lambda_A}{\lambda_A + \lambda_B}$ (1), where $\lambda_A$ and $\lambda_B$ are both positive and subjects A and B compete in a paired-comparison. $\lambda_A$ and $\lambda_B$ represent the strengths of subje...

11 | Citation patterns in the journals of statistics and probability
- Stigler
- 1994
Citation Context ...ass probability estimates from binary classifiers in machine learning [5, 6, 7], market predictions for economics, biostatistics [1], bibliometrics or deciding which publications are more significant [10], genetics [11], taste-testing, military encounters, and any other prediction that requires the probability of one subject being more desirable than another. In addition to making the usage of the Bra...

6 | A bibliography on the method of paired comparisons
- Davidson, Farquhar
- 1976

4 | A Comprehensive Guide to Chess Ratings
- Glickman
- 1995
Citation Context ...or determining the ratings of over 4,000 players over 3,379 competitions. Elo suggested an efficient approach to update the ratings of thousands of players competing in thousands of chess tournaments [9]. His method re-parameterizes the Bradley-Terry model by setting $\lambda_A = 10^{\theta_A/400}$, yielding: $\Pr(A \text{ defeats } B) = \frac{1}{1 + 10^{-(\theta_A - \theta_B)/400}}$ (2). The scale specific parameters are historical only and can be changed to ...

3 | Hierarchical Models for Permutations: Analysis of Auto Racing Results
- Graves, S, et al.
- 2003
Citation Context ...s can be non-zero for a given comparison; in other words, only two subjects are compared at a time. There do exist extensions to the Bradley-Terry model allowing for more than one comparison at a time [4, 12], but that is beyond the scope of this work. The equation for the model can be written as: $\text{Output} = \Pr(A \text{ defeats } B) = \frac{1}{1 + e^{-(w_A - w_B)}}$ (4), which is the same as (3), except $w_A$ and $w_B$ are substituted f...
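
The citation contexts above quote three forms of the same model: the original Bradley-Terry ratio (1), Elo's base-10 form (2), and the sigmoid output (4). The short Python sketch below is an illustrative numerical check, not code from the paper. It assumes the substitutions $\lambda = 10^{\theta/400}$ and $\lambda = e^{w}$, which make the three forms agree, and it confirms by finite differences that the cross-entropy gradient with respect to $w_A$ is $-(\text{target} - \text{Output})$, which is why the delta rule can drop the sigmoid's derivative. All function names are hypothetical.

```python
import math


def bt_prob(lam_a: float, lam_b: float) -> float:
    """Original Bradley-Terry form (1): lambda_A / (lambda_A + lambda_B)."""
    return lam_a / (lam_a + lam_b)


def elo_prob(theta_a: float, theta_b: float) -> float:
    """Elo's re-parameterization (2) on the base-10, 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-(theta_a - theta_b) / 400.0))


def ann_output(w_a: float, w_b: float) -> float:
    """Single-layer ANN output (4): sigmoid of the weight difference."""
    return 1.0 / (1.0 + math.exp(-(w_a - w_b)))


def cross_entropy(target: float, output: float) -> float:
    """Negative log-likelihood of one comparison."""
    return -(target * math.log(output) + (1.0 - target) * math.log(1.0 - output))


def numeric_grad_wa(w_a: float, w_b: float, target: float, eps: float = 1e-6) -> float:
    """Central-difference estimate of d(cross-entropy)/d(w_A)."""
    hi = cross_entropy(target, ann_output(w_a + eps, w_b))
    lo = cross_entropy(target, ann_output(w_a - eps, w_b))
    return (hi - lo) / (2.0 * eps)


if __name__ == "__main__":
    # Forms (1) and (2) agree when lambda = 10 ** (theta / 400).
    print(elo_prob(1600, 1400), bt_prob(10 ** (1600 / 400), 10 ** (1400 / 400)))
    # Forms (1) and (4) agree when lambda = exp(w).
    print(ann_output(0.8, 0.3), bt_prob(math.exp(0.8), math.exp(0.3)))
    # The cross-entropy gradient equals -(target - Output), with no sigmoid derivative.
    print(numeric_grad_wa(0.8, 0.3, 1.0), -(1.0 - ann_output(0.8, 0.3)))
```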