## Global models of document structure using latent permutations (2009)

Venue: NAACL’09

Citations: 18 (4 self)

@INPROCEEDINGS{Chen09globalmodels,

author = {Harr Chen and S. R. K. Branavan and Regina Barzilay and David R. Karger},

title = {Global models of document structure using latent permutations},

booktitle = {In NAACL’09},

year = {2009},

pages = {371--379}

}

We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be elegantly represented using a distribution over permutations called the generalized Mallows model. Our structure-aware approach substantially outperforms alternative approaches for cross-document comparison and single-document segmentation.
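The abstract names the generalized Mallows model but does not spell it out. As a rough illustration, the sketch below uses the standard inversion-count form of that distribution: a permutation of K topics is encoded by counts v_1, ..., v_{K-1}, where v_j is the number of topics greater than j that appear before j, and each count is penalized independently by exp(-rho_j * v_j). The function names, the parameter name `rho`, and the 1..K topic labels are assumptions made for this example; this is not the authors' implementation.

```python
import math
import random


def inversion_counts(perm):
    """Return v, where v[j-1] counts items greater than j that precede j.

    `perm` is a permutation of 1..K; the result has K-1 entries.
    """
    K = len(perm)
    pos = {item: i for i, item in enumerate(perm)}
    return [
        sum(1 for big in range(j + 1, K + 1) if pos[big] < pos[j])
        for j in range(1, K)
    ]


def permutation_from_counts(v):
    """Invert inversion_counts: rebuild the permutation of 1..K from v."""
    K = len(v) + 1
    perm = [K]
    # Insert items from largest to smallest; when j is inserted, the list
    # holds only larger items, so index v[j-1] puts exactly v[j-1] before it.
    for j in range(K - 1, 0, -1):
        perm.insert(v[j - 1], j)
    return perm


def gmm_log_prob(v, rho):
    """Log-probability of inversion counts v under dispersion parameters rho."""
    K = len(v) + 1
    total = 0.0
    for j, (vj, rj) in enumerate(zip(v, rho), start=1):
        # v_j ranges over 0..K-j, so the normalizer is a finite sum.
        log_norm = math.log(sum(math.exp(-rj * r) for r in range(K - j + 1)))
        total += -rj * vj - log_norm
    return total


def sample_gmm(rho, rng):
    """Draw inversion counts; each v_j is sampled independently."""
    K = len(rho) + 1
    v = []
    for j, rj in enumerate(rho, start=1):
        values = range(K - j + 1)
        weights = [math.exp(-rj * r) for r in values]
        v.append(rng.choices(values, weights=weights)[0])
    return v


if __name__ == "__main__":
    rng = random.Random(0)
    rho = [2.0, 2.0, 2.0]  # hypothetical dispersion values for K = 4 topics
    v = sample_gmm(rho, rng)
    print(permutation_from_counts(v), gmm_log_prob(v, rho))
```

With positive `rho` values, the identity ordering (all counts zero) is the most probable, and larger values concentrate mass more tightly around it, which is one way to read the abstract's claim that orderings are biased to be similar across related documents.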

