
A Stochastic Memoizer for Sequence Data

by Frank Wood, Cédric Archambeau, Lancelot James, Yee Whye Teh

BibTeX

@MISC{Wood_astochastic,
    author = {Frank Wood and Cédric Archambeau and Lancelot James and Yee Whye Teh},
    title = {A Stochastic Memoizer for Sequence Data},
    year = {}
}


Abstract

We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares statistical strength between subsequent symbol predictive distributions in such a way that predictive performance generalizes well. The model builds on a specific parameterization of an unbounded-depth hierarchical Pitman-Yor process. We introduce analytic marginalization steps (using coagulation operators) to reduce this model to one that can be represented in time and space linear in the length of the training sequence. We show how to perform inference in such a model without truncation approximation and introduce fragmentation operators necessary to do predictive inference. We demonstrate the sequence memoizer by using it as a language model, achieving state-of-the-art results.
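
As a rough illustration of the building block named in the abstract, the sketch below simulates the standard Chinese restaurant construction of a single Pitman-Yor process and shows how discounts might combine when intermediate levels of a hierarchy are marginalized out. It is not the authors' implementation. The function names are hypothetical, and the product rule in `coagulated_discount` assumes that every level uses concentration zero, which is the setting in which coagulation of Pitman-Yor processes is known to stay in the same family; the abstract itself does not spell out the parameterization.

```python
import random


def py_seating_probs(table_sizes, discount, concentration):
    """Seating probabilities for the next customer in a Pitman-Yor
    Chinese restaurant. One entry per existing table, followed by a
    final entry for opening a new table."""
    n = sum(table_sizes)
    k = len(table_sizes)
    if n == 0:
        # The first customer always opens a new table.
        return [1.0]
    denom = concentration + n
    probs = [(size - discount) / denom for size in table_sizes]
    probs.append((concentration + discount * k) / denom)
    return probs


def sample_py_crp(n_customers, discount, concentration, seed=0):
    """Seat customers one at a time and return the final table sizes."""
    rng = random.Random(seed)
    tables = []
    for _ in range(n_customers):
        probs = py_seating_probs(tables, discount, concentration)
        choice = rng.choices(range(len(probs)), weights=probs)[0]
        if choice == len(tables):
            tables.append(1)      # open a new table
        else:
            tables[choice] += 1   # join an existing table
    return tables


def coagulated_discount(discounts):
    """Assumption: with concentration 0 at every level, marginalizing a
    chain of Pitman-Yor levels yields one level whose discount is the
    product of the discounts along the chain."""
    product = 1.0
    for d in discounts:
        product *= d
    return product


if __name__ == "__main__":
    print(py_seating_probs([3, 1], discount=0.5, concentration=0.0))
    print(sample_py_crp(50, discount=0.5, concentration=0.0))
    print(coagulated_discount([0.5, 0.5]))
```

Under that assumption, a non-branching chain of contexts can be replaced by a single node with the combined discount, which is one way to read the abstract's claim of a representation linear in the length of the training sequence.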
