Exploiting headword dependency and predictive clustering for language modeling (2002)

by Jianfeng Gao
Venue: EMNLP
Citations: 7 (5 self)

BibTeX

@INPROCEEDINGS{Gao02exploitingheadword,
    author = {Jianfeng Gao},
    title = {Exploiting headword dependency and predictive clustering for language modeling},
    booktitle = {EMNLP},
    year = {2002},
    pages = {248--256}
}


Abstract

This paper presents several practical ways of incorporating linguistic structure into language models. A headword detector is first applied to detect the headword of each phrase in a sentence. A permuted headword trigram model (PHTM) is then generated from the annotated corpus. Finally, PHTM is extended to a cluster PHTM (C-PHTM) by defining clusters for similar words in the corpus. We evaluated the proposed models on a realistic application, Japanese Kana-Kanji conversion. Experiments show that C-PHTM achieves a 15% error rate reduction over the word trigram model. This
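
The abstract does not give the model equations, so the following is only a rough sketch of the general idea behind predictive clustering as it is usually described in the language modeling literature: first predict the cluster of the next word, then predict the word within that cluster. It is not the paper's C-PHTM. The toy corpus, the `cluster_of` map, the one-word history, and the unsmoothed counts are all assumptions made for illustration.

```python
from collections import Counter

# Toy corpus and a hypothetical hard word-to-cluster map (invented for illustration).
corpus = "the cat sat on the mat the dog sat on the rug".split()
cluster_of = {"the": "DET", "cat": "ANIMAL", "dog": "ANIMAL",
              "sat": "VERB", "on": "PREP", "mat": "THING", "rug": "THING"}

# Count (history, word) pairs using a one-word history for brevity.
bigrams = list(zip(corpus, corpus[1:]))
hist_count = Counter(h for h, _ in bigrams)
hist_word = Counter(bigrams)
hist_cluster = Counter((h, cluster_of[w]) for h, w in bigrams)

def p_cluster(c, h):
    """Unsmoothed estimate of P(cluster | history)."""
    return hist_cluster[(h, c)] / hist_count[h] if hist_count[h] else 0.0

def p_word_given_cluster(w, h):
    """Unsmoothed estimate of P(word | history, cluster of word)."""
    denom = hist_cluster[(h, cluster_of[w])]
    return hist_word[(h, w)] / denom if denom else 0.0

def p_predictive(w, h):
    """P(w | h) factored as P(cluster(w) | h) * P(w | h, cluster(w))."""
    return p_cluster(cluster_of[w], h) * p_word_given_cluster(w, h)
```

On this toy data, `p_predictive("cat", "the")` is 0.25, the same as the plain relative frequency of "cat" after "the". With hard clusters and unsmoothed counts the factorization changes nothing; any benefit would come from smoothing the two factors separately, which this sketch leaves out.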
