Hierarchical mixtures of experts and the EM algorithm (1994)

by Michael I. Jordan
Venue: Neural Computation
Citations: 634 - 19 self

BibTeX

@ARTICLE{Jordan94hierarchicalmixtures,
    author = {Michael I. Jordan},
    title = {Hierarchical mixtures of experts and the EM algorithm},
    journal = {Neural Computation},
    year = {1994},
    volume = {6},
    pages = {181--214}
}

Abstract

We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
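
The architecture described in the abstract can be illustrated with a small sketch. The code below is not taken from the paper; it assumes a two-level tree, softmax (multinomial logit) gating networks, and linear experts with unit-variance Gaussian noise, which is one possible choice of GLIMs. The function names `hme_predict` and `hme_posteriors` and the nested-list weight layout are hypothetical. The first function computes the mixture mean; the second computes the posterior responsibilities that an E-step would use.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def gaussian_pdf(y, mean, sigma=1.0):
    """Density of N(mean, sigma^2) at y (assumed expert noise model)."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def hme_predict(x, top_gate, sub_gates, experts):
    """Mean output of a two-level hierarchical mixture for input x.

    top_gate[i]     : weights of the top-level gate for branch i
    sub_gates[i][j] : weights of the gate inside branch i for expert j
    experts[i][j]   : weights of the linear expert (i, j)
    """
    g = softmax([dot(v, x) for v in top_gate])
    y = 0.0
    for i, g_i in enumerate(g):
        g_sub = softmax([dot(v, x) for v in sub_gates[i]])
        for j, g_ij in enumerate(g_sub):
            y += g_i * g_ij * dot(experts[i][j], x)
    return y

def hme_posteriors(x, y, top_gate, sub_gates, experts, sigma=1.0):
    """E-step quantity: posterior probability h[i][j] that expert (i, j)
    generated the target y, given input x and the current parameters."""
    g = softmax([dot(v, x) for v in top_gate])
    joint = []
    for i, g_i in enumerate(g):
        g_sub = softmax([dot(v, x) for v in sub_gates[i]])
        joint.append([
            g_i * g_ij * gaussian_pdf(y, dot(experts[i][j], x), sigma)
            for j, g_ij in enumerate(g_sub)
        ])
    # Normalise over all leaves; assumes at least one leaf has nonzero density.
    total = sum(sum(row) for row in joint)
    return [[p / total for p in row] for row in joint]
```

In a full EM loop, the M-step would refit each expert and each gating network by weighted maximum likelihood, using these posteriors as the weights. That step is omitted here.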
