## Nonlinear Independent Factor Analysis by Hierarchical Models (2003)

Venue: | in Proc. 4th Int. Symp. on Independent Component Analysis and Blind Signal Separation (ICA2003) |

Citations: | 25 - 13 self |

### BibTeX

@INPROCEEDINGS{Valpola03nonlinearindependent,

author = {Harri Valpola and Tomas Östman and Juha Karhunen},

title = {Nonlinear Independent Factor Analysis by Hierarchical Models},

booktitle = {in Proc. 4th Int. Symp. on Independent Component Analysis and Blind Signal Separation (ICA2003)},

year = {2003},

pages = {257--262}

}

### Abstract

The building blocks introduced earlier by us in [1] are used for constructing a hierarchical nonlinear model for nonlinear factor analysis. We call the resulting method hierarchical nonlinear factor analysis (HNFA). The variational Bayesian learning algorithm used in this method has a linear computational complexity, and it is able to infer the structure of the model in addition to estimating the unknown parameters. We show how nonlinear mixtures can be separated by first estimating a nonlinear subspace using HNFA and then rotating the subspace using linear independent component analysis. Experimental results show that the cost function minimised during learning predicts well the quality of the estimated subspace.
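For orientation, the quantities the abstract refers to can be written out. The mixing model (1) and the factorial posterior approximation (2) are taken from the citation contexts excerpted on this page. The cost function is the standard ensemble-learning form and is an assumption here, since the paper's own equation (3) is not reproduced on this page. The symbols $X$ (the full set of observations) and $\mathcal{C}$ are notation chosen for this sketch.

```latex
% Generative model, eq. (1): observations are a nonlinear mixture of the sources plus additive noise
x(t) = f[s(t)] + n(t)

% Factorial posterior approximation over all unknown variables, eq. (2)
q(\theta) = \prod_i q_i(\theta_i)

% Standard ensemble-learning cost (assumed form; not quoted from the paper)
\mathcal{C}(q) = \int q(\theta) \ln \frac{q(\theta)}{p(X, \theta)} \, d\theta
               = D_{\mathrm{KL}}\bigl( q(\theta) \,\|\, p(\theta \mid X) \bigr) - \ln p(X)
```

Since $-\ln p(X)$ does not depend on $q$, minimising $\mathcal{C}$ over $q$ minimises the Kullback-Leibler misfit to the true posterior. The attained value is also an upper bound on $-\ln p(X)$, which is what makes a cost of this form usable for comparing model structures.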

### Citations

1675 | Independent Component Analysis
- Hyvärinen, Karhunen, et al.
- 2001
Citation Context ...ION Blind separation of sources from their nonlinear mixtures, known as nonlinear blind source separation (BSS), is generally a very difficult problem, from both theoretical and practical point of view [2, 3]. The task is to extract the sources s(t) that have generated the observations x(t) through a nonlinear mapping f(·): x(t) = f[s(t)] + n(t), (1) where n(t) is additive noise. Theoretically, the task ...

235 | Independent factor analysis
- Attias
- 1999
Citation Context ...common technique is ensemble learning [7, 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15, 5, 9], as well as to a wide variety of other models. In ensemble learning, the posterior approximation q(θ) of the unknown variables θ is required to have a suitably ...

89 | Nonlinear independent component analysis: Existence and uniqueness results
- Hyvärinen, Pajunen
- 1999

89 | An unsupervised ensemble learning method for nonlinear dynamic state-space models
- Valpola, Karhunen
Citation Context ...are based on approximating the true posterior probability density of the unknown variables of the model by a function with a restricted form. Currently the most common technique is ensemble learning [7, 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15...

63 | Ensemble learning
- Lappalainen, Miskin
- 2000
Citation Context ...are based on approximating the true posterior probability density of the unknown variables of the model by a function with a restricted form. Currently the most common technique is ensemble learning [7, 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15...

58 | Bayesian nonlinear independent component analysis by multi-layer perceptrons
- Lappalainen, Honkela
- 2000
Citation Context ...hat can be separated in practice to be quite small in many instances. Existing nonlinear BSS methods have been reviewed in Chapter 17 of [3] and in [4]. The method introduced in this paper stems from [5], where a variational Bayesian learning method called ensemble learning was used to estimate the generative nonlinear mixture model (1). In this paper we study the approach outlined in [1]. We constru...

45 | Ensemble learning for independent component analysis
- Lappalainen
- 1999
Citation Context ...common technique is ensemble learning [7, 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15, 5, 9], as well as to a wide variety of other models. In ensemble learning, the posterior approximation q(θ) of the unknown variables θ is required to have a suitably ...

43 | Ensemble learning for blind image separation and deconvolution
- Miskin, MacKay
- 2000
Citation Context ...common technique is ensemble learning [7, 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15, 5, 9], as well as to a wide variety of other models. In ensemble learning, the posterior approximation q(θ) of the unknown variables θ is required to have a suitably ...

31 | Advances in nonlinear blind source separation
- Jutten, Karhunen
Citation Context ...issues. They have restricted the number of sources that can be separated in practice to be quite small in many instances. Existing nonlinear BSS methods have been reviewed in Chapter 17 of [3] and in [4]. The method introduced in this paper stems from [5], where a variational Bayesian learning method called ensemble learning was used to estimate the generative nonlinear mixture model (1). In this pap...

29 | Building blocks for hierarchical latent variable models
- Valpola, Raiko, et al.
- 2001
Citation Context ...ural Networks Research Centre P.O. Box 5400, FIN-02015 HUT, Espoo, Finland firstname.lastname@hut.fi http://www.cis.hut.fi/projects/ica/bayes/ ABSTRACT The building blocks introduced earlier by us in [1] are used for constructing a hierarchical nonlinear model for nonlinear factor analysis. We call the resulting method hierarchical nonlinear factor analysis (HNFA). The variational Bayesian learning a...

24 | An ensemble learning approach to independent component analysis
- Choudrey, Penny, et al.
- 2000

23 | Ensemble Learning in Bayesian Neural Networks
- Barber, Bishop
- 1998
Citation Context ...are based on approximating the true posterior probability density of the unknown variables of the model by a function with a restricted form. Currently the most common technique is ensemble learning [7, 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15...

19 | Variational learning of clusters of undercomplete nonsymmetric independent components
- Chan, Lee, et al.
- 2001
Citation Context ..., 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15, 5, 9], as well as to a wide variety of other models. In ensemble learning, the posterior approximation q(θ) of the unknown variables θ is required to have a suitably factorial form $q(\theta) = \prod_i q_i(\theta_i)$, (2) ...

6 | graphical models and variational methods. In Independent component analysis: principles and practice
- Attias
Citation Context ..., 8, 9] where the Kullback-Leibler divergence measures the misfit between the approximation and the true posterior. It has been applied to standard ICA and BSS [10, 11, 12, 13] and to their extensions [14, 15, 5, 9], as well as to a wide variety of other models. In ensemble learning, the posterior approximation q(θ) of the unknown variables θ is required to have a suitably factorial form $q(\theta) = \prod_i q_i(\theta_i)$, (2) ...

6 | Speeding up cyclic update schemes by pattern searches
- Honkela
- 2002
Citation Context ...e by minimising (4). In addition, several other operations are performed: • addition of hidden nodes; • addition of weights; • pruning of weights; and • line search. Line search has been explained in [16]. The idea is to monitor the individual updates during one iteration and then perform a line search simultaneously for all q_i(θ_i). We applied the line search after every tenth iteration. The addition ...

1 | On the effect of the form of the posterior approximation
- Ilin, Valpola
- 2003
Citation Context ...resented in [1]. of increasing the misfit between the approximated and the true posterior density. Minimisation of the cost function (3) favours solutions where the misfit is as small as possible. In [6], it is shown how this can lead to suboptimal separation in linear ICA. It is difficult to analyse the situation in linear models mathematically, but it seems that models with fewer hidden nodes and t...