Results 11  20
of
1,643
Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
 IEEE Trans. On Audio, Speech and Lang. Processing
, 2007
"... Abstract—An unsupervised learning algorithm for the separation of sound sources in onechannel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a timevarying gain ..."
Abstract

Cited by 94 (11 self)
 Add to MetaCart
Abstract—An unsupervised learning algorithm for the separation of sound sources in onechannel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a timevarying gain. Each sound source, in turn, is modeled as a sum of one or more components. The parameters of the components are estimated by minimizing the reconstruction error between the input spectrogram and the model, while restricting the component spectrograms to be nonnegative and favoring components whose gains are slowly varying and sparse. Temporal continuity is favored by using a cost term which is the sum of squared differences between the gains in adjacent frames, and sparseness is favored by penalizing nonzero gains. The proposed iterative estimation algorithm is initialized with random values, and the gains and the spectra are then alternatively updated using multiplicative update rules until the values converge. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and drum sounds. The performance of the proposed method was compared with independent subspace analysis and basic nonnegative matrix factorization, which are based on the same linear model. According to these simulations, the proposed method enables a better separation quality than the previous algorithms. Especially, the temporal continuity criterion improved the detection of pitched musical sounds. The sparseness criterion did not produce significant improvements. Index Terms—Acoustic signal analysis, audio source separation, blind source separation, music, nonnegative matrix factorization, sparse coding, unsupervised learning. I.
A Survey of Dimension Reduction Techniques
, 2002
"... this paper, we assume that we have n observations, each being a realization of the p dimensional random variable x = (x 1 , . . . , x p ) with mean E(x) = = ( 1 , . . . , p ) and covariance matrix E{(x )(x = # pp . We denote such an observation matrix by X = i,j : 1 p, 1 ..."
Abstract

Cited by 89 (0 self)
 Add to MetaCart
this paper, we assume that we have n observations, each being a realization of the p dimensional random variable x = (x 1 , . . . , x p ) with mean E(x) = = ( 1 , . . . , p ) and covariance matrix E{(x )(x = # pp . We denote such an observation matrix by X = i,j : 1 p, 1 n}. If i and # i = # (i,i) denote the mean and the standard deviation of the ith random variable, respectively, then we will often standardize the observations x i,j by (x i,j i )/ # i , where i = x i = 1/n j=1 x i,j , and # i = 1/n j=1 (x i,j x i )
Segmenting Motion Capture Data into Distinct Behaviors
 In Graphics Interface
, 2004
"... Much of the motion capture data used in animations, commercials, and video games is carefully segmented into distinct motions either at the time of capture or by hand after the capture session. As we move toward collecting more and longer motion sequences, however, automatic segmentation techniques ..."
Abstract

Cited by 89 (5 self)
 Add to MetaCart
Much of the motion capture data used in animations, commercials, and video games is carefully segmented into distinct motions either at the time of capture or by hand after the capture session. As we move toward collecting more and longer motion sequences, however, automatic segmentation techniques will become important for processing the results in a reasonable time frame.
EigenSkin: Real Time Large Deformation Character Skinning in Hardware
 In ACM SIGGRAPH Symposium on Computer Animation
, 2002
"... We present a technique which allows subtle nonlinear quasistatic deformations of articulated characters to be compactly approximated by datadependent eigenbases which are optimized for real time rendering on commodity graphics hardware. The method extends the common SkeletalSubspace Deformation ( ..."
Abstract

Cited by 89 (4 self)
 Add to MetaCart
We present a technique which allows subtle nonlinear quasistatic deformations of articulated characters to be compactly approximated by datadependent eigenbases which are optimized for real time rendering on commodity graphics hardware. The method extends the common SkeletalSubspace Deformation (SSD) technique to provide efficient approximations of the complex deformation behaviours exhibited in simulated, measured, and artistdrawn characters. Instead of storing displacements for key poses (which may be numerous), we precompute principal components of the deformation influences for individual kinematic joints, and so construct erroroptimal eigenbases describing each joint's deformation subspace. Posedependent deformations are then expressed in terms of these reduced eigenbases, allowing precomputed coefficients of the eigenbasis to be interpolated at run time. Vertex program hardware can then efficiently render nonlinear skin deformations using a small number of eigendisplacements stored in graphics hardware. We refer to the final resulting character skinning construct as the model's EigenSkin. Animation results are presented for a very large nonlinear finite element model of a human hand rendered in real time at minimal cost to the main CPU.
An Unsupervised Ensemble Learning Method for Nonlinear Dynamic StateSpace Models
 Neural Computation
, 2001
"... A Bayesian ensemble learning method is introduced for unsupervised extraction of dynamic processes from noisy data. The data are assumed to be generated by an unknown nonlinear mapping from unknown factors. The dynamics of the factors are modeled using a nonlinear statespace model. The nonlinear map ..."
Abstract

Cited by 89 (32 self)
 Add to MetaCart
A Bayesian ensemble learning method is introduced for unsupervised extraction of dynamic processes from noisy data. The data are assumed to be generated by an unknown nonlinear mapping from unknown factors. The dynamics of the factors are modeled using a nonlinear statespace model. The nonlinear mappings in the model are represented using multilayer perceptron networks. The proposed method is computationally demanding, but it allows the use of higher dimensional nonlinear latent variable models than other existing approaches. Experiments with chaotic data show that the new method is able to blindly estimate the factors and the dynamic process which have generated the data. It clearly outperforms currently available nonlinear prediction techniques in this very di#cult test problem.
A Fast FixedPoint Algorithm for Independent Component Analysis of Complex Valued Signals
, 2000
"... Separation of complex valued signals is a frequently arising problem in signal processing. For example, separation of convolutively mixed source signals involves computations on complex valued signals. In this article it is assumed that the original, complex valued source signals are mutually statis ..."
Abstract

Cited by 87 (1 self)
 Add to MetaCart
Separation of complex valued signals is a frequently arising problem in signal processing. For example, separation of convolutively mixed source signals involves computations on complex valued signals. In this article it is assumed that the original, complex valued source signals are mutually statistically independent, and the problem is solved by the independent component analysis (ICA) model. ICA is a statistical method for transforming an observed multidimensional random vector into components that are mutually as independent as possible. In this article, a fast xedpoint type algorithm that is capable of separating complex valued, linearly mixed source signals is presented and its computational efficiency is shown by simulations. Also, the local consistency of the estimator given by the algorithm is proved.
Sparse deep belief net model for visual area V2
 Advances in Neural Information Processing Systems 20
, 2008
"... Abstract 1 Motivated in part by the hierarchical organization of the neocortex, a number of recently proposed algorithms have tried to learn hierarchical, or “deep, ” structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed ..."
Abstract

Cited by 81 (15 self)
 Add to MetaCart
Abstract 1 Motivated in part by the hierarchical organization of the neocortex, a number of recently proposed algorithms have tried to learn hierarchical, or “deep, ” structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of their fidelity for mimicking computations at deeper levels in the cortical hierarchy. This thesis describes an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks described by Hinton et al. (2006). We learn two layers of representation in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented, edge filters, similar to the gabor functions known to model simple cell receptive fields in area V1. Further, the second layer in our model encodes various combinations of the first layer responses in the data. Specifically, it picks up both collinear (“contour”) features as well as corners and junctions. More interestingly, in a quantitative comparison, the encoding of these more complex “corner ” features matches well with the results from Ito & Komatsu’s study of neural responses to angular stimuli in area V2 of the macaque. This suggests that our sparse variant of deep belief networks holds promise for modeling more higherorder features that are encoded in visual cortex. Conversely, one may also interpret the results reported here as suggestive that visual area V2 is performing computations on its input similar to those performed in (sparse) deep belief networks. This plausible relationship generates some intriguing hypotheses about V2 computations. 1 This thesis is an extended version of an earlier paper by Honglak Lee, Chaitanya Ekanadham, and Andrew Ng titled “Sparse deep belief net model for visual area V2.” 1
A robust and precise method for solving the permutation problem of frequencydomain blind source separation
 IEEE Trans. on Speech and Audio Processing 12
, 2004
"... This paper presents a robust and precise method for solving the permutation problem of frequencydomain blind source separation. It is based on two previous approaches: the direction of arrival estimation and the interfrequency correlation. We discuss the advantages and disadvantages of the two app ..."
Abstract

Cited by 81 (21 self)
 Add to MetaCart
This paper presents a robust and precise method for solving the permutation problem of frequencydomain blind source separation. It is based on two previous approaches: the direction of arrival estimation and the interfrequency correlation. We discuss the advantages and disadvantages of the two approaches, and integrate them to exploit their respective advantages. We also present a closed form formula to estimate the directions of source signals from a separating matrix obtained by ICA. Experimental results show that our method solved permutation problems almost perfectly for a situation that two sources were mixed in a room whose reverberation time was 300 ms. 1.
Probabilistic Independent Component Analysis
, 2003
"... Independent Component Analysis is becoming a popular exploratory method for analysing complex data such as that from FMRI experiments. The application of such 'modelfree' methods, however, has been somewhat restricted both by the view that results can be uninterpretable and by the lack of ..."
Abstract

Cited by 79 (12 self)
 Add to MetaCart
Independent Component Analysis is becoming a popular exploratory method for analysing complex data such as that from FMRI experiments. The application of such 'modelfree' methods, however, has been somewhat restricted both by the view that results can be uninterpretable and by the lack of ability to quantify statistical significance. We present an integrated approach to Probabilistic ICA for FMRI data that allows for nonsquare mixing in the presence of Gaussian noise. We employ an objective estimation of the amount of Gaussian noise through Bayesian analysis of the true dimensionality of the data, i.e. the number of activation and nonGaussian noise sources. Reduction of the data to this 'true' subspace before the ICA decomposition automatically results in an estimate of the noise, leading to the ability to assign significance to voxels in ICA spatial maps. Estimation of the number of intrinsic sources not only enables us to carry out probabilistic modelling, but also achieves an asymptotically unique decomposition of the data. This reduces problems of interpretation, as each final independent component is now much more likely to be due to only one physical or physiological process. We also describe other improvements to standard ICA, such as temporal prewhitening and variance normafisation of timeseries, the latter being particularly useful in the context of dimensionality reduction when weak activation is present. We discuss the use of prior information about the spatiotemporal nature of the source processes, and an alternativehypothesis testing approach for inference, using Gaussian mixture models. The performance of our approach is illustrated and evaluated on real and complex artificial FMRI data, and compared to the spatiotemporal accuracy of restfits obtaine...
A TwoLayer Sparse Coding Model Learns Simple and Complex Cell Receptive Fields and Topography From Natural Images
 VISION RESEARCH
, 2001
"... The classical receptive fields of simple cells in the visual cortex have been shown to emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse, i.e. significantly activated only rarely. Here, we show that this single principle of sparseness can ..."
Abstract

Cited by 69 (17 self)
 Add to MetaCart
The classical receptive fields of simple cells in the visual cortex have been shown to emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse, i.e. significantly activated only rarely. Here, we show that this single principle of sparseness can also lead to emergence of topography (columnar organization) and complex cell properties as well. These are obtained by maximizing the sparsenesses of locally pooled energies, which correspond to complex cell outputs. Thus we obtain a highly parsimonious model of how these properties of the visual cortex are adapted to the characteristics of the natural input.