## Spectral and cepstral projection bases constructed by independent component analysis (2000)

Venue: Proceedings of the ICSLP

Citations: 2 (0 self)

### BibTeX

@INPROCEEDINGS{Potamitis00spectraland,
  author    = {I. Potamitis and N. Fakotakis and G. Kokkinakis},
  title     = {Spectral and cepstral projection bases constructed by independent component analysis},
  booktitle = {Proceedings of the ICSLP},
  year      = {2000},
  pages     = {63--66}
}

### Abstract

The present paper addresses the question of the efficiency of Independent Component Analysis (ICA) as a statistical process for deriving optimal representational bases for the projection of spectrum and cepstrum in the context of Automatic Speech Recognition (ASR). Several decorrelation strategies have been applied to the log-spectrum and cepstrum to fulfill the practical need of a diagonal-covariance HMM for uncorrelated features. In our work we question the optimality of a fixed decorrelation strategy such as the DCT and follow an emerging trend in ASR that designs projection bases based on the statistics of speech. We differentiate our approach from the second order statistics of …

### Citations

757 | Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Davis, Mermelstein
- 1980

Citation Context: "...egies have been proposed, applied to log-spectrum and/or cepstrum. To name the most popular ones: DCT, which projects on the direction of global variance and achieves partial decorrelation of features [1]; LDA, which applies a linear transformation that minimizes a class separation cost function, reducing the feature space and achieving better class discrimination [2]; Principal Component Analysis, whic..."
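The partial decorrelation that the DCT achieves on log-spectral features, as described in the excerpt above, can be sketched with a toy example. The correlated "log-spectrum" frames below are synthetic (a random walk across bins), not real speech; only the orthonormal DCT-II basis construction itself is standard:

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis: rows are the fixed projection axes."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    B = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    B[0] /= np.sqrt(n)               # DC row scaled by 1/sqrt(n)
    B[1:] *= np.sqrt(2.0 / n)        # remaining rows scaled by sqrt(2/n)
    return B

# Toy "log-spectrum" frames with strongly correlated neighbouring bins.
rng = np.random.default_rng(0)
frames = np.cumsum(rng.normal(size=(500, 8)), axis=1)

B = dct_basis(8)
cep = frames @ B.T                   # project frames onto the DCT axes

# Off-diagonal covariance mass shrinks after the projection.
off = lambda C: np.abs(C - np.diag(np.diag(C))).sum()
assert off(np.cov(cep.T)) < off(np.cov(frames.T))
```

The point of the excerpt is precisely that this basis is fixed regardless of the data, whereas an ICA-derived basis is fitted to the statistics of the speech features.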

429 | A fast fixed-point algorithm for independent component analysis. Neural Computation
- Hyvärinen, Oja
- 1997

Citation Context: "...nformation transfer of spectral and cepstral coefficients through several nonlinear transfer functions. We used an efficient implementation with a modified linear approximation of the maximum entropy [13]." [Figure residue omitted: Fig. 4: ICA projected spectrum covariance; Fig. 5: First 6 ICA ba...]

73 | Linear discriminant analysis for improved large vocabulary continuous speech recognition
- Haeb-Umbach, Ney
- 1992

Citation Context: "...s partial decorrelation of features [1]; LDA, which applies a linear transformation that minimizes a class separation cost function, reducing the feature space and achieving better class discrimination [2]; Principal Component Analysis, which constructs an orthonormal set of axes pointing in the directions of maximum variance, thus forming a representational basis that projects on the direction of maxim..."
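The PCA construction mentioned in the excerpt (orthonormal axes along the directions of maximum variance) reduces to an eigendecomposition of the feature covariance. A minimal sketch on synthetic 2-D data (the mixing matrix is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
# Correlated 2-D data: most variance concentrated along one direction.
X = rng.normal(size=(1000, 2)) @ np.array([[3.0, 0.0], [1.5, 0.5]])
X -= X.mean(axis=0)

# Orthonormal axes pointing in the directions of maximum variance.
evals, evecs = np.linalg.eigh(np.cov(X.T))
order = np.argsort(evals)[::-1]
axes = evecs[:, order]                  # columns: principal axes, variance-sorted
Y = X @ axes                            # projected, decorrelated features

C = np.cov(Y.T)
assert abs(C[0, 1]) < 1e-8              # exactly decorrelated
assert C[0, 0] >= C[1, 1]               # variance sorted descending
```

Unlike the fixed DCT basis, these axes are data-dependent, but they still use only second-order statistics, which is the limitation the paper's ICA approach targets.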

50 | A comparison of several acoustic representations for speech recognition with degraded and undegraded speech
- Hunt, Lefèbvre
- 1989

Citation Context: "...amely at the feature extraction stage and at the state level. Reported results on the efficiency of the method as applied to the front-end and to the HMM framework seem controversial. In [2], [6] and [14] consistent gains have been reported. On the other hand, [7] reported that LDA- and PCA-derived spectral bases did not offer the expected advantage. In [8] it is found that LDA as a preprocessing step..."

36 | Maximum likelihood and minimum classification error factor analysis for automatic speech recognition
- Saul, Rahim
- 2000

Citation Context: "...ojects on the direction of maximum variability [3], and Factor Analysis at the state level, which focuses on mapping the systematic variations of the covariance matrix into a lower-dimensional subspace [4]. Comparative studies on the problem of decorrelation can be found in [5] and [6]. Our approach follows an emerging trend in ASR that is closely linked to the statistics of speech signals by analyzing..."

10 | On the robustness of linear discriminant analysis as a preprocessing step for noisy speech recognition
- Siohan
- 1995

Citation Context: "...amework seem controversial. In [2], [6] and [14] consistent gains have been reported. On the other hand, [7] reported that LDA- and PCA-derived spectral bases did not offer the expected advantage. In [8] it is found that LDA as a preprocessing step is very sensitive to SNR mismatches between training and test data. Last but not least, in [9] it is shown that several stacked frames transformed with LDA into on..."

8 | Experiments with linear feature extraction in speech recognition
- Beulen, Welling, et al.
- 1995

Citation Context: "...derived spectral bases did not offer the expected advantage. In [8] it is found that LDA as a preprocessing step is very sensitive to SNR mismatches between training and test data. Last but not least, in [9] it is shown that several stacked frames transformed with LDA into one feature vector, compared with time derivatives, gave only marginal improvement. LDA postulates such hypotheses as (a) linear separati..."

4 | Feature vector transformation using ICA and its application to speaker verification
- Jang, Yun, et al.
- 1999

Citation Context: "...n since cepstral vectors are not normally distributed). The assumptions of ICA conform nicely with the framework of log-spectral analysis due to the property of homomorphic processing, as mentioned in [10]. Viewing the speech spectrum |S(ω)| as consisting of a quickly varying part |E(ω)| and a slowly varying part |Θ(ω)|, we can form the following equation: log|S(ω)| = log|E(ω)| + log|Θ(ω)| (6). We assume tha..."
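Equation (6) above is just the homomorphic property: multiplicative spectral components become additive in the log domain, and the slowly varying term can then be isolated by low-quefrency liftering. A toy sketch, where the envelope and excitation shapes and the quefrency cut-off of 5 are illustrative choices, not values from the paper:

```python
import numpy as np

n = 256
w = np.arange(n)
# Slowly varying part |Θ(ω)| and quickly varying part |E(ω)|.
theta = 1.0 + 0.5 * np.cos(2 * np.pi * w / n)        # smooth envelope
exc = 1.0 + 0.3 * np.cos(2 * np.pi * 40 * w / n)     # rapid ripple
S = exc * theta                                      # |S(ω)| = |E(ω)| |Θ(ω)|

# Homomorphic property, Eq. (6): the product becomes a sum.
log_S = np.log(S)
assert np.allclose(log_S, np.log(exc) + np.log(theta))

# Liftering: keep only low quefrencies to recover the smooth part.
c = np.fft.rfft(log_S)
c[5:] = 0                                            # hypothetical cut-off
envelope = np.fft.irfft(c, n)
assert np.abs(envelope - np.log(theta)).max() < 0.05
```

This additivity is what makes the linear-mixture assumption of ICA plausible for log-spectral features.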

2 | Spectral basis functions from discriminant analysis, ICSLP
- Hermansky, Malayath
- 1998

Citation Context: "...Reported results on the efficiency of the method as applied to the front-end and to the HMM framework seem controversial. In [2], [6] and [14] consistent gains have been reported. On the other hand, [7] reported that LDA- and PCA-derived spectral bases did not offer the expected advantage. In [8] it is found that LDA as a preprocessing step is very sensitive to SNR mismatches between training and test da..."

1 | Effectiveness of KL-transformation in spectral delta expansion, Eurospeech, Vol. 1
- Tokuhira, Ariki
- 1999

Citation Context: "...ponent Analysis, which constructs an orthonormal set of axes pointing in the directions of maximum variance, thus forming a representational basis that projects on the direction of maximum variability [3], and Factor Analysis at the state level, which focuses on mapping the systematic variations of the covariance matrix into a lower-dimensional subspace [4]. Comparative studies on the problem of decorre..."

1 | A comparative study of linear feature transformation techniques for automatic speech recognition, ICSLP
- Thomas, et al.
- 1996

Citation Context: "...at the state level, which focuses on mapping the systematic variations of the covariance matrix into a lower-dimensional subspace [4]. Comparative studies on the problem of decorrelation can be found in [5] and [6]. Our approach follows an emerging trend in ASR that is closely linked to the statistics of speech signals by analyzing the moments of speech features before designing the projection axes. We ..."

1 | Feature decorrelation methods in speech recognition: a comparative study, ICSLP
- Batlle, Nadeu, Fonollosa
- 1998

Citation Context: "...te level, which focuses on mapping the systematic variations of the covariance matrix into a lower-dimensional subspace [4]. Comparative studies on the problem of decorrelation can be found in [5] and [6]. Our approach follows an emerging trend in ASR that is closely linked to the statistics of speech signals by analyzing the moments of speech features before designing the projection axes. We go furth..."

1 | Source separation using higher order moments, ICASSP
- Cardoso
- 1989

Citation Context: "...cross-cumulants; a technique that is computationally very costly and not optimal, since there is an infinite number of cross-cumulants and fourth-order independence is the most these methods can ensure [11]. A later approach that bypassed the problems of cross-cumulants used the concept of minimum mutual information as a cost function, a measure that is described by all the higher order cross statistics..."
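The cost argument in the excerpt can be made concrete. For zero-mean signals, a fourth-order cross-cumulant is cum(a,b,c,d) = E[abcd] − E[ab]E[cd] − E[ac]E[bd] − E[ad]E[bc], and for N features there are on the order of N⁴ of them. A small sketch (synthetic signals; the thresholds are illustrative):

```python
import numpy as np

def cross_cumulant4(a, b, c, d):
    """Fourth-order cross-cumulant of zero-mean signals:
    cum(a,b,c,d) = E[abcd] - E[ab]E[cd] - E[ac]E[bd] - E[ad]E[bc]."""
    m = lambda *xs: np.mean(np.prod(xs, axis=0))
    return m(a, b, c, d) - m(a, b) * m(c, d) - m(a, c) * m(b, d) - m(a, d) * m(b, c)

rng = np.random.default_rng(4)
g = rng.normal(size=100_000)          # Gaussian: 4th-order cumulants vanish
l = rng.laplace(size=100_000)         # Laplacian: kurtosis is large and positive
l -= l.mean()

assert abs(cross_cumulant4(g, g, g, g)) < 0.3
assert cross_cumulant4(l, l, l, l) > 5.0
```

Cumulant-based ICA must drive a whole tensor of such quantities toward zero, which is why the mutual-information formulation mentioned next was a practical advance.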

1 | An information-maximisation approach to blind separation and blind deconvolution
- Bell, Sejnowski
- 1995

Citation Context: "...mplete cross statistics optimization. This unsupervised algorithm forms a transformation matrix W that maps inputs to outputs and which is calculated by maximizing the joint entropy of the output. In [12] it is shown that

ΔW ∝ ∂I(y, x)/∂W = ∂H(y)/∂W (7)

and consequently it is proven that

ΔW ∝ (∂H(y)/∂W) WᵀW = [I − φ(u)uᵀ] W (8)

where φ(u) is the source probability density function, H(y) the joint e..."
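The learning rule in Eq. (8), ΔW ∝ [I − φ(u)uᵀ]W, can be sketched as a natural-gradient infomax iteration. The choices below, φ(u) = tanh(u) as the nonlinearity for super-Gaussian sources, the mixing matrix, and the prewhitening step, are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
# Two independent super-Gaussian (Laplacian) sources, linearly mixed.
S = rng.laplace(size=(2, 5000))
A = np.array([[1.0, 0.5], [0.7, 1.0]])   # hypothetical mixing matrix
X = A @ S
X -= X.mean(axis=1, keepdims=True)

# Prewhitening, assumed here for stability of the iteration.
d, E = np.linalg.eigh(np.cov(X))
Wh = (E / np.sqrt(d)) @ E.T
Z = Wh @ X

# Natural-gradient form of Eq. (8): dW = [I - phi(u) u^T] W, phi = tanh.
W = np.eye(2)
for _ in range(1000):
    u = W @ Z
    dW = (np.eye(2) - (np.tanh(u) @ u.T) / Z.shape[1]) @ W
    W += 0.1 * dW

# The combined transform should recover the sources up to scale/permutation.
P = np.abs(W @ Wh @ A)
assert set(P.argmax(axis=1)) == {0, 1}
```

Multiplying the entropy gradient by WᵀW is what turns the expensive matrix inversion of the original infomax rule into the simple update used in the loop above.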