## Automatic Speaker Clustering (1997)

Venue: DARPA Speech Recognition Workshop

Citations: 42 (6 self-citations)

### BibTeX

```bibtex
@INPROCEEDINGS{Jin97automaticspeaker,
  author    = {Hubert Jin and Francis Kubala and Rich Schwartz},
  title     = {Automatic Speaker Clustering},
  booktitle = {DARPA Speech Recognition Workshop},
  year      = {1997},
  pages     = {108--111}
}
```

### Abstract

This paper presents a fully automatic speaker clustering algorithm, which consists of three components: building a distance matrix based on Gaussian models of the acoustic segments; performing hierarchical clustering on the distance matrix with the prior assumption that consecutive segments are more likely to come from the same speaker; and selecting the best clustering solution automatically by minimizing the within-cluster dispersion with a penalty against too many clusters. We applied this automatic speaker clustering technique in the 1996 Hub4 evaluation, and the results show that it contributed significantly to the word error rate (WER) reduction in unsupervised adaptation. In our experiments, the algorithm seldom misclassifies segments from the same speaker into different clusters. We used the same clustering procedure for both partitioned evaluation (PE) and unpartitioned evaluation (UE) tests [1]. Experiments also show that this automatic speaker clustering algorithm improves unsupervised adaptation as much as the hand-labeled ideal case, where the clusters are generated based on the true speaker identities.
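The three components described in the abstract can be sketched end to end. The following is a minimal illustration, not the authors' implementation: the average-linkage merging loop, the `time_bias` factor standing in for the prior on consecutive segments, and all constants are assumptions.

```python
import numpy as np

def cluster_segments(D, time_bias=0.9):
    """Bottom-up (agglomerative) clustering on a precomputed segment
    distance matrix D. The prior that consecutive segments tend to come
    from the same speaker is approximated by shrinking distances between
    time-adjacent segments; `time_bias` is an illustrative constant, not
    the paper's actual formulation. Returns one candidate clustering per
    cluster count k, for a later model-selection step to choose among."""
    n = len(D)
    D = D.astype(float).copy()
    for i in range(n - 1):              # favor merging neighbors in time
        D[i, i + 1] *= time_bias
        D[i + 1, i] *= time_bias
    clusters = [[i] for i in range(n)]
    solutions = {n: [list(c) for c in clusters]}
    while len(clusters) > 1:
        # find the closest pair of clusters (average linkage)
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = np.mean([D[i, j] for i in clusters[a] for j in clusters[b]])
                if d < best:
                    best, pair = d, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)          # merge b into a
        solutions[len(clusters)] = [list(c) for c in clusters]
    return solutions

# Toy distance matrix: segments 0 and 1 are close, segment 2 is far away.
D = np.array([[0, 1, 9], [1, 0, 9], [9, 9, 0]], float)
print(sorted(map(sorted, cluster_segments(D)[2])))  # → [[0, 1], [2]]
```

In the full algorithm, the distances would come from the Gaussian segment models of section 2.1, and the candidate solutions would be scored by the penalized dispersion criterion.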

### Citations

548 | *Cluster Analysis*, Everitt, Landau, et al., 2001

Citation context: "…dispersion [7] is defined as $W = \sum_{j=1}^{k} N_j \Sigma_j$, where $\Sigma_j$ is the covariance matrix and $N_j$ is the total number of feature vectors in cluster $p_j$. There are several good clustering criteria [6]. We prefer to use the determinant of $W$ to measure the goodness of speaker clustering. That is, the best clustering solution can be obtained by minimizing the measure over the parameter space. However…"
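The within-cluster dispersion $W = \sum_{j=1}^{k} N_j \Sigma_j$ and its determinant criterion can be computed directly. A small NumPy sketch follows; the synthetic data and label layouts are illustrative assumptions, not the paper's features:

```python
import numpy as np

def within_cluster_dispersion(segments, labels, k):
    """W = sum over the k clusters of N_j * Sigma_j, where Sigma_j is the
    ML (biased) covariance of cluster j's feature vectors and N_j its size."""
    dim = segments.shape[1]
    W = np.zeros((dim, dim))
    for j in range(k):
        members = segments[labels == j]
        W += len(members) * np.cov(members, rowvar=False, bias=True)
    return W

# Two well-separated Gaussian "speakers" in 3-D feature space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 3)), rng.normal(5, 1, (100, 3))])
good = np.repeat([0, 1], 100)   # labels matching the true grouping
bad = np.tile([0, 1], 100)      # labels mixing the two groups
# det(W) is smaller for the correct grouping, so minimizing it prefers it.
print(np.linalg.det(within_cluster_dispersion(X, good, 2))
      < np.linalg.det(within_cluster_dispersion(X, bad, 2)))  # → True
```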

163 | *Mathematical Statistics*, Wilks, 1962

Citation context: "…multivariate Gaussian distribution and that the vectors are statistically independent. A good clustering solution should have relatively small dispersion within clusters. The within-cluster dispersion [7] is defined as $W = \sum_{j=1}^{k} N_j \Sigma_j$, where $\Sigma_j$ is the covariance matrix and $N_j$ is the total number of feature vectors in cluster $p_j$. There are several good clustering criteria [6]. We prefer…"

160 | *A Compact Model for Speaker-Adaptive Training*, Anastasakos, McDonough, et al., 1996

Citation context: "…show the effectiveness of this algorithm. Finally, in section 4, we will discuss other alternative model selection criteria and potential applications of speaker clustering in speaker adapted training (SAT) [2][3]. 2. DESCRIPTION OF ALGORITHM. Consider that we have a collection of segments $S = \{s_1, s_2, \ldots, s_n\}$, where each $s_i$ represents a sequence of spectral feature vectors, i.e. the cepstral vectors in our…"

38 | *Speaker Clustering and Transformation for Speaker Adaptation*, Padmanabhan, Bahl, et al., 1998

Citation context: "…Several alternative criteria could be $|W_{k,\alpha}| + C\sqrt{k}$ or $|W_{k,\alpha}| + C\log k$ for some constant $C$. Speaker clustering could also be used in speaker adapted training (SAT), similar to Padmanabhan's approach [8]. The training data of our Hub4 models includes speech from almost 2400 speakers, and most of them have less than 20 seconds of total speech. Too little speech per speaker could cause unrobustness of the…"
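The penalized criteria quoted in this context reduce to a one-line model-selection rule: more clusters shrink $|W_k|$, and the penalty stops over-splitting. The constant `C`, the penalty names, and the toy $|W_k|$ values below are assumptions for illustration:

```python
import numpy as np

def select_k(det_w, C=1.0, penalty="sqrt"):
    """Pick the number of clusters k minimizing |W_k| plus a penalty
    that grows with k. det_w[i] is the determinant of the within-cluster
    dispersion for the (i+1)-cluster solution."""
    ks = np.arange(1, len(det_w) + 1)
    pen = np.sqrt(ks) if penalty == "sqrt" else np.log(ks)
    score = np.asarray(det_w, dtype=float) + C * pen
    return int(ks[np.argmin(score)])

# |W_k| typically drops quickly, then levels off; the penalty picks the knee.
dets = [10.0, 4.0, 1.5, 1.2, 1.1, 1.05]
print(select_k(dets, C=2.0))                    # → 3
print(select_k(dets, C=2.0, penalty="log"))     # → 3
```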

13 | *Practical Implementations of Speaker Adaptive Training*, Schwartz, 1997

Citation context: "…the effectiveness of this algorithm. Finally, in section 4, we will discuss other alternative model selection criteria and potential applications of speaker clustering in speaker adapted training (SAT) [2][3]. 2. DESCRIPTION OF ALGORITHM. Consider that we have a collection of segments $S = \{s_1, s_2, \ldots, s_n\}$, where each $s_i$ represents a sequence of spectral feature vectors, i.e. the cepstral vectors in our…"

10 | *Efficient 2-Pass N-Best Decoder*, Nguyen, Schwartz, 1997

Citation context: "…The same speaker clustering procedure was used in both the PE and UE Hub4 evaluations. In our Hub4 PE system, there is a procedure that chopped the original segments into shorter ones so that the BYBLOS decoder [4] could handle them more efficiently. From experiments on the development data, our speaker clustering algorithm seldom misclassifies chopped segments into different clusters due to ignoring the segmen…"


2 | elsewhere in this volume, Kubala, Jin, et al., 1997

Citation context: "…algorithm seldom misclassifies segments from the same speaker into different clusters. We used the same clustering procedure for both partitioned evaluation (PE) and unpartitioned evaluation (UE) tests [1]. Experiments also show that this automatic speaker clustering algorithm improves unsupervised adaptation as much as the hand-labeled ideal case where the clusters are generated based on true speaker…"

1 | *Segregation of Speakers for Speech Recognition and Speaker Identification*, Gish, et al., 1991

Citation context: "…hierarchical clustering to generate a list of clustering solutions; conducting model selection by the clustering criterion with a penalty against too many clusters. 2.1. Distance Matrix. Gish et al. [5] introduced a distance measure between any two speech segments to reflect whether the two segments are from the same speaker. We use the same distance measure as the basis for the speaker clustering a…"
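A Gish-style segment distance can be sketched as a generalized likelihood ratio between "one Gaussian explains both segments" and "one Gaussian per segment". The pooled-versus-per-segment log-determinant form below is a common formulation and an assumption, not necessarily the paper's exact variant:

```python
import numpy as np

def gish_distance(x, y):
    """GLR-style distance between two segments of feature vectors (rows),
    each modeled by a single full-covariance Gaussian. Larger values
    suggest the segments come from different speakers."""
    n, m = len(x), len(y)
    pooled = np.vstack([x, y])
    # log-determinants of the ML covariance estimates (slogdet for stability)
    ld_x = np.linalg.slogdet(np.cov(x, rowvar=False, bias=True))[1]
    ld_y = np.linalg.slogdet(np.cov(y, rowvar=False, bias=True))[1]
    ld_p = np.linalg.slogdet(np.cov(pooled, rowvar=False, bias=True))[1]
    return 0.5 * ((n + m) * ld_p - n * ld_x - m * ld_y)

# Segments drawn from the same distribution vs. a mean-shifted one.
rng = np.random.default_rng(0)
same = gish_distance(rng.normal(0, 1, (200, 4)), rng.normal(0, 1, (200, 4)))
diff = gish_distance(rng.normal(0, 1, (200, 4)), rng.normal(3, 1, (200, 4)))
print(same < diff)  # → True: mismatched segments look farther apart
```

These pairwise distances fill the matrix that the hierarchical clustering step operates on.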
