Results 1 - 10 of 357
Crowd++: Unsupervised speaker count with smartphones
- in ACM UbiComp, 2013
"... Smartphones are excellent mobile sensing platforms, with the microphone in particular being exercised in several audio inference applications. We take smartphone audio inference a step further and demonstrate for the first time that it's possible to accurately estimate the number of p ..."
Cited by 8 (3 self)
of people talking in a certain place - with an average error distance of 1.5 speakers - through unsupervised machine learning analysis on audio segments captured by the smartphones. Inference occurs transparently to the user and no human intervention is needed to derive the classification model. Our results
Differentiable pooling for unsupervised speaker adaptation
- in Proc. ICASSP, 2015
"... This paper proposes a differentiable pooling mechanism to perform model-based neural network speaker adaptation. The proposed technique learns a speaker-dependent combination of activations within pools of hidden units, was shown to work well unsupervised, and does not require speaker-adaptive trai ..."
Cited by 1 (1 self)
This paper proposes a differentiable pooling mechanism to perform model-based neural network speaker adaptation. The proposed technique learns a speaker-dependent combination of activations within pools of hidden units, was shown to work well unsupervised, and does not require speaker
Crowd++: Unsupervised Speaker Count with Smartphones
"... Smartphones are excellent mobile sensing platforms, with the microphone in particular being exercised in several audio inference applications. We take smartphone audio inference a step further and demonstrate for the first time that it’s possible to accurately estimate the number of people talking ..."
talking in a certain place – with an average error distance of 1.5 speakers – through unsupervised machine learning analysis on audio segments captured by the smartphones. Inference occurs transparently to the user and no human intervention is needed to derive the classification model. Our results
Speaker Model Quantization for Unsupervised Speaker Indexing
"... Speaker indexing sequentially detects points where speaker identity changes in a multi-speaker audio stream, and classifies each detected segment according to the speaker’s identity. In unsupervised speaker indexing scenarios, there is no prior information/data about the speakers in the target data. ..."
Speaker indexing sequentially detects points where speaker identity changes in a multi-speaker audio stream, and classifies each detected segment according to the speaker’s identity. In unsupervised speaker indexing scenarios, there is no prior information/data about the speakers in the target data
Unsupervised Speaker Adaptation using Reference Speaker Weighting
"... Recently, we revisited the fast adaptation method called reference speaker weighting (RSW), and suggested a few modifications. We then showed that the algorithmically simplest technique actually outperformed conventional adaptation techniques like MAP and MLLR for 5- or 10-second supervised ..."
Cited by 1 (0 self)
adaptation on the Wall Street Journal 5K task. In this paper, we would like to further investigate the performance of RSW in unsupervised adaptation mode, which is the more natural way of doing adaptation in practice. Moreover, various analyses were carried out on the reference speakers computed
Unsupervised speaker adaptation for telephone call transcription
- in IEEE International Conference on Acoustics, Speech, and Signal Processing, 2008
"... The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiment ..."
Cited by 2 (1 self)
The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised
Unsupervised speaker change detection for broadcast news segmentation
- in Proc. EUSIPCO, 2006
"... This paper presents a speaker change detection system for news broadcast segmentation based on a vector quantization (VQ) approach. The system does not make any assumption about the number of speakers or speaker identity. The system uses mel frequency cepstral coefficients and change detection is do ..."
Cited by 2 (1 self)
This paper presents a speaker change detection system for news broadcast segmentation based on a vector quantization (VQ) approach. The system does not make any assumption about the number of speakers or speaker identity. The system uses mel frequency cepstral coefficients and change detection
Real-Time Unsupervised Speaker Change Detection
"... The information of speaker change point is very useful for speaker tracking and other applications. In this paper, we presented an effective algorithm for automatic speaker change detection based on LSP correlation analysis. Moreover, a general case is considered, in which the speaker and speaker nu ..."
The information of speaker change point is very useful for speaker tracking and other applications. In this paper, we presented an effective algorithm for automatic speaker change detection based on LSP correlation analysis. Moreover, a general case is considered, in which the speaker and speaker
Unsupervised Speaker Clustering in a Linear Discriminant Subspace
"... We present an approach for grouping single-speaker speech segments into speaker-specific clusters. Our approach is based on applying the K-means clustering algorithm to a suitable discriminant subspace, where the Euclidean distance reflects speaker differences. A core feature of our approach ..."
We present an approach for grouping single-speaker speech segments into speaker-specific clusters. Our approach is based on applying the K-means clustering algorithm to a suitable discriminant subspace, where the Euclidean distance reflects speaker differences. A core feature of our
Iterative Unsupervised Speaker Adaptation for Batch Dictation
"... This paper describes an automatic batch-style dictation paradigm in which the entire dictated speech is fully utilized for speaker adaptation and is recognized using the speaker adaptation results. The key point is that the same speech data is used both for recognition as the target and for speaker ..."
This paper describes an automatic batch-style dictation paradigm in which the entire dictated speech is fully utilized for speaker adaptation and is recognized using the speaker adaptation results. The key point is that the same speech data is used both for recognition as the target and for speaker