Toward Content-Based Audio Indexing and Retrieval and a New Speaker Discrimination Technique (1995) [12 citations — 0 self]

by Lonce Wyse, Stephen W. Smoliar
In Proc. IJCAI '95

Abstract:

Several techniques for identifying segment transitions in an audio stream are discussed. Gross features are identified first and used to control more detailed, computationally expensive analysis downstream. Pitch is tracked using some basic streaming principles and then used as one cue to speaker transitions. A novel speaker discrimination technique is described that makes segmentation decisions when a continuously updated model of the current speaker suddenly ceases to account sufficiently for the input data.
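
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of model-failure segmentation, not the authors' method. It assumes each audio frame has already been reduced to a short feature vector, models the current speaker with a diagonal Gaussian updated by exponential forgetting, and declares a boundary when several consecutive frames lie far from that model. The class `RunningSpeakerModel`, the function `segment`, and the parameters `alpha`, `threshold`, and `patience` are all hypothetical names chosen for illustration.

```python
class RunningSpeakerModel:
    """Hypothetical diagonal-Gaussian speaker model with exponential forgetting."""

    def __init__(self, dim, alpha=0.05, init_var=1.0):
        self.mean = [0.0] * dim
        self.var = [init_var] * dim
        self.alpha = alpha      # forgetting rate (assumed value)
        self.count = 0          # number of frames absorbed so far

    def distance(self, x):
        # Mean squared deviation from the model, normalized per dimension.
        total = sum((xi - m) ** 2 / v
                    for xi, m, v in zip(x, self.mean, self.var))
        return total / len(x)

    def update(self, x):
        if self.count == 0:
            self.mean = list(x)  # first frame initializes the mean
        else:
            a = self.alpha
            for i, xi in enumerate(x):
                d = xi - self.mean[i]
                self.mean[i] += a * d
                # Exponentially weighted variance, floored to stay positive.
                self.var[i] = max((1 - a) * (self.var[i] + a * d * d), 1e-6)
        self.count += 1


def segment(frames, threshold=9.0, patience=3, alpha=0.05):
    """Return frame indices where the running model stops fitting the input."""
    boundaries = []
    model = RunningSpeakerModel(len(frames[0]), alpha)
    misses = 0
    for t, x in enumerate(frames):
        if model.count > 0 and model.distance(x) > threshold:
            misses += 1
            if misses >= patience:
                # The model has failed for `patience` frames in a row:
                # mark the first failing frame and restart the model there.
                start = t - patience + 1
                boundaries.append(start)
                model = RunningSpeakerModel(len(x), alpha)
                for y in frames[start:t + 1]:
                    model.update(y)
                misses = 0
            continue  # poorly fitting frames are not absorbed into the model
        misses = 0
        model.update(x)
    return boundaries
```

Requiring several consecutive misses before declaring a boundary keeps a single outlier frame from splitting a segment; the threshold and patience values here are placeholders that would need tuning on real features.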

Citations

346 Perceptual linear predictive (PLP) analysis of speech – Hermansky - 1990
148 Content-Based Video Indexing and Retrieval – Smoliar, Zhang
90 Control methods used in a study of the vowels – Peterson, Barney - 1952
32 Structure out of Sound – Hawley - 1993
18 Research on individuality features in speech waves and automatic speaker recognition techniques – Furui - 1986
16 A spectral network model of pitch perception – Grossberg - 1995
2 Spectral analysis of sung vowels. III. Characteristics of singers and modes of singing – Bloothooft, Plomp - 1986
2 Auditory Scene Analysis (M.I.T – Bregman - 1990
2 Suggested formulae for calculating auditory filter bandwidths and excitation patterns – Moore, Glasberg - 1983