Sound-Source Recognition: A Theory and Computational Model, 1999
Cited by 61 (0 self)
The ability of a normal human listener to recognize objects in the environment from only the sounds they produce is extraordinarily robust with regard to characteristics of the acoustic environment and of other competing sound sources. In contrast, computer systems designed to recognize sound sources function precariously, breaking down whenever the target sound is degraded by reverberation, noise, or competing sounds. Robust listening requires extensive contextual knowledge, but the potential contribution of sound-source recognition to the process of auditory scene analysis has largely been neglected by researchers building computational models of the scene analysis process. This thesis proposes a theory of sound-source recognition, casting recognition as a process of gathering information to enable the listener to make inferences about
Computer Identification of Musical Instruments Using Pattern Recognition With Cepstral Coefficients as Features, 1997
Cited by 47 (0 self)
Cepstral coefficients based on a constant Q transform have been calculated for 28 short (1-2 s) oboe sounds and 52 short saxophone sounds. These were used as features in a pattern analysis to determine for each of these sounds comprising the test set whether it belongs to the oboe or to the sax class. The training set consisted of longer sounds of 1 minute or more for each of the instruments. A k-means algorithm was used to calculate clusters for the training data, and Gaussian probability density functions were formed from the mean and variance of each of the clusters. Each member of the test set was then analyzed to determine the probability that it belonged to each of the two classes; and a Bayes decision rule was invoked to assign it to one of the classes. Results have been extremely good and are compared to a human perception experiment identifying a subset of these same sounds.
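The pipeline described in this abstract (k-means clusters on the training features, one Gaussian per cluster, then a Bayes decision) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the two-dimensional synthetic features stand in for the constant-Q cepstral coefficients, the Gaussians are diagonal, the cluster weights and equal class priors are assumptions, and the names `kmeans`, `fit_class`, `log_likelihood`, `classify`, and `cloud` are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the final grouping of the points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            groups[nearest].append(p)
        # Recompute each center; keep the old one if a cluster is empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return groups


def fit_class(points, k):
    """One (weight, mean, variance) diagonal Gaussian per non-empty cluster."""
    model = []
    for g in kmeans(points, k):
        if not g:
            continue
        mean = [sum(col) / len(g) for col in zip(*g)]
        var = [
            max(sum((x - m) ** 2 for x in col) / len(g), 1e-6)  # variance floor
            for col, m in zip(zip(*g), mean)
        ]
        model.append((len(g) / len(points), mean, var))  # weight is an assumption
    return model


def log_likelihood(frames, model):
    """Sum over frames of the log of the weighted Gaussian densities."""
    total = 0.0
    for x in frames:
        density = 0.0
        for w, mean, var in model:
            log_p = sum(
                -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
                for xi, m, v in zip(x, mean, var)
            )
            density += w * math.exp(log_p)
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total


def classify(frames, models):
    """Bayes decision with equal priors: the most likely class wins."""
    return max(models, key=lambda name: log_likelihood(frames, models[name]))


# Synthetic stand-in features (assumption): two well-separated 2-D clouds.
rng = random.Random(1)


def cloud(cx, cy, n):
    return [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)] for _ in range(n)]


models = {
    "oboe": fit_class(cloud(0.0, 0.0, 200), k=2),
    "sax": fit_class(cloud(5.0, 5.0, 200), k=2),
}
print(classify(cloud(5.0, 5.0, 10), models))
```

With equal priors the Bayes rule reduces to picking the class with the larger likelihood; unequal priors would add a log-prior term per class.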
Automatic Classification of Drum Sounds: A Comparison of Feature Selection and Classification Techniques
Proceedings of 2nd International Conference on Music and Artificial Intelligence, 2002
Cited by 36 (2 self)
We present a comparative evaluation of automatic classification of a sound database containing more than six hundred drum sounds (kick, snare, hihat, toms and cymbals). A preliminary set of fifty descriptors has been refined with the help of different techniques and some final reduced sets including around twenty features have been selected as the most relevant. We have then tested different classification techniques (instance-based, statistical-based, and tree-based) using ten-fold cross-validation. Three levels of taxonomic classification have been tested: membranes versus plates (super-category level); kick vs. snare vs. hihat vs. toms vs. cymbals (basic level); and some basic classes (kick and snare) plus some sub-classes, i.e. ride, crash, open hihat, closed hihat, high tom, medium tom, and low tom (sub-category level). Very high hit-rates have been achieved (99%, 97%, and 90% respectively) with several of the tested techniques.
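Ten-fold cross-validation, as used in this evaluation, can be illustrated with a minimal sketch. The abstract does not say how the folds were built or which instance-based classifier was used, so the shuffled unstratified folds, the 1-nearest-neighbour classifier, the toy two-class data, and the names `ten_fold_splits`, `nearest_neighbour`, and `cross_validated_hit_rate` are all assumptions made for illustration.

```python
import random


def ten_fold_splits(n_items, n_folds=10, seed=0):
    """Yield (train_indices, test_indices); each item is tested exactly once."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def nearest_neighbour(train_x, train_y, x):
    """Instance-based classifier: label of the closest training vector."""
    best = min(
        range(len(train_x)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(train_x[j], x)),
    )
    return train_y[best]


def cross_validated_hit_rate(xs, ys, n_folds=10):
    """Fraction of items classified correctly while held out of training."""
    hits = 0
    for train, test in ten_fold_splits(len(xs), n_folds):
        train_x = [xs[j] for j in train]
        train_y = [ys[j] for j in train]
        hits += sum(nearest_neighbour(train_x, train_y, xs[j]) == ys[j] for j in test)
    return hits / len(xs)


# Toy descriptors (assumption): two classes far apart along the first feature.
xs = [[0.01 * i, 0.0] for i in range(30)] + [[10.0 + 0.01 * i, 0.0] for i in range(30)]
ys = ["kick"] * 30 + ["snare"] * 30
print(cross_validated_hit_rate(xs, ys))
```

Because every sound is scored only by a model that never saw it during training, the resulting hit rate is a fairer estimate than accuracy measured on the training set itself.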
Instrument recognition in polyphonic music based on automatic taxonomies
IEEE Transactions on Speech and Audio Processing, 2006
Cited by 32 (3 self)
We propose a new approach to instrument recognition in the context of real music orchestrations ranging from solos to quartets. The strength of our approach is that it does not require prior musical source separation. Thanks to a hierarchical clustering algorithm exploiting robust probabilistic distances, we obtain a taxonomy of musical ensembles which is used to efficiently classify possible combinations of instruments played simultaneously. Moreover, a wide set of acoustic features is studied including some new proposals. In particular, Signal to Mask Ratios are found to be useful features for audio classification. This study focuses on a single music genre (i.e. jazz) but combines a variety of instruments among which are percussion and singing voice. Using a varied database of sound excerpts from commercial recordings, we show that the segmentation of music with respect to the instruments played can be achieved with an average accuracy of 53%.
Towards Instrument Segmentation for Music Content Description: A critical review of instrument classification techniques
International Symposium on Music Information Retrieval, 2000
Cited by 26 (3 self)
In this paper we concentrate on reviewing the different techniques that have so far been proposed for automatic classification of musical instruments. As most of the techniques to be discussed are usable only in "solo" performances, we will evaluate their applicability to the more complex case of describing sound mixes. We conclude this survey by discussing the necessity of developing new strategies for classifying sound mixes without a priori separation of sound sources.
Instrument Sound Description in the Context of MPEG-7, 2000
Cited by 25 (7 self)
We review a proposal made in the framework of MPEG-7 for the description of instrument sounds based on perceptual features. Results
Musical instrument recognition by pairwise classification strategies
IEEE Transactions on Audio, Speech and Language Processing, 2006
Cited by 12 (3 self)
Musical instrument recognition is an important aspect of music information retrieval. In this paper, statistical pattern recognition techniques are utilized to tackle the problem in the context of solo musical phrases. Ten instrument classes from different instrument families are considered. A large sound database is collected from excerpts of musical phrases acquired from commercial recordings translating different instrument instances, performers, and recording conditions. More than 150 signal processing features are studied including new descriptors. Two feature selection techniques, inertia ratio maximization with feature space projection and genetic algorithms, are considered in a class pairwise manner whereby the most relevant features are fetched for each instrument pair. For the classification task, experimental results are provided using Gaussian mixture models (GMMs) and support vector machines (SVMs). It is shown that higher recognition rates can be reached with pairwise optimized subsets of features in association with SVM classification using a radial basis function kernel. Index Terms: feature selection, Gaussian mixture model (GMM), genetic algorithms, inertia ratio maximization with feature space projection (IRMFSP), musical instrument recognition, pairwise classification, support vector machine (SVM).
Envelope Model Of Isolated Musical Sounds
Proceedings of the DAFx, 1999
Cited by 10 (2 self)
This paper presents a model of the envelope of the additive parameters of isolated musical sounds, along with a new method for the estimation of the important envelope split-point times. The model consists of start, attack, sustain, release, and end segments with variable split-point amplitude and time. The estimation of the times is done using smoothed derivatives of the envelopes. The estimated split-point values can be used together with a curve-form model introduced in this paper in the analysis/synthesis of musical sounds. The envelope model can recreate noiseless musical sounds with good fidelity, and the method for the estimation of the envelope times performs significantly better than the classical percentage-based method.
Content-based Transformations
Journal of New Music Research, 2003
Cited by 8 (4 self)
Content processing is a vast and growing field that integrates different approaches borrowed from the signal processing, information retrieval and machine learning disciplines. In this article we deal with a particular type of content processing: the so-called content-based transformations. We will not focus on any particular application but rather try to give an overview of different techniques and conceptual implications. We first describe the transformation process itself, including the main model schemes that are commonly used, which lead to the establishment of the formal basis for a definition of content-based transformations. Then we take a quick look at a general spectral-based analysis/synthesis approach to processing audio signals and at how to extract features that can be used in the content-based transformation context. Using this analysis/synthesis approach we give some examples of how content-based transformations can be applied to modify the basic perceptual axes of a sound and how we can even combine different basic effects in order to perform more meaningful transformations. We finish by going a step further up the abstraction ladder and presenting transformations that are related to musical (and thus symbolic) properties rather than to those of the sound or the signal itself.

