Results 1-10 of 76
Sound-Source Recognition: A Theory and Computational Model, 1999.
Cited by 96.
Abstract: The ability of a normal human listener to recognize objects in the environment from only the sounds they produce is extraordinarily robust with regard to characteristics of the acoustic environment and of other competing sound sources. In contrast, computer systems designed to recognize sound sources function precariously, breaking down whenever the target sound is degraded by reverberation, noise, or competing sounds. Robust listening requires extensive contextual knowledge, but the potential contribution of sound-source recognition to the process of auditory scene analysis has largely been neglected by researchers building computational models of the scene analysis process. This thesis proposes a theory of sound-source recognition, casting recognition as a process of gathering information to enable the listener to make inferences about ...
Structured Audio: Creation, Transmission, and Rendering of Parametric Sound Representations. Proc. IEEE, 1998.
Computational Auditory Scene Recognition. In IEEE Int'l Conf. on Acoustics, Speech, and Signal Processing, 2001.
Instrument recognition in polyphonic music based on automatic taxonomies. IEEE Transactions on Speech and Audio Processing, 2006.
Cited by 63.
Abstract: We propose a new approach to instrument recognition in the context of real music orchestrations ranging from solos to quartets. The strength of our approach is that it does not require prior musical source separation. Thanks to a hierarchical clustering algorithm exploiting robust probabilistic distances, we obtain a taxonomy of musical ensembles that is used to efficiently classify possible combinations of instruments played simultaneously. Moreover, a wide set of acoustic features is studied, including some new proposals. In particular, signal-to-mask ratios are found to be useful features for audio classification. This study focuses on a single music genre (jazz) but combines a variety of instruments, including percussion and singing voice. Using a varied database of sound excerpts from commercial recordings, we show that the segmentation of music with respect to the instruments played can be achieved with an average accuracy of 53%.
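The taxonomy-building step in the abstract above can be illustrated with a toy agglomerative clustering over a pairwise distance matrix. The instrument names and distances below are invented placeholders, and single-linkage over plain numeric distances stands in for the paper's robust probabilistic distances between class models:

```python
def agglomerate(names, D):
    """Single-linkage agglomerative clustering over a symmetric distance
    matrix D (list of lists), returning the merge order. A toy stand-in
    for building an instrument-ensemble taxonomy from class distances."""
    clusters = [{i} for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between clusters is the
                # minimum pairwise distance across their members
                d = min(D[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(names[i] for i in clusters[a]),
                       sorted(names[i] for i in clusters[b]), d))
        clusters[a] |= clusters[b]
        del clusters[b]
    return merges
```

With four hypothetical classes where piano/guitar and trumpet/sax are mutually close, the first two merges group those pairs before joining the two families, yielding a small two-level taxonomy.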
Instrument-specific harmonic atoms for mid-level music representation. IEEE Trans. on Audio, Speech and Lang. Proc., 2008.
Cited by 39.
Abstract: Several studies have pointed out the need for accurate mid-level representations of music signals for information retrieval and signal processing purposes. In this paper, we propose a new mid-level representation based on the decomposition of a signal into a small number of sound atoms or molecules bearing explicit musical instrument labels. Each atom is a sum of windowed harmonic sinusoidal partials whose relative amplitudes are specific to one instrument, and each molecule consists of several atoms from the same instrument spanning successive time windows. We design efficient algorithms to extract the most prominent atoms or molecules and investigate several applications of this representation, including polyphonic instrument recognition and music visualization. Index terms: mid-level representation, music information retrieval, music visualization, sparse decomposition.
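As a rough sketch of what one such atom looks like, the snippet below synthesizes a Hann-windowed sum of harmonic partials at integer multiples of a fundamental. The relative partial amplitudes are the instrument-specific part of the model; the values used here are illustrative, not taken from the paper:

```python
import numpy as np

def harmonic_atom(f0, amps, sr=16000, dur=0.064):
    """One windowed harmonic atom: a Hann-windowed, unit-norm sum of
    sinusoidal partials at multiples of f0. `amps` holds the relative
    partial amplitudes (the instrument-specific spectral envelope)."""
    t = np.arange(int(sr * dur)) / sr
    atom = sum(a * np.sin(2 * np.pi * (h + 1) * f0 * t)
               for h, a in enumerate(amps))
    atom *= np.hanning(len(t))          # window to localize in time
    norm = np.linalg.norm(atom)
    return atom / norm if norm > 0 else atom

# Illustrative amplitudes; a "molecule" would chain such atoms
# from the same instrument across successive time windows.
atom = harmonic_atom(440.0, [1.0, 0.5, 0.25])
```

A sparse decomposition then expresses the input signal as a weighted sum of a few such labeled atoms, which is what makes the representation readable in terms of instruments.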
Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps, 2007.
Cited by 33.
Abstract: We provide a new solution to the problem of feature variations caused by overlapping sounds in instrument identification in polyphonic music. When multiple instruments play simultaneously, partials (harmonic components) of their sounds overlap and interfere, which makes the acoustic features differ from those of monophonic sounds. To cope with this, we weight features according to how much they are affected by overlapping. First, we quantitatively evaluate the influence of overlapping on each feature as the ratio of the within-class variance to the between-class variance in the distribution of training data obtained from polyphonic sounds. Then, we generate feature axes using a weighted mixture that minimizes this influence via linear discriminant analysis. In addition, we improve instrument identification by exploiting musical context. Experimental results showed that the recognition rates using both feature weighting and musical context were 84.1% for duo, 77.6% for trio, and 72.3% for quartet; without either, they were 53.4%, 49.6%, and 46.5%, respectively.
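The first step described above, scoring each feature by its within-class to between-class variance ratio, can be sketched directly. This is only that scoring step, not the subsequent LDA projection, and the data in the test are fabricated:

```python
import numpy as np

def variance_ratio(X, y):
    """Per-feature ratio of within-class to between-class variance.
    Lower values mean the feature still separates instrument classes
    well despite overlap noise; higher values mean it is unreliable."""
    classes = np.unique(y)
    grand = X.mean(axis=0)
    within = np.zeros(X.shape[1])
    between = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        within += ((Xc - mu) ** 2).sum(axis=0)
        between += len(Xc) * (mu - grand) ** 2
    return within / between
```

In the paper's pipeline these ratios drive the weights fed into linear discriminant analysis, so that feature axes dominated by overlap-induced variance contribute less.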
Exploration of Techniques for Automatic Labeling of Audio Drum Tracks' Instruments, 2001.
Cited by 31.
Abstract: We report on current work on the automatic recognition of percussive instruments embedded in audio excerpts of drum-set performances. Content-based transformation of audio drum tracks and loops requires identifying the instruments played in the sound file. Several supervised and unsupervised techniques are examined in this paper, and classification results for a small number of classes are discussed. To cope with the problem of classifying percussive events embedded in continuous audio streams, we rely on a method that automatically adapts the analysis frame size to the smallest metrical pulse, called the "tick". The success rate with some of the explored techniques has been quite good (around 80%), but enhancements are still needed to classify sounds accurately under real application conditions.
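A minimal sketch of the tick-adaptation idea, assuming onset times have already been detected: estimate the smallest metrical pulse as (approximately) the greatest common divisor of inter-onset intervals on a millisecond grid, then use that as the analysis frame length. This is a deliberate simplification, not the paper's actual estimator:

```python
import numpy as np

def estimate_tick(onsets):
    """Estimate the smallest metrical pulse ("tick") in seconds from
    onset times, as the GCD of inter-onset intervals quantized to a
    1 ms grid (a hypothetical simplification of tick induction)."""
    iois = np.diff(np.sort(np.asarray(onsets)))
    ms = np.round(iois * 1000).astype(int)
    tick = ms[0]
    for v in ms[1:]:
        tick = np.gcd(tick, v)
    return tick / 1000.0

# Onsets at 0, 0.25, 0.75 and 1.0 s imply a 0.25 s tick, so each
# analysis frame would span one sixteenth-note-like pulse.
tick = estimate_tick([0.0, 0.25, 0.75, 1.0])
```

Quantizing to a grid and taking a GCD is brittle against timing jitter; any practical system would need tolerance around each interval, which is exactly why this is labeled a sketch.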
Musical instrument recognition by pairwise classification strategies. IEEE Transactions on Audio, Speech and Language Processing, 2006.
Cited by 30.
Abstract: Musical instrument recognition is an important aspect of music information retrieval. In this paper, statistical pattern recognition techniques are used to tackle the problem in the context of solo musical phrases. Ten instrument classes from different instrument families are considered. A large sound database is collected from excerpts of musical phrases taken from commercial recordings, covering different instrument instances, performers, and recording conditions. More than 150 signal processing features are studied, including new descriptors. Two feature selection techniques, inertia ratio maximization with feature space projection (IRMFSP) and genetic algorithms, are applied in a class-pairwise manner whereby the most relevant features are selected for each instrument pair. For the classification task, experimental results are provided using Gaussian mixture models (GMMs) and support vector machines (SVMs). It is shown that higher recognition rates can be reached with pairwise-optimized subsets of features combined with SVM classification using a radial basis function kernel. Index terms: feature selection, Gaussian mixture model (GMM), genetic algorithms, inertia ratio maximization with feature space projection (IRMFSP), musical instrument recognition, pairwise classification, support vector machine (SVM).
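The class-pairwise strategy can be sketched with a toy one-vs-one voting scheme. A nearest-centroid rule stands in here for the paper's per-pair RBF-kernel SVMs over pair-optimized feature subsets, and all names and data are placeholders:

```python
import numpy as np

def pairwise_predict(X_train, y_train, x, classes):
    """One-vs-one voting: for every pair of classes, a tiny binary
    classifier (nearest centroid here, an SVM in the paper) casts a
    vote; the class with the most votes wins. Ties favor the class
    listed first."""
    votes = {c: 0 for c in classes}
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            ca = X_train[y_train == a].mean(axis=0)
            cb = X_train[y_train == b].mean(axis=0)
            winner = a if np.linalg.norm(x - ca) <= np.linalg.norm(x - cb) else b
            votes[winner] += 1
    return max(votes, key=votes.get)
```

The structural point the paper exploits is that each binary subproblem can use its own feature subset, which a single multi-class classifier cannot do; this sketch keeps one shared feature space for brevity.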
Mosievius: Feature Driven Interactive Audio Mosaicing, 2003.
Cited by 24.
Abstract: Creating an audio mosaic consists of concatenating segments of sound. Segments are chosen to best match a description of a target sound, specified by the desired features of the final mosaic. Current audio mosaicing techniques take advantage of the description of future target units to make more intelligent decisions when choosing individual segments. In this paper, we investigate ways to extend mosaicing techniques so that the mosaicing process can serve as an interactive means of musical expression in real time.
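The segment-selection step described above can be sketched as a greedy nearest-neighbor search in feature space. Practical mosaicers also weigh continuity between consecutive segments and, as the abstract notes, may look ahead at future target units; this toy version omits both:

```python
import numpy as np

def pick_segments(target_feats, corpus_feats):
    """For each target frame's feature vector, pick the index of the
    corpus segment whose features are nearest (Euclidean distance).
    A greedy, context-free stand-in for full unit selection."""
    idx = []
    for tf in target_feats:
        d = np.linalg.norm(corpus_feats - tf, axis=1)
        idx.append(int(np.argmin(d)))
    return idx

# Feature vectors here are arbitrary 2-D placeholders; in practice
# they would be descriptors such as loudness, pitch, or spectral shape.
choice = pick_segments(np.array([[0.9, 1.1], [2.1, 1.9]]),
                       np.array([[0., 0.], [1., 1.], [2., 2.]]))
```

Adding a transition cost between consecutively chosen segments turns this into a shortest-path (Viterbi-style) selection, which is the usual way concatenative synthesis balances target match against smoothness.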