Results 1-10 of 44
Computer Identification of Musical Instruments Using Pattern Recognition With Cepstral Coefficients as Features
, 1997
Cited by 47 (0 self)
Cepstral coefficients based on a constant Q transform have been calculated for 28 short (1-2 s) oboe sounds and 52 short saxophone sounds. These were used as features in a pattern analysis to determine for each of these sounds comprising the test set whether it belongs to the oboe or to the sax class. The training set consisted of longer sounds of 1 minute or more for each of the instruments. A k-means algorithm was used to calculate clusters for the training data, and Gaussian probability density functions were formed from the mean and variance of each of the clusters. Each member of the test set was then analyzed to determine the probability that it belonged to each of the two classes; and a Bayes decision rule was invoked to assign it to one of the classes. Results have been extremely good and are compared to a human perception experiment identifying a subset of these same sounds.
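The pipeline in this abstract (k-means clusters, one Gaussian per cluster, Bayes decision) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses hypothetical one-dimensional features in place of constant-Q cepstral coefficient vectors, a deterministic k-means initialisation, equal priors, and invented toy numbers.

```python
import math

def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means; initial centres are picked deterministically from the sorted data."""
    data = sorted(values)
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # Empty clusters keep their previous centre.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups

def fit_class(values, k):
    """Return one (weight, mean, variance) triple per non-empty cluster of a class."""
    model = []
    for g in kmeans_1d(values, k):
        if not g:
            continue
        mean = sum(g) / len(g)
        var = sum((v - mean) ** 2 for v in g) / len(g) + 1e-6  # small variance floor (assumption)
        model.append((len(g) / len(values), mean, var))
    return model

def log_likelihood(frames, model):
    """Sum of log densities of the frames under the weighted Gaussians of one class."""
    total = 0.0
    for x in frames:
        p = sum(w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in model)
        total += math.log(p + 1e-300)  # guard against log(0)
    return total

def classify(frames, models, priors):
    """Bayes decision rule: pick the class with the largest log posterior."""
    return max(models, key=lambda c: math.log(priors[c]) + log_likelihood(frames, models[c]))

# Hypothetical toy training features for the two classes.
models = {
    "oboe": fit_class([1.0, 1.1, 0.9, 3.0, 3.1, 2.9], k=2),
    "sax": fit_class([6.0, 6.1, 5.9, 8.0, 8.2, 7.8], k=2),
}
priors = {"oboe": 0.5, "sax": 0.5}
label_a = classify([1.05, 2.95], models, priors)
label_b = classify([6.05, 7.9], models, priors)
```

Summing log densities over the frames of a test sound treats the frames as independent, which is a simplifying choice of this sketch.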
Discovering Structural Regularity in 3D Geometry
, 2008
Cited by 42 (9 self)
We introduce a computational framework for discovering regular or repeated geometric structures in 3D shapes. We describe and classify possible regular structures and present an effective algorithm for detecting such repeated geometric patterns in point- or mesh-based models. Our method assumes no prior knowledge of the geometry or spatial location of the individual elements that define the pattern. Structure discovery is made possible by a careful analysis of pairwise similarity transformations that reveals prominent lattice structures in a suitable model of transformation space. We introduce an optimization method for detecting such uniform grids specifically designed to deal with outliers and missing elements. This yields a robust algorithm that successfully discovers complex regular structures amidst clutter, noise, and missing geometry. The accuracy of the extracted generating transformations is further improved using a novel simultaneous registration method in the spatial domain. We demonstrate the effectiveness of our algorithm on a variety of examples and show applications to compression, model repair, and geometry synthesis.
Meter and Periodicity in Musical Performance
- Journal of New Music Research
, 2001
Cited by 16 (1 self)
This paper presents a psychoacoustically based method of data reduction motivated by the desire to analyze the rhythm of musical performances. The resulting information is then analyzed by the “Periodicity Transform” (which is based on a projection onto “periodic subspaces”) to locate periodicities in the resulting data. These periodicities represent the rhythm at several levels, including the “pulse”, the “measure”, and larger structures such as musical “phrases.” The implications (and limitations) of such automated grouping of rhythmic features are discussed. The method is applied to a number of musical examples, its output is compared to that of the Fourier Transform, and both are compared to a more traditional “musical” analysis of the rhythm. Unlike many methods of rhythm analysis, the techniques can be applied directly to the digitized performance (i.e., a soundfile) and do not require a musical score or a MIDI transcription. Several examples are presented that highlight both the strengths and weaknesses of the approach.
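The core operation named in this abstract, projection onto a periodic subspace, can be illustrated with a short sketch. It assumes the period p divides the signal length, in which case the orthogonal projection onto the p-periodic sequences is the average of all samples that share the same phase modulo p. The function names and the toy pulse train are illustrative, and the sketch covers neither the psychoacoustic data reduction nor the paper's procedure for choosing periods.

```python
def project_onto_period(signal, p):
    """Project a sequence onto the p-periodic subspace by averaging samples with equal phase."""
    n = len(signal)
    if p <= 0 or n % p != 0:
        raise ValueError("this sketch assumes p divides len(signal)")
    means = [sum(signal[i::p]) / (n // p) for i in range(p)]
    return [means[i % p] for i in range(n)]

def energy_fraction(signal, p):
    """Fraction of the signal energy captured by its p-periodic projection."""
    projection = project_onto_period(signal, p)
    total = sum(x * x for x in signal)
    return sum(y * y for y in projection) / total if total else 0.0

# Hypothetical pulse train with one accent every three samples.
beat = [1.0, 0.0, 0.0] * 4
fractions = {p: energy_fraction(beat, p) for p in (2, 3, 4)}
```

For this pulse train the period-3 projection reproduces the input exactly, while periods 2 and 4 capture only its mean level, so comparing the fractions singles out 3 as the dominant period.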
A New Method for Tracking Modulations in Tonal Music in Audio Data Format
- International Joint Conference on Neural Networks
, 2000
Cited by 15 (7 self)
Cq-profiles are 12-dimensional vectors, each component referring to a pitch class. They can be employed to represent keys. Cq-profiles are calculated with the constant Q filter bank [4]. They have the following advantages: (i) They correspond to probe tone ratings. (ii) Calculation is possible in real-time. (iii) Stability is obtained with respect to sound quality. (iv) They are transposable. By using the cq-profile technique as a simple auditory model in combination with the SOM [11], an arrangement of keys emerges that resembles results from psychological experiments [13] and from music theory [1]. Cq-profiles are reliably applied to modulation tracking by introducing a special distance measure.
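As a rough illustration of a 12-component pitch-class vector and of property (iv), the sketch below folds constant-Q bin magnitudes into pitch classes and transposes a profile by rotating it. The bin layout (bin 0 tuned to pitch class 0, a whole number of bins per semitone), the normalisation, and the toy magnitudes are assumptions made here. The abstract does not specify the paper's distance measure, so none is shown.

```python
def fold_to_pitch_classes(magnitudes, bins_per_octave=12):
    """Sum constant-Q bin magnitudes into a normalised 12-component pitch-class profile."""
    if bins_per_octave % 12 != 0:
        raise ValueError("this sketch assumes a whole number of bins per semitone")
    per_semitone = bins_per_octave // 12
    profile = [0.0] * 12
    for k, magnitude in enumerate(magnitudes):
        profile[(k // per_semitone) % 12] += magnitude
    total = sum(profile)
    return [p / total for p in profile] if total else profile

def transpose(profile, semitones):
    """Rotate a profile so that pitch class i moves to (i + semitones) % 12."""
    return [profile[(i - semitones) % 12] for i in range(12)]

# Hypothetical magnitudes over two octaves with energy on pitch classes 0, 4 and 7.
magnitudes = [0.0] * 24
for index, value in ((0, 2.0), (4, 1.0), (7, 1.0), (12, 2.0), (16, 1.0), (19, 1.0)):
    magnitudes[index] = value

profile = fold_to_pitch_classes(magnitudes)
shifted = transpose(profile, 2)  # the same pattern two semitones higher
```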
Automatic Chord Recognition from Audio Using an HMM with Supervised Learning
- In Proc. ISMIR
, 2006
Cited by 14 (3 self)
A novel approach for obtaining labeled training data is presented to directly estimate the model parameters in a supervised learning algorithm for automatic chord recognition from the raw audio. To this end, harmonic analysis is first performed on symbolic data to generate label files. In parallel, we synthesize audio data from the same symbolic data, which are then provided to a machine learning algorithm along with label files to estimate model parameters. Experimental results show higher performance in frame-level chord recognition than the previous approaches.
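When frame-level labels are available, the transition probabilities of an HMM can be estimated directly by counting label bigrams and normalising each row. The sketch below shows only that step, on a hypothetical label sequence. It is not the paper's system: emission (observation) parameters, smoothing of unseen transitions, and the generation of labels from symbolic data are all left out.

```python
from collections import Counter, defaultdict

def estimate_transitions(labels):
    """Maximum-likelihood transition probabilities from one labelled frame sequence."""
    counts = defaultdict(Counter)
    for previous, following in zip(labels, labels[1:]):
        counts[previous][following] += 1
    return {
        previous: {following: c / sum(row.values()) for following, c in row.items()}
        for previous, row in counts.items()
    }

# Hypothetical frame-level chord labels.
frame_labels = ["C", "C", "G", "G", "C", "F", "F", "C"]
transitions = estimate_transitions(frame_labels)
```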
Real-Time Fundamental Frequency Estimation by Least-Square Fitting
- IEEE Transactions on Speech and Audio Processing
, 1995
Cited by 9 (0 self)
The real-time performance of a fundamental frequency estimation algorithm depends not only on its computational efficiency but also on its ability to obtain accurate estimates from short signal segments. Previous frequency-domain algorithms make use of spectral analysis algorithms that require the application of a window function, which causes them to fail when signal segments are short and their fundamental frequencies are low. A new spectral analysis algorithm based on least-square fitting, which does not require the application of a window function, is introduced. This algorithm operates by minimizing the square error of fitting a sinusoid to the signal segment. Special properties of the shape of the error function allow the spectrum of the signal segment to be deduced from it and the algorithm to be implemented efficiently. The proofs of these properties are given. A fundamental frequency estimation algorithm based on this spectral analysis algorithm is then described. Its computati...
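The idea of fitting a sinusoid by least squares without a window function can be sketched with a brute-force search: for each candidate frequency, solve the 2x2 normal equations for the best a*cos + b*sin fit and keep the frequency with the smallest residual. This is an illustration under assumptions (a single sinusoid, a fixed candidate grid, invented sample rate and segment length). It does not reproduce the efficient algorithm or the error-function properties the abstract refers to.

```python
import math

def fit_error(signal, omega):
    """Squared residual of the best least-squares fit a*cos(omega*n) + b*sin(omega*n)."""
    c = [math.cos(omega * n) for n in range(len(signal))]
    s = [math.sin(omega * n) for n in range(len(signal))]
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    xc = sum(x * v for x, v in zip(signal, c))
    xs = sum(x * v for x, v in zip(signal, s))
    det = cc * ss - cs * cs  # zero only for degenerate omega such as 0 or pi
    a = (xc * ss - xs * cs) / det
    b = (xs * cc - xc * cs) / det
    return sum((x - a * ci - b * si) ** 2 for x, ci, si in zip(signal, c, s))

def estimate_frequency(signal, sample_rate, candidates_hz):
    """Return the candidate frequency whose fitted sinusoid leaves the smallest error."""
    return min(candidates_hz,
               key=lambda f: fit_error(signal, 2 * math.pi * f / sample_rate))

# Hypothetical test segment: 60 samples (7.5 ms) of a 100 Hz sinusoid at 8 kHz,
# which is shorter than one full period of the tone.
sample_rate = 8000
segment = [math.sin(2 * math.pi * 100 * n / sample_rate + 0.3) for n in range(60)]
estimate = estimate_frequency(segment, sample_rate, range(50, 401, 10))
```

Because amplitude and phase are solved in closed form for every candidate, only the frequency has to be searched, and no window is applied to the segment.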
Multiphonic Note Identification
- Australian Computer Science Communications
, 1996
Cited by 9 (4 self)
This paper describes work in progress which addresses several aspects of perception of sound by a computer. This work forms part of a project to build an automatic music transcription system. In recent years there have been considerable advances in the area of note identification (frequency tracking), to the extent that there are commercial systems which perform well on the task of monophonic note identification. The problem of identifying multiple simultaneous notes remains mostly unsolved, due to the difficulty of separating the partials into their component notes. We present an approach to multiphonic note identification, drawing on research in speech recognition and psychoacoustics. The acoustic data is processed according to a model of human auditory perception and dynamic models of the sources. Keywords: music perception, note identification, automatic transcription, frequency tracking. 1 Introduction and Background. Devices for analysing acoustic signals have been available for...
Sparse and shift-invariant feature extraction from non-negative data
, 2008
Cited by 9 (2 self)
In this paper we describe a technique that allows the extraction of multiple local shift-invariant features from analysis of non-negative data of arbitrary dimensionality. Our approach employs a probabilistic latent variable model with sparsity constraints. We demonstrate its utility by performing feature extraction in a variety of domains ranging from audio to images and video. Index Terms: feature extraction, unsupervised learning.
Content-based Transformations
- Journal of New Music Research
, 2003
Cited by 8 (4 self)
Content processing is a vast and growing field that integrates different approaches borrowed from the signal processing, information retrieval and machine learning disciplines. In this article we deal with a particular type of content processing: the so-called content-based transformations. We will not focus on any particular application but rather try to give an overview of different techniques and conceptual implications. We first describe the transformation process itself, including the main model schemes that are commonly used, which lead to the establishment of the formal basis for a definition of content-based transformations. Then we take a quick look at a general spectral-based analysis/synthesis approach to process audio signals and how to extract features that can be used in the content-based transformation context. Using this analysis/synthesis approach we give some examples of how content-based transformations can be applied to modify the basic perceptual axes of a sound and how we can even combine different basic effects in order to perform more meaningful transformations. We finish by going a step further up the abstraction ladder and present transformations that are related to musical (and thus symbolic) properties rather than to those of the sound or the signal itself.

