Results 1–10 of 90
Discovering Structural Regularity in 3D Geometry
, 2008
Abstract

Cited by 78 (9 self)
We introduce a computational framework for discovering regular or repeated geometric structures in 3D shapes. We describe and classify possible regular structures and present an effective algorithm for detecting such repeated geometric patterns in point- or mesh-based models. Our method assumes no prior knowledge of the geometry or spatial location of the individual elements that define the pattern. Structure discovery is made possible by a careful analysis of pairwise similarity transformations that reveals prominent lattice structures in a suitable model of transformation space. We introduce an optimization method for detecting such uniform grids specifically designed to deal with outliers and missing elements. This yields a robust algorithm that successfully discovers complex regular structures amidst clutter, noise, and missing geometry. The accuracy of the extracted generating transformations is further improved using a novel simultaneous registration method in the spatial domain. We demonstrate the effectiveness of our algorithm on a variety of examples and show applications to compression, model repair, and geometry synthesis.
Computer Identification of Musical Instruments Using Pattern Recognition With Cepstral Coefficients as Features
, 1997
Abstract

Cited by 57 (0 self)
Cepstral coefficients based on a constant Q transform have been calculated for 28 short (12 s) oboe sounds and 52 short saxophone sounds. These were used as features in a pattern analysis to determine, for each sound in the test set, whether it belongs to the oboe or the sax class. The training set consisted of longer sounds of 1 minute or more for each of the instruments. A k-means algorithm was used to calculate clusters for the training data, and Gaussian probability density functions were formed from the mean and variance of each of the clusters. Each member of the test set was then analyzed to determine the probability that it belonged to each of the two classes, and a Bayes decision rule was invoked to assign it to one of the classes. Results have been extremely good and are compared to a human perception experiment identifying a subset of these same sounds.
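The pipeline this abstract describes (k-means clustering of training features, a Gaussian per cluster, a Bayes decision over class likelihoods) can be sketched on synthetic 2-D features; the feature values and class labels below are invented stand-ins for the real cepstral coefficients.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means: random initial centres, then alternate assignment
    and mean updates."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

# synthetic stand-ins for the two instrument classes
rng = np.random.default_rng(1)
train = [rng.normal([0, 0], 0.3, (100, 2)),   # "oboe-like" features
         rng.normal([3, 3], 0.3, (100, 2))]   # "sax-like" features

models = []
for X in train:
    _, labels = kmeans(X, k=2)
    models.append([(X[labels == j].mean(0), X[labels == j].var(0) + 1e-6)
                   for j in range(2) if np.any(labels == j)])

def classify(x):
    """Bayes decision with equal priors: pick the class whose best
    cluster explains x with the highest likelihood."""
    scores = [max(gaussian_logpdf(x, m, v) for m, v in params)
              for params in models]
    return int(np.argmax(scores))
```

With equal priors the Bayes rule reduces to a maximum-likelihood decision over the per-class cluster Gaussians, which is the structure the abstract outlines.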
Automatic Chord Recognition from Audio Using an HMM with Supervised Learning
 In Proc. ISMIR
, 2006
Abstract

Cited by 27 (4 self)
A novel approach for obtaining labeled training data is presented to directly estimate the model parameters in a supervised learning algorithm for automatic chord recognition from the raw audio. To this end, harmonic analysis is first performed on symbolic data to generate label files. In parallel, we synthesize audio data from the same symbolic data, which are then provided to a machine learning algorithm along with the label files to estimate model parameters. Experimental results show higher performance in frame-level chord recognition than previous approaches.
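The supervised estimation step can be illustrated for the transition part of an HMM: once labeled chord sequences are available, the transition matrix is just normalised bigram counts. A minimal sketch with hypothetical 3-class chord labels:

```python
import numpy as np

def estimate_transitions(label_seqs, n_states):
    """Supervised HMM training step: count state-to-state transitions
    in labeled sequences, with add-one smoothing, then normalise rows."""
    counts = np.ones((n_states, n_states))
    for seq in label_seqs:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# toy label sequences over 3 hypothetical chord classes
seqs = [[0, 0, 0, 1, 1, 2, 0], [0, 1, 1, 2, 2, 0]]
A = estimate_transitions(seqs, n_states=3)
```

Each row of `A` is a proper distribution over next chords, and self-transitions dominate where the labels stay on the same chord, as expected for frame-level data.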
Automatic Chord Transcription from Audio Using Computational Models of Musical Context
, 2010
Abstract

Cited by 23 (11 self)
I certify that this thesis, and the research to which it refers, are the product of my own work, and that any ideas or quotations from the work of other people, published or otherwise, are fully acknowledged in accordance with the standard referencing practices of the discipline. I acknowledge the helpful guidance and support of my supervisor, Dr Simon Dixon.

This thesis is concerned with the automatic transcription of chords from audio, with an emphasis on modern popular music. Musical context, such as the key and the structural segmentation, aids the interpretation of chords in human listeners. In this thesis we propose computational models that integrate such musical context into the automatic chord estimation process. We present a novel dynamic Bayesian network (DBN) which integrates models of metric position, key, chord, bass note and two beat-synchronous audio features (bass and treble chroma) into a single high-level musical context model. We simultaneously infer the most probable sequence of metric positions, keys, chords and bass notes via Viterbi inference. Several experiments with real-world data show that adding context parameters results in a significant increase in chord recognition accuracy and faithfulness of chord segmentation. The proposed ...
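The Viterbi inference mentioned in the abstract reduces, in the plain-HMM case, to the standard dynamic program below; the thesis's DBN has a much richer state space (metric position, key, chord, bass), but the decoding principle is the same. A sketch with made-up numbers:

```python
import numpy as np

def viterbi(log_trans, log_emit, log_init):
    """Most probable state path; log_emit has shape (T, S)."""
    T, S = log_emit.shape
    delta = log_init + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans       # scores[i, j]: i -> j
        back[t] = np.argmax(scores, axis=0)       # best predecessor of j
        delta = scores[back[t], np.arange(S)] + log_emit[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):                 # backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# sticky 2-state toy: observations favour state 0 twice, then state 1 twice
log_trans = np.log([[0.9, 0.1], [0.1, 0.9]])
log_emit = np.log([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
log_init = np.log([0.5, 0.5])
path = viterbi(log_trans, log_emit, log_init)
```

The sticky transition prior smooths the decoded sequence, which is exactly why such models improve the faithfulness of chord segmentation over frame-by-frame decisions.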
Meter and Periodicity in Musical Performance
 Journal of New Music Research
, 2001
Abstract

Cited by 21 (3 self)
This paper presents a psychoacoustically based method of data reduction motivated by the desire to analyze the rhythm of musical performances. The resulting information is then analyzed by the “Periodicity Transform” (which is based on a projection onto “periodic subspaces”) to locate periodicities in the data. These periodicities represent the rhythm at several levels, including the “pulse”, the “measure”, and larger structures such as musical “phrases”. The implications (and limitations) of such automated grouping of rhythmic features are discussed. The method is applied to a number of musical examples, its output is compared to that of the Fourier Transform, and both are compared to a more traditional “musical” analysis of the rhythm. Unlike many methods of rhythm analysis, the techniques can be applied directly to the digitized performance (i.e., a soundfile) and do not require a musical score or a MIDI transcription. Several examples are presented that highlight both the strengths and weaknesses of the approach.
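The projection onto a “periodic subspace” has a particularly simple form: the orthogonal projection of a signal onto the space of p-periodic signals averages the samples in each residue class modulo p. A minimal sketch (the signal here is synthetic, not one of the paper's performances):

```python
import numpy as np

def project_periodic(x, p):
    """Orthogonal projection of x onto the subspace of p-periodic
    signals: average the samples in each residue class mod p, then tile."""
    n = len(x) - len(x) % p          # truncate to a whole number of periods
    frames = x[:n].reshape(-1, p)
    return np.tile(frames.mean(axis=0), n // p)

# a period-4 signal plus a small aperiodic perturbation
x = np.tile([1.0, 0.0, -1.0, 0.0], 8) + 0.1 * np.sin(np.arange(32))
proj4 = project_periodic(x, 4)
proj3 = project_periodic(x, 3)
```

The energy of the projection measures how much of the signal lives at that period: here the period-4 projection captures nearly all the energy, while the period-3 projection captures almost none, which is how the transform ranks candidate pulse and measure lengths.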
Sparse and shift-invariant feature extraction from non-negative data
, 2008
Abstract

Cited by 20 (4 self)
In this paper we describe a technique that allows the extraction of multiple local shift-invariant features from non-negative data of arbitrary dimensionality. Our approach employs a probabilistic latent variable model with sparsity constraints. We demonstrate its utility by performing feature extraction in a variety of domains ranging from audio to images and video.
Shift-invariant probabilistic latent component analysis
 Journal of Machine Learning Research (under review)
, 2008
Abstract

Cited by 19 (5 self)
In this paper we present a model which can decompose probability densities or count data into a set of shift-invariant components. We begin by introducing a regular latent variable model and subsequently extend it to deal with shift invariance in order to model more complex inputs. We develop an expectation-maximization algorithm for estimating the components and present various results on challenging real-world data. We show that this approach is a probabilistic generalization of well-known algorithms such as Non-Negative Matrix Factorization and multi-way decompositions, and discuss its advantages over such approaches.
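The "regular latent variable model" that the abstract starts from can be fitted with a few lines of EM. The sketch below decomposes a two-dimensional distribution as V[x, y] ≈ Σ_z p(z) p(x|z) p(y|z); it is only the non-shift-invariant starting point, not the full shift-invariant model of the paper.

```python
import numpy as np

def plca(V, K, iters=200, seed=0):
    """EM for a basic latent component model:
    V[x, y] ~ sum_z p(z) p(x|z) p(y|z)."""
    rng = np.random.default_rng(seed)
    X, Y = V.shape
    V = V / V.sum()
    pz = np.full(K, 1.0 / K)
    px = rng.random((X, K)); px /= px.sum(0)
    py = rng.random((Y, K)); py /= py.sum(0)
    for _ in range(iters):
        # E-step: posterior p(z | x, y)
        joint = px[:, None, :] * py[None, :, :] * pz      # (X, Y, K)
        post = joint / np.maximum(joint.sum(-1, keepdims=True), 1e-12)
        # M-step: re-estimate factors from expected counts
        w = V[:, :, None] * post
        pz = w.sum((0, 1))
        px = w.sum(1) / np.maximum(pz, 1e-12)
        py = w.sum(0) / np.maximum(pz, 1e-12)
    return pz, px, py

# block-diagonal toy distribution built from two separable components
V = np.zeros((4, 4))
V[:2, :2] = 0.125
V[2:, 2:] = 0.125
pz, px, py = plca(V, K=2)
R = (px * pz) @ py.T        # reconstructed distribution
```

Up to a rescaling this is the same multiplicative fixed point as NMF with a KL objective, which is the probabilistic-generalization claim the abstract makes.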
A New Method for Tracking Modulations in Tonal Music in Audio Data Format
 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS
, 2000
Abstract

Cited by 17 (8 self)
CQ-profiles are 12-dimensional vectors, each component referring to a pitch class. They can be employed to represent keys. CQ-profiles are calculated with the constant Q filter bank [4]. They have the following advantages: (i) they correspond to probe tone ratings; (ii) calculation is possible in real-time; (iii) stability is obtained with respect to sound quality; (iv) they are transposable. By using the CQ-profile technique as a simple auditory model in combination with the SOM [11], an arrangement of keys emerges that resembles results from psychological experiments [13] and from music theory [1]. CQ-profiles are reliably applied to modulation tracking by introducing a special distance measure.
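Matching a 12-bin pitch-class profile against rotated key templates can be sketched as follows. Note that the template below is a hand-made toy weighting invented for this example, not the probe-tone-derived profiles or the special distance measure the paper uses; it only illustrates the transposability of 12-dimensional profiles.

```python
import numpy as np

# toy 12-bin major-key template: tonic weighted heaviest, then the
# fifth and third, then the remaining scale degrees (hypothetical values)
MAJOR_TEMPLATE = np.array([3, 0, 1, 0, 2, 1, 0, 2, 0, 1, 0, 1], float)

def key_estimate(chroma):
    """Match a 12-bin pitch-class profile against the template rotated
    to every tonic; return the best tonic (0 = C, 1 = C#, ...)."""
    scores = [np.dot(chroma, np.roll(MAJOR_TEMPLATE, k)) for k in range(12)]
    return int(np.argmax(scores))

# toy profile with energy on C, E and G (a C major triad)
chroma = np.zeros(12)
chroma[[0, 4, 7]] = [1.0, 0.8, 0.9]
```

Because the template is simply rotated, transposing the input profile by k semitones shifts the estimated tonic by k, which is the transposability property listed above.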
Sound Source Separation using Shifted Nonnegative Tensor Factorisation
 In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
, 2006
Abstract

Cited by 12 (0 self)
Recently, shifted Nonnegative Matrix Factorisation was developed as a means of separating harmonic instruments from single channel mixtures. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end, a shifted Nonnegative Tensor Factorisation algorithm is derived, which extends shifted Nonnegative Matrix Factorisation to the multichannel case. The use of this algorithm for multichannel sound source separation of harmonic instruments is demonstrated. Further, it is shown that the algorithm can be used to perform Nonnegative Tensor Deconvolution, a multichannel version of Nonnegative Matrix Deconvolution, to separate sound sources which have time evolving spectra from multichannel signals.
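The non-shifted building block, standard Nonnegative Matrix Factorisation with Lee-Seung multiplicative updates (Euclidean cost), can be sketched as follows; the shifted and tensor variants in the paper extend this with translation operators and extra channel modes, which this sketch does not attempt.

```python
import numpy as np

def nmf(V, r, iters=300, seed=0):
    """Lee-Seung multiplicative updates minimising ||V - W H||_F.
    All factors stay non-negative because updates are multiplicative."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + 0.1
    H = rng.random((r, n)) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / np.maximum(W.T @ W @ H, 1e-12)
        W *= (V @ H.T) / np.maximum(W @ H @ H.T, 1e-12)
    return W, H

# exact rank-2 non-negative matrix (stand-in for a magnitude spectrogram)
V = np.array([[1., 2., 3., 0., 0.],
              [1., 2., 3., 0., 0.],
              [0., 0., 1., 1., 1.],
              [0., 0., 1., 1., 1.]])
W, H = nmf(V, r=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In audio separation, columns of W act as spectral basis vectors and rows of H as their activations over time; the shifted variant additionally shares one basis vector across pitch translations.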