Discovering Structural Regularity in 3D Geometry
2008
Cited by 82 (10 self)
We introduce a computational framework for discovering regular or repeated geometric structures in 3D shapes. We describe and classify possible regular structures and present an effective algorithm for detecting such repeated geometric patterns in point- or mesh-based models. Our method assumes no prior knowledge of the geometry or spatial location of the individual elements that define the pattern. Structure discovery is made possible by a careful analysis of pairwise similarity transformations that reveals prominent lattice structures in a suitable model of transformation space. We introduce an optimization method for detecting such uniform grids specifically designed to deal with outliers and missing elements. This yields a robust algorithm that successfully discovers complex regular structures amidst clutter, noise, and missing geometry. The accuracy of the extracted generating transformations is further improved using a novel simultaneous registration method in the spatial domain. We demonstrate the effectiveness of our algorithm on a variety of examples and show applications to compression, model repair, and geometry synthesis.
Computer Identification of Musical Instruments Using Pattern Recognition With Cepstral Coefficients as Features
1997
Cited by 63 (0 self)
Cepstral coefficients based on a constant Q transform have been calculated for 28 short (1-2 s) oboe sounds and 52 short saxophone sounds. These were used as features in a pattern analysis to determine, for each of the sounds comprising the test set, whether it belongs to the oboe or to the sax class. The training set consisted of longer sounds of 1 minute or more for each of the instruments. A k-means algorithm was used to calculate clusters for the training data, and Gaussian probability density functions were formed from the mean and variance of each of the clusters. Each member of the test set was then analyzed to determine the probability that it belonged to each of the two classes, and a Bayes decision rule was invoked to assign it to one of the classes. Results have been extremely good and are compared to a human perception experiment identifying a subset of these same sounds.
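The pipeline in this abstract (k-means clusters, one Gaussian per cluster, Bayes decision rule) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes a single scalar feature per sound instead of a vector of cepstral coefficients, equal class priors, and made-up toy values. The names kmeans_1d, fit_class, likelihood, and classify are hypothetical.

```python
import math

def kmeans_1d(values, k, iters=20):
    """Plain k-means on scalars; centroids start at evenly spaced sorted samples."""
    data = sorted(values)
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [data]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return [c for c in clusters if c]

def fit_class(values, k=2, min_var=1e-3):
    """One Gaussian (weight, mean, variance) per k-means cluster."""
    model = []
    for c in kmeans_1d(values, k):
        mean = sum(c) / len(c)
        var = max(sum((v - mean) ** 2 for v in c) / len(c), min_var)
        model.append((len(c) / len(values), mean, var))
    return model

def likelihood(x, model):
    """Class-conditional density: weighted sum of the cluster Gaussians."""
    return sum(w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
               for w, m, v in model)

def classify(x, models, priors=None):
    """Bayes decision rule: pick the class with the largest prior * likelihood."""
    labels = list(models)
    if priors is None:  # assumption: equal priors
        priors = {label: 1.0 / len(labels) for label in labels}
    return max(labels, key=lambda label: priors[label] * likelihood(x, models[label]))

# Toy scalar "features"; real inputs would be cepstral coefficient vectors.
training = {
    "oboe": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],
    "sax": [6.0, 6.2, 5.8, 8.0, 8.1, 7.9],
}
models = {label: fit_class(values) for label, values in training.items()}
print(classify(1.05, models), classify(7.0, models))
```

Weighting each cluster Gaussian by its share of the training points is one possible way to combine clusters into a class density; the abstract does not say how the per-cluster densities were combined.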
Automatic Chord Recognition from Audio Using an HMM with Supervised Learning
In Proc. ISMIR, 2006
Cited by 33 (4 self)
A novel approach for obtaining labeled training data is presented to directly estimate the model parameters in a supervised learning algorithm for automatic chord recognition from raw audio. To this end, harmonic analysis is first performed on symbolic data to generate label files. In parallel, we synthesize audio data from the same symbolic data, which are then provided to a machine learning algorithm along with the label files to estimate model parameters. Experimental results show higher performance in frame-level chord recognition than previous approaches.
Automatic Chord Transcription from Audio Using Computational Models of Musical Context
2010
Cited by 23 (11 self)
I certify that this thesis, and the research to which it refers, are the product of my own work, and that any ideas or quotations from the work of other people, published or otherwise, are fully acknowledged in accordance with the standard referencing practices of the discipline. I acknowledge the helpful guidance and support of my supervisor, Dr Simon Dixon. This thesis is concerned with the automatic transcription of chords from audio, with an emphasis on modern popular music. Musical context such as the key and the structural segmentation aid the interpretation of chords in human beings. In this thesis we propose computational models that integrate such musical context into the automatic chord estimation process. We present a novel dynamic Bayesian network (DBN) which integrates models of metric position, key, chord, bass note and two beat-synchronous audio features (bass and treble chroma) into a single high-level musical context model. We simultaneously infer the most probable sequence of metric positions, keys, chords and bass notes via Viterbi inference. Several experiments with real-world data show that adding context parameters results in a significant increase in chord recognition accuracy and faithfulness of chord segmentation. The proposed,
Sparse and shift-invariant feature extraction from non-negative data
2008
Cited by 23 (4 self)
In this paper we describe a technique that allows the extraction of multiple local shift-invariant features from analysis of non-negative data of arbitrary dimensionality. Our approach employs a probabilistic latent variable model with sparsity constraints. We demonstrate its utility by performing feature extraction in a variety of domains ranging from audio to images and video. Index Terms: Feature extraction, unsupervised learning.
Meter and Periodicity in Musical Performance
Journal of New Music Research, 2001
Cited by 22 (3 self)
This paper presents a psychoacoustically based method of data reduction motivated by the desire to analyze the rhythm of musical performances. The resulting information is then analyzed by the “Periodicity Transform” (which is based on a projection onto “periodic subspaces”) to locate periodicities in the resulting data. These periodicities represent the rhythm at several levels, including the “pulse”, the “measure”, and larger structures such as musical “phrases.” The implications (and limitations) of such automated grouping of rhythmic features are discussed. The method is applied to a number of musical examples, its output is compared to that of the Fourier Transform, and both are compared to a more traditional “musical” analysis of the rhythm. Unlike many methods of rhythm analysis, the techniques can be applied directly to the digitized performance (i.e., a soundfile) and do not require a musical score or a MIDI transcription. Several examples are presented that highlight both the strengths and weaknesses of the approach.
A supervised classification algorithm for note onset detection
EURASIP Journal on Applied Signal Processing, 2007
Cited by 19 (2 self)
This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. In this paper we present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. In the paper we describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
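The abstract does not spell out the peak-picking step, so the following is only one plausible reading of "peak-picking based on a moving average": a frame counts as an onset when its classifier score is the local maximum of a centered window and exceeds the window mean by a margin. The function name pick_onsets, the parameters half_window and offset, and the score values are all hypothetical.

```python
def pick_onsets(scores, half_window=2, offset=0.1):
    """Return frame indices that are local maxima above a moving-average threshold."""
    onsets = []
    n = len(scores)
    for i in range(n):
        # Centered window, truncated at the edges of the signal.
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        window = scores[lo:hi]
        # Assumed threshold: moving average plus a fixed margin.
        threshold = sum(window) / len(window) + offset
        if scores[i] == max(window) and scores[i] > threshold:
            onsets.append(i)
    return onsets

# Hypothetical per-frame onset scores, e.g. the output of a frame classifier.
scores = [0.0, 0.1, 0.9, 0.2, 0.1, 0.0, 0.1, 0.8, 0.7, 0.1]
print(pick_onsets(scores))  # [2, 7]
```

Frame 8 scores 0.7 but is not reported, because frame 7 is the maximum of its window. Suppressing neighbours this way keeps one detection per burst of high scores.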
Shift-invariant probabilistic latent component analysis
Journal of Machine Learning Research (under review), 2008
Cited by 19 (5 self)
In this paper we present a model which can decompose probability densities or count data into a set of shift-invariant components. We begin by introducing a regular latent variable model and subsequently extend it to deal with shift invariance in order to model more complex inputs. We develop an expectation-maximization algorithm for estimating components and present various results on challenging real-world data. We show that this approach is a probabilistic generalization of well-known algorithms such as Non-Negative Matrix Factorization and multiway decompositions, and discuss its advantages over such approaches.
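As a rough illustration of the starting point mentioned in this abstract, the sketch below runs expectation-maximization for a plain (not shift-invariant) latent variable model of two-dimensional count data, P(x, y) = sum over z of P(z) P(x|z) P(y|z). The factorization form, the function names, the random initialization, and the toy table are assumptions made for the example; the shift-invariant extension described in the paper is not covered.

```python
import math
import random

def normalize(vec):
    """Scale non-negative weights to sum to 1 (uniform if all are zero)."""
    total = sum(vec)
    if total == 0:
        return [1.0 / len(vec)] * len(vec)
    return [v / total for v in vec]

def plca(counts, n_components, iters=50, seed=0):
    """EM for P(x, y) = sum_z P(z) P(x|z) P(y|z) on a 2-D table of counts."""
    rng = random.Random(seed)
    n_x, n_y = len(counts), len(counts[0])
    zs = range(n_components)
    pz = [1.0 / n_components] * n_components
    px_z = [normalize([rng.random() + 0.1 for _ in range(n_x)]) for _ in zs]
    py_z = [normalize([rng.random() + 0.1 for _ in range(n_y)]) for _ in zs]
    for _ in range(iters):
        acc_z = [0.0] * n_components
        acc_x = [[0.0] * n_x for _ in zs]
        acc_y = [[0.0] * n_y for _ in zs]
        for x in range(n_x):
            for y in range(n_y):
                # E-step: posterior P(z | x, y) under the current parameters.
                joint = [pz[z] * px_z[z][x] * py_z[z][y] for z in zs]
                total = sum(joint)
                if counts[x][y] == 0 or total == 0:
                    continue
                for z in zs:
                    # Posterior-weighted count, accumulated for the M-step.
                    r = counts[x][y] * joint[z] / total
                    acc_z[z] += r
                    acc_x[z][x] += r
                    acc_y[z][y] += r
        # M-step: renormalize the accumulated counts.
        pz = normalize(acc_z)
        px_z = [normalize(row) for row in acc_x]
        py_z = [normalize(row) for row in acc_y]
    return pz, px_z, py_z

def log_likelihood(counts, pz, px_z, py_z):
    """Log-likelihood of the observed counts under the fitted model."""
    ll = 0.0
    for x, row in enumerate(counts):
        for y, c in enumerate(row):
            if c:
                p = sum(pz[z] * px_z[z][x] * py_z[z][y] for z in range(len(pz)))
                ll += c * math.log(p)
    return ll

# Toy count table with two obvious blocks (made-up data).
counts = [[4, 4, 0, 0], [4, 4, 0, 0], [0, 0, 2, 2], [0, 0, 2, 2]]
pz, px_z, py_z = plca(counts, n_components=2)
print(pz, log_likelihood(counts, pz, px_z, py_z))
```

Because every factor is a normalized distribution, the components can be read as probabilities, which is what distinguishes this formulation from an unnormalized non-negative matrix factorization.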
A New Method for Tracking Modulations in Tonal Music in Audio Data Format
International Joint Conference on Neural Networks, 2000
Cited by 18 (8 self)
Cq-profiles are 12-dimensional vectors, each component referring to a pitch class. They can be employed to represent keys. Cq-profiles are calculated with the constant Q filter bank [4]. They have the following advantages: (i) They correspond to probe tone ratings. (ii) Calculation is possible in real time. (iii) Stability is obtained with respect to sound quality. (iv) They are transposable. By using the cq-profile technique as a simple auditory model in combination with the SOM [11], an arrangement of keys emerges that resembles results from psychological experiments [13] and from music theory [1]. Cq-profiles are reliably applied to modulation tracking by introducing a special distance measure.
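A minimal sketch of the 12-dimensional representation and of property (iv), under assumptions the abstract does not confirm: the constant Q magnitudes are taken as a flat list, bin 0 is aligned with pitch class 0, the number of bins per octave is a multiple of 12, and the profile is normalized to sum to 1. The names cq_profile and transpose are hypothetical.

```python
def cq_profile(magnitudes, bins_per_octave=12):
    """Fold constant Q bin magnitudes across octaves into 12 pitch classes."""
    if bins_per_octave % 12:
        raise ValueError("bins_per_octave must be a multiple of 12")
    bins_per_semitone = bins_per_octave // 12
    profile = [0.0] * 12
    for i, magnitude in enumerate(magnitudes):
        profile[(i // bins_per_semitone) % 12] += magnitude
    total = sum(profile)
    return [p / total for p in profile] if total else profile

def transpose(profile, semitones):
    """Transposition is a circular shift of the 12 components."""
    s = semitones % 12
    return profile[-s:] + profile[:-s] if s else list(profile)

# Two octaves of made-up magnitudes: energy on pitch classes 0 and 7.
magnitudes = [0.0] * 24
magnitudes[0] = 1.0
magnitudes[12] = 1.0
magnitudes[7] = 2.0

profile = cq_profile(magnitudes)
shifted = transpose(profile, 5)
print(profile)
print(shifted)
```

Shifting by 5 semitones moves the energy from pitch classes 0 and 7 to pitch classes 5 and 0, so a profile measured in one key can be compared against any other key by rotation.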