Functional Phonology -- Formalizing the interactions between articulatory and perceptual drives
, 1998
Content-based Organization and Visualization of Music Archives
, 2002
Cited by 85 (24 self)
With Islands of Music we present a system which facilitates exploration of music libraries without requiring manual genre classification. Given pieces of music in raw audio format we estimate their perceived sound similarities based on psychoacoustic models. Subsequently, the pieces are organized on a 2-dimensional map so that similar pieces are located close to each other. A visualization using a metaphor of geographic maps provides an intuitive interface where islands resemble genres or styles of music. We demonstrate the approach using a collection of 359 pieces of music.
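The map-organization step mentioned in this abstract can be illustrated with a minimal self-organizing map. This is only a sketch under stated assumptions: the feature vectors, grid size, decay schedules, and the function names `train_som` and `best_matching_unit` are hypothetical and are not taken from the paper, which derives its features from psychoacoustic models.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid cell (row, col) whose weight vector is closest to x."""
    return min(weights,
               key=lambda rc: sum((a - b) ** 2 for a, b in zip(weights[rc], x)))

def train_som(data, rows, cols, iterations=500, seed=0):
    """Train a small SOM; returns a dict mapping (row, col) to a weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(iterations):
        frac = t / iterations
        lr = 0.5 * (1.0 - frac) + 0.01                    # learning rate decays (assumed schedule)
        sigma = max(rows, cols) / 2 * (1.0 - frac) + 0.3  # neighbourhood radius shrinks (assumed)
        x = rng.choice(data)
        br, bc = best_matching_unit(weights, x)
        for (r, c), w in weights.items():
            # Units near the winner on the grid are pulled more strongly towards x.
            h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])
    return weights

# Hypothetical 3-dimensional feature vectors standing in for audio descriptors.
pieces = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.9, 1.0, 0.9], [1.0, 0.9, 1.0]]
som = train_som(pieces, rows=3, cols=3, iterations=200, seed=1)
for piece in pieces:
    print(piece, "->", best_matching_unit(som, piece))
```

Each piece is placed at the grid cell of its best matching unit, so pieces with similar feature vectors tend to land in the same or neighbouring cells.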
Perceptual Coding of Digital Audio
- Proceedings of the IEEE
, 2000
Cited by 76 (0 self)
During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. The paper is organized as follows. First, psychoacoustic principles are described with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the Modified Discrete Cosine Transform (MDCT), a perfect reconstruction (PR) cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction (LP) parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of
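The perfect-reconstruction property that this abstract attributes to the MDCT can be checked numerically. The following is a minimal sketch, assuming a sine window and 50% overlap between frames; the function names are illustrative, and the direct O(N²) sums stand in for the fast algorithms a real coder would use.

```python
import math

def sine_window(N):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(frame, window):
    """Map 2N windowed samples to N MDCT coefficients."""
    N = len(frame) // 2
    xw = [w * x for w, x in zip(window, frame)]
    return [sum(xw[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Map N coefficients back to 2N windowed, time-aliased samples."""
    N = len(coeffs)
    return [window[n] * (2.0 / N) *
            sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(signal, N):
    """MDCT every frame (hop size N), invert, and overlap-add the results."""
    window = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        rec = imdct(mdct(signal[start:start + 2 * N], window), window)
        for n in range(2 * N):
            out[start + n] += rec[n]
    return out

N = 4
signal = [math.sin(0.3 * i) + 0.1 * i for i in range(8 * N)]
rebuilt = analysis_synthesis(signal, N)
# Samples covered by two overlapping frames are recovered up to rounding error.
max_err = max(abs(rebuilt[i] - signal[i]) for i in range(N, len(signal) - N))
print(max_err)
```

Only the first and last N samples are covered by a single frame, so they remain aliased; everywhere else the aliasing terms of neighbouring frames cancel in the overlap-add.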
Exploring Music Collections by Browsing Different Views
, 2003
Cited by 64 (16 self)
The availability of large music collections calls for ways to efficiently access and explore them. We present a new approach which combines descriptors derived from audio analysis with meta-information to create different views of a collection. Such views can have a focus on timbre, rhythm, artist, style or other aspects of music. For each view the pieces of music are organized on a map in such a way that similar pieces are located close to each other. The maps are visualized using an Islands of Music metaphor where islands represent groups of similar pieces. The maps are linked to each other using a new technique to align self-organizing maps. The user is able to browse the collection and explore different aspects by gradually changing focus from one view to another. We demonstrate our approach on a small collection using a meta-information-based view and two views generated from audio analysis, namely, beat periodicity as an aspect of rhythm and spectral information as an aspect of timbre.
Islands of Music - Analysis, Organization, and Visualization of Music Archives
, 2001
Cited by 60 (15 self)
This report summarizes the master's thesis Islands of Music: Analysis, Organization, and Visualization of Music Archives, which I submitted to the Vienna University of Technology on December 11th, 2001. I wrote it at the Department of Software Technology and Interactive Systems, supervised by Dr. Andreas Rauber, and assessed by Prof. Dr. Dieter Merkl
Using Psycho-Acoustic Models and Self-Organizing Maps To Create Hierarchical Structuring of Music by Sound Similarity
, 2002
Cited by 59 (19 self)
With the advent of large musical archives the need to provide an organization of these archives becomes eminent. While artist-based organizations or title indexes may help in locating a specific piece of music, a more intuitive, genre-based organization is required to allow users to browse an archive and explore its contents. Yet, currently these organizations following musical styles have to be designed manually.
Evaluation of Feature Extractors and Psycho-Acoustic Transformations for Music Genre Classification
Cited by 42 (14 self)
We present a study on the importance of psycho-acoustic transformations for effective audio feature calculation. From the results, both crucial and problematic parts of the algorithm for Rhythm Patterns feature extraction are identified. We furthermore introduce two new feature representations in this context: Statistical Spectrum Descriptors and Rhythm Histogram features. Evaluation on both the individual and combined feature sets is accomplished through a music genre classification task, involving 3 reference audio collections. Results are compared to published measures on the same data sets. Experiments confirmed that in all settings the inclusion of psycho-acoustic transformations provides significant improvement of classification accuracy.
The SOM-enhanced JukeBox: Organization and visualization of music collections based on perceptual models
- Journal of New Music Research
, 2003
Cited by 27 (13 self)
Journal of New Music Research, 2003, Vol. 32, No. 2, pp. 193–210. © Swets & Zeitlinger
Objective Estimation of Perceived Speech Quality-Part I: Development of the Measuring Normalizing Block Technique
- IEEE Trans. on Speech and Audio Processing
, 1999
Cited by 25 (0 self)
EDICS Number: SA 1.4.6. Part I of this paper describes a new approach to the objective estimation of perceived speech quality. This new approach uses a simple but effective perceptual transformation and a distance measure that consists of a hierarchy of measuring normalizing blocks. Each measuring normalizing block integrates two perceptually transformed signals over some time or frequency interval to determine the average difference across that interval. This difference is then normalized out of one signal, and is further processed to generate one or more measurements. In Part II the resulting estimates of perceived speech quality are correlated with the results of nine subjective listening tests. Together, these tests include 219 4-kHz bandwidth speech codecs, transmission systems, and reference conditions, with bit rates ranging from 2.4 to 64 kb/s. When compared with six other estimators, significant improvements are seen in many cases, particularly at lower bit rates, and when bit errors or frame erasures are present. These hierarchical structures of measuring normalizing blocks, or other structures of measuring normalizing blocks may also address open issues in perceived audio quality estimation, layered speech or audio coding, automatic speech
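The single measuring normalizing block described in this abstract (average the difference over an interval, then remove it from one signal) can be sketched as follows. This is an illustration only: the signals are plain lists of already transformed values, the function name is hypothetical, and returning the average difference itself as the measurement is an assumption, since the abstract does not say how the difference is further processed.

```python
def measuring_normalizing_block(reference, test, start, stop):
    """Apply one block over the index interval [start, stop).

    Returns the test signal with the average difference removed on that
    interval, plus the average difference as the measurement (assumed).
    """
    n = stop - start
    avg_diff = sum(test[i] - reference[i] for i in range(start, stop)) / n
    normalized = list(test)
    for i in range(start, stop):
        normalized[i] -= avg_diff   # normalize the difference out of the test signal
    return normalized, avg_diff

# Hypothetical perceptually transformed values for a reference and a test signal.
reference = [1.0, 2.0, 3.0, 4.0]
test = [1.5, 2.5, 3.5, 4.5]
normalized, measurement = measuring_normalizing_block(reference, test, 0, 4)
print(normalized, measurement)
```

A hierarchy could then be formed by applying further blocks to the normalized output over shorter sub-intervals, so that each level measures only what the coarser levels have not already removed.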
Sound Re-Synthesis From Rhythm Pattern Features - Audible Insight into a Music Feature Extraction Process
- In Proceedings of the International Computer Music Conference (ICMC)
, 2005
Cited by 8 (3 self)
For tasks like musical genre identification and similarity searches in audio databases, audio files have to be described by suitable feature sets. Since these feature sets usually try to capture diverse discriminative characteristics, it is interesting and desirable to create an acoustic representation of the feature set to support intuitive evaluation. In this paper, we present an approach for making a specific feature set, namely Rhythm Patterns, instantly human comprehensible by re-assembling sound from the numerical descriptors. The re-synthesized audio chunks represent clearly perceivable rhythmical characteristics on critical frequency bands of the original music.

