Techniques For Automatic Music Transcription
- in International Symposium on Music Information Retrieval
, 2000
Cited by 43 (1 self)
Two systems are reviewed that perform automatic music transcription. The first performs monophonic transcription using an autocorrelation pitch tracker. The algorithm takes advantage of some heuristic parameters related to the similarity between image and sound in the collector. The detection is correct between notes B1 and E6, and further timbre analysis will provide the necessary parameters to reproduce a similar copy of the original sound. The second system is able to analyse simple polyphonic tracks. It is composed of a blackboard system, receiving its input from a segmentation routine in the form of an averaged STFT matrix. The blackboard contains a hypotheses database, a scheduler and knowledge sources, one of which is a neural network chord recogniser with the ability to reconfigure the operation of the system, allowing it to output more than one note hypothesis at a time. Some examples are provided to illustrate the performance and the weaknesses of the current implementation. Next steps for further development are defined.
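As a rough illustration of the autocorrelation idea mentioned in this abstract, the sketch below picks the lag with the largest autocorrelation value inside a search range and converts it to a frequency. It is not the authors' tracker: the function name, the unnormalised autocorrelation, and the default search range (chosen here to roughly cover B1 to E6, about 62 Hz to 1319 Hz) are assumptions made for the example, and the heuristic parameters the abstract refers to are not modelled.

```python
import math


def autocorrelation_pitch(samples, sample_rate, fmin=60.0, fmax=1400.0):
    """Estimate the fundamental frequency (Hz) of one monophonic frame.

    Hypothetical helper: returns None when no positive autocorrelation
    peak is found in the lag range (for example, for a silent frame).
    """
    n = len(samples)
    min_lag = int(sample_rate / fmax)              # shortest period considered
    max_lag = min(int(sample_rate / fmin), n - 1)  # longest period considered
    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation at this lag.
        value = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    if best_lag == 0:
        return None
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz sine sampled at 8 kHz.
rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * i / rate) for i in range(1024)]
estimate = autocorrelation_pitch(frame, rate)
```

Because the lag is an integer number of samples, the estimate is quantised; practical trackers usually interpolate around the peak and add voicing checks.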
CubyHum: A Fully Operational Query by Humming System
- ISMIR 2002 Conference Proceedings
, 2002
Cited by 39 (1 self)
'Query by humming' is an interaction concept in which the identity of a song has to be revealed fast and orderly from a given sung input using a large database of known melodies. In short, it tries to detect the pitches in a sung melody and compares these pitches with symbolic representations of the known melodies. Melodies that are similar to the sung pitches are retrieved. Approximate pattern matching in the melody comparison process compensates for the errors in the sung melody by using classical dynamic programming. A filtering method is used to save computation in the dynamic programming framework. This paper presents the algorithms for pitch detection, note onset detection, quantization, melody encoding and approximate pattern matching as they have been implemented in the CubyHum software system. Since human reproduction of melodies is imperfect, findings from an experimental singing study were a crucial input to the development of the algorithms. Future research should pay special attention to the reliable detection of note onsets in any preferred singing style. In addition, research on index methods and fast bit-parallel algorithms for approximate pattern matching needs to be further pursued to decrease computational requirements when dealing with large melody databases.
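The approximate pattern matching step can be illustrated with a classical dynamic-programming sketch. The code below is not the CubyHum implementation: the interval-based melody encoding, the unit edit costs, and the function names are assumptions chosen for the example, and the filtering step mentioned in the abstract is omitted. It computes the smallest edit distance between a sung query and any contiguous part of a stored melody.

```python
def intervals(midi_pitches):
    """Encode a melody as semitone steps between consecutive notes
    (an assumed encoding that makes matching independent of key)."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]


def best_match_cost(query, melody):
    """Minimum edit distance between `query` and any substring of `melody`.

    Row 0 is all zeros, so a match may start at any melody position;
    taking the minimum of the last row lets it end at any position.
    """
    prev = [0] * (len(melody) + 1)
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * len(melody)
        for j, m in enumerate(melody, start=1):
            substitution = prev[j - 1] + (0 if q == m else 1)
            deletion = prev[j] + 1       # query element has no counterpart
            insertion = curr[j - 1] + 1  # melody element was skipped
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return min(prev)


# Usage: a transposed fragment still matches exactly after interval encoding.
stored = intervals([60, 62, 64, 65, 67])   # [2, 2, 1, 2]
sung = intervals([67, 69, 70])             # [2, 1]
cost = best_match_cost(sung, stored)
```

The table costs time proportional to the product of the query and melody lengths, which is why the abstract points to filtering, indexing, and bit-parallel methods for large databases.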
Conducting Audio Files via Computer Vision
- Proceedings of the Gesture Workshop
, 2003
Cited by 30 (4 self)
This paper presents a system to control the playback of audio files by means of the standard classical conducting technique. Computer vision techniques are developed to track a conductor's baton, and the gesture is subsequently analysed. Audio parameters are extracted from the sound-file and are further processed for audio beat tracking. The sound-file playback speed is adjusted in order to bring the audio beat points into alignment with the gesture beat points. The complete system provides all parts necessary to simulate an orchestra reacting to a conductor's baton.
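One simple way to picture the speed adjustment is as a ratio between the audio that remains before the next audio beat and the time that remains before the next expected gesture beat. The sketch below is only an illustration under that assumption; the function name, the argument names, and the clamping limits are hypothetical and are not taken from the paper.

```python
def playback_rate(audio_pos, next_audio_beat, now, next_gesture_beat,
                  min_rate=0.5, max_rate=2.0):
    """Return a playback-speed factor that makes the next audio beat
    coincide with the next predicted gesture beat.

    audio_pos, next_audio_beat: positions in the sound file (seconds).
    now, next_gesture_beat: wall-clock times (seconds).
    The rate is clamped to [min_rate, max_rate] (assumed limits).
    """
    audio_remaining = next_audio_beat - audio_pos
    time_remaining = next_gesture_beat - now
    if audio_remaining <= 0 or time_remaining <= 0:
        return 1.0  # nothing sensible to align; play at normal speed
    rate = audio_remaining / time_remaining
    return max(min_rate, min(max_rate, rate))


# Usage: half a second of audio must fit into a quarter second of real time.
rate = playback_rate(audio_pos=10.0, next_audio_beat=10.5,
                     now=0.0, next_gesture_beat=0.25)
```

A real system would also smooth the rate over time so that tempo changes do not produce audible jumps.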
Signal Processing for Music Analysis
, 2011
Cited by 25 (3 self)
Music signal processing may appear to be the junior relation of the large and mature field of speech signal processing, not least because many techniques and representations originally developed for speech have been applied to music, often with good results. However, music signals possess specific acoustic and structural characteristics that distinguish them from spoken language or other nonmusical signals. This paper provides an overview of some signal analysis techniques that specifically address musical dimensions such as melody, harmony, rhythm, and timbre. We will examine how particular characteristics of music signals impact and determine these techniques, and we highlight a number of novel music analysis and retrieval tasks that such processing makes possible. Our goal is to demonstrate that, to be successful, music audio signal processing techniques must be informed by a deep and thorough insight into the nature of music itself.
Methods for separation of harmonic sound sources using sinusoidal modeling
- in Proc. AES 106th Convention
, 1999
Cited by 22 (1 self)
Methods are proposed for separation of harmonic sound sources using sinusoidal modeling. A local nonlinear least-squares (NLS) frequency estimator is proposed to resolve sinusoids that are close in frequency. An iterative analysis scheme using interpolated parameter trajectories and subtraction of detected components is presented. A measure is proposed for testing the accuracy of the model.
Essentia: An Audio Analysis Library for Music Information Retrieval
Cited by 21 (12 self)
We present Essentia 2.0, an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. The library is cross-platform and currently supports Linux, Mac OS X, and Windows systems. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included out of the box and the signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications.
Toward Automated Holistic Beat Tracking, Music Analysis And Understanding
, 2005
Cited by 16 (0 self)
Most music processing attempts to focus on one particular feature or structural element such as pitch, beat location, tempo, or genre. This hierarchical approach, in which music is separated into elements that are analyzed independently, is convenient for the scientific researcher, but is at odds with intuition about music perception. Music is
PhaVoRIT - a phase vocoder for real-time interactive time-stretching
, 2005
Cited by 15 (3 self)
Time-based, interactive systems for digital audio and video often employ algorithms to control the speed of the media whilst maintaining the integrity of the content. The phase vocoder is a popular algorithm for time-stretching audio without changing the pitch; its characteristic “transient smearing” and “reverberation” artifacts, however, limit its application to “simple” audio signals such as instrumental or vocal music. We propose three new methods to further improve the audio quality of phase vocoder-based time-stretching: multiresolution peak-picking accounts for the non-uniform frequency resolution of the human auditory system; sinusoidal trajectory heuristics constrain the set of spectral peaks that are candidates for sinusoidal trajectory continuation, thereby reducing phase propagation along incorrectly detected trajectories; and silent passage phase reset re-establishes the vertical phase coherence at certain intervals. These techniques have been implemented as modules of the PhaVoRIT time-stretching software, available as a plug-in to Apple’s Core Audio framework, and the Semantic Time Framework, a multimedia framework we developed for time-based interactive systems. We obtained favorable results when comparing PhaVoRIT to existing, commonly available audio time-stretching software tools in a formal user study. PhaVoRIT is also deployed as the audio time-stretching engine for Maestro!, an interactive conducting exhibit installed in the Betty Brinn Children’s Museum in Milwaukee, USA.
Onset detection in polyphonic signals by means of transient peak classification
- In MIREX Online Proceedings (ISMIR 2005)
, 2005
Cited by 15 (1 self)
The extended abstract describes an onset detection algorithm that is based on a classification of spectral peaks into transient and non-transient peaks and a statistical model of the classification results to prevent detection of random transient peaks due to noise. Compared to last year's MIREX contribution, the algorithm has not been modified, but bug fixes and a better parameter optimization procedure should lead to improved performance.
Automatic classification of digestive organs in wireless endoscopy videos
- in Proc. ACM Symposium on Applied Computing SAC’07, Seoul, Korea
Cited by 14 (0 self)
Wireless Capsule Endoscopy (WCE) allows a physician to examine the entire small intestine without any surgical operation. With the miniaturization of wireless and camera technologies comes the ability to view the entire gastrointestinal tract with little effort. Although WCE is a technical breakthrough that allows us to access the entire intestine without surgery, it is reported that a medical clinician spends one or two hours assessing a WCE video. This limits the number of examinations possible and incurs considerable cost. To reduce the assessment time, it is critical to develop a technique to automatically discriminate digestive organs such as the esophagus, stomach, small intestine (i.e., duodenum, jejunum, and ileum) and colon. In this paper, we propose a novel technique to segment a WCE video into these anatomic parts based on color change pattern analysis. The basic idea is that each digestive organ has different patterns of intestinal contractions that are quantified as the features. We present experimental results that demonstrate the effectiveness of the proposed method.