Multichannel audio modeling and coding using a multiband source/filter model
- in Conf. Record of the Thirty-Ninth Asilomar Conf. Signals, Systems and Computers, 2005
Cited by 6 (6 self)
In this paper we propose a source/filter model for achieving low bitrate transmission of multichannel audio signals, in which the filter part corresponds to the specifics of each microphone signal while the source part contains mostly the interchannel similarities. Using the appropriate filter for each channel and the source part of only one of the microphone signals, we can resynthesize a high quality approximation of each channel; thus, only the filter part of each channel needs to be encoded. Datarates on the order of a few kbit/s per channel can be achieved, targeting applications such as remote mixing or distributed musicians' collaboration.
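The resynthesis idea in this abstract can be illustrated with a minimal sketch. It assumes a single-band, single-frame linear-prediction (LPC) filter as the "filter part" and the LPC residual as the "source part"; the paper's multiband model is not reproduced here. All function names, the filter order, and the test signals are hypothetical choices made for illustration.

```python
import math
import random


def autocorr(x, max_lag):
    # Autocorrelation r[0..max_lag] of a finite sequence.
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def lpc(x, order):
    # Levinson-Durbin recursion. Returns the prediction-error filter
    # coefficients a[0..order] with a[0] == 1.0.
    r = autocorr(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a


def residual(x, a):
    # Source part: e[n] = sum_j a[j] * x[n - j] (analysis / inverse filter).
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]


def synthesize(e, a):
    # Filter part applied to a source: y[n] = e[n] - sum_{j>=1} a[j] * y[n - j].
    y = []
    for n in range(len(e)):
        y.append(e[n] - sum(a[j] * y[n - j]
                            for j in range(1, len(a)) if n - j >= 0))
    return y


random.seed(0)
N = 400
ORDER = 8  # hypothetical filter order

# Hypothetical reference microphone: a tone plus a little noise.
reference = [math.sin(0.2 * n) + 0.1 * random.gauss(0.0, 1.0) for n in range(N)]

# Hypothetical second microphone: the same material through a one-pole low-pass.
target = []
prev = 0.0
for v in reference:
    prev = 0.7 * prev + 0.3 * v
    target.append(prev)

a_ref = lpc(reference, ORDER)   # filter part of the reference channel
a_tgt = lpc(target, ORDER)      # filter part of the second channel (side information)
source = residual(reference, a_ref)  # source part, taken from the reference only

# Approximate the second channel from the reference source and its own filter.
target_hat = synthesize(source, a_tgt)
# Sanity check: the reference is recovered from its own source and filter.
reference_hat = synthesize(source, a_ref)
```

In this sketch only `a_tgt` would need to be transmitted for the second channel, which is where the low side-information rate comes from; how closely `target_hat` matches `target` depends on how much the two channels actually share.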
MODELING SPOT MICROPHONE SIGNALS USING THE SINUSOIDAL PLUS NOISE APPROACH
Cited by 5 (5 self)
This paper focuses on high-fidelity multichannel audio coding based on an enhanced adaptation of the well-known sinusoidal plus noise model (SNM). Sinusoids cannot be used per se for high-quality audio modeling because they do not represent all the audible information of a recording. The noise part must also be treated to avoid an artificial-sounding resynthesis of the audio signal. Generally, the encoding process needs much higher bitrates for the noise part than for the sinusoidal one. Our objective is to encode spot microphone signals using the SNM, taking advantage of the interchannel similarities to achieve low bitrates. We demonstrate that, for a given multichannel audio recording, the noise part of each spot microphone signal (before the mixing stage) can be obtained by using its noise envelope to transform the noise part of just one of the signals (the so-called "reference signal", which is fully encoded).
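The noise-transplantation step can be sketched roughly as follows. This is a simplification under stated assumptions: a per-frame RMS gain stands in for the "noise envelope" (the abstract does not say how the envelope is computed, and a spectral envelope is a likely alternative), and the function names, frame length, and signals are hypothetical.

```python
import math
import random


def frame_rms(x, frame_len):
    # Per-frame RMS level; used here as a crude temporal noise envelope.
    env = []
    for i in range(0, len(x), frame_len):
        frame = x[i:i + frame_len]
        env.append(math.sqrt(sum(v * v for v in frame) / len(frame)))
    return env


def transplant_noise(ref_noise, target_envelope, frame_len):
    # Rescale each frame of the reference noise so that its RMS level
    # follows the envelope measured on the target signal's noise part.
    ref_env = frame_rms(ref_noise, frame_len)
    out = []
    for k, i in enumerate(range(0, len(ref_noise), frame_len)):
        gain = target_envelope[k] / ref_env[k] if ref_env[k] > 0.0 else 0.0
        out.extend(gain * v for v in ref_noise[i:i + frame_len])
    return out


random.seed(1)
FRAME = 64  # hypothetical frame length in samples
N = 512

# Hypothetical noise parts of the reference and of one spot signal.
ref_noise = [random.gauss(0.0, 1.0) for _ in range(N)]
spot_noise = [(0.2 + 0.8 * n / N) * random.gauss(0.0, 1.0) for n in range(N)]

# Only the envelope of the spot signal's noise would be sent as side information.
spot_envelope = frame_rms(spot_noise, FRAME)
spot_noise_hat = transplant_noise(ref_noise, spot_envelope, FRAME)
```

The transplanted noise matches the spot signal's envelope frame by frame, while its fine structure comes entirely from the reference; that trade-off is what allows the noise part of the remaining signals to be dropped from the bitstream.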
Sinusoidal modeling of spot microphone signals based on noise transplantation for multichannel audio coding
- Submitted European Signal Process. Conf. (EUSIPCO), 2007
Cited by 2 (2 self)
This paper focuses on high-fidelity multichannel audio modeling based on an enhanced adaptation of the well-known sinusoidal plus noise model (SNM). Sinusoids cannot be used per se for high-quality audio modeling because they do not represent all the audible information of a recording. The noise part must also be treated to avoid an artificial-sounding resynthesis of the audio signal. Generally, the encoding process needs much higher bitrates for the noise part than for the sinusoidal one. Our objective is to model spot microphone signals using the SNM, taking advantage of the interchannel similarities to achieve low bitrates. We demonstrate that, for a given multichannel audio recording, the noise part of each spot microphone signal (the microphone signals before the mixing stage) can be obtained from the noise part of one of the signals (the reference, which is fully encoded) by transforming the reference noise part using the noise envelope of each of the remaining signals.
MULTIBAND SOURCE/FILTER REPRESENTATION OF MULTICHANNEL AUDIO FOR REDUCTION OF INTER-CHANNEL REDUNDANCY
In this paper we propose a model for multichannel audio recordings that can be used to reveal the underlying interchannel similarities. This is important for achieving low bitrates for multichannel audio and is especially suitable for applications where a large number of microphone signals must be transmitted (such as remote mixing or distributed musicians' collaboration). Using this model, we can encode a multichannel audio signal using only one full audio channel plus side information on the order of a few kbit/s per channel, which can be used to decode the multiple channels at the receiving end. We apply objective and subjective measures to evaluate the performance of our method.
Distributed Immersive Performance: Enabling Technologies for and Analyses of Remote Performance and Collaboration
This talk presents the distributed immersive performance (DIP) technologies developed at the Integrated Media Systems Center at the University of Southern California, and the DIP experiments designed to assess the quality of human interaction in these remote environments. The talk represents a summary of the enabling technologies for, and an amalgamation and interpretation of findings from, our DIP experiments over the past two years. Some of the findings have appeared in conference proceedings. The enabling technologies include low latency audio and high definition video transmission techniques that conceal packet loss and ensure smooth playback, real time streaming and archival of multiple data streams in an Internet networking environment, and 10.2-channel immersive audio for realistic reproduction of sound. The talk will highlight our results from a series of collaborative performance experiments with the Tosheff Piano Duo, focusing on auditory delay, its effects on musical coordination and interpretation, and the players’ assessments of their abilities to adapt to the conditions. Finally, we present and discuss ongoing work and future plans.
5th Open Workshop of MUSICNETWORK: Integration of Music in Multimedia Applications A Second Report on the User Experiments in the Distributed Immersive Performance Project
This report is our second one at the MUSICNETWORK open workshop focusing on the user experiments in the Integrated Media Systems Center’s Distributed Immersive Performance (DIP) Project. The DIP project explores the creation of a seamless environment for remote and synchronous musical collaboration. We describe here the DIP experiments and findings since the last MUSICNETWORK open workshop. At the last workshop, we introduced the DIP project, our goals of studying the effects of auditory latency on musical ensemble and coordination, and the capture and analysis of user experiment data. We presented the first two sets of user experiments – collaborative performance with musicians facing each other and experiencing controlled auditory delay while performing movements from Poulenc’s Sonata for Piano Four-Hands. The preliminary results from these first experiments showed that the expert users felt that they could adapt to delays below 50 ms. This year, we describe the next two sets of experiments. The first set required the players to practice performing with levels of delay around the threshold value, ranging from 40 ms to 75 ms. As before, our expert users are the award-winning Tosheff piano duo, Vely Stoyanova and Ilia Tosheff. Their practicing led to the creation of the next set of experiments, wherein the players requested additional delay in the feedback of their own playing in order to hear the audience’s perspective. This additional delay in the
Coding of Spot Microphone Signals
A multiresolution source/filter model for coding of audio source signals (spot recordings) is proposed. Spot recordings are a subset of the multimicrophone recordings of a music performance, before the mixing process is applied for producing the final multichannel audio mix. The technique enables low bitrate coding of spot signals with good audio quality (above 3.0 perceptual grade compared to the original). It is demonstrated that this particular model separates the various microphone recordings of a multimicrophone recording into a part that mainly characterizes a specific microphone signal and a part that is common to all signals of the same recording (and can thus be omitted during transmission). Our interest in low bitrate coding of spot recordings is related to applications such as remote mixing and real-time collaboration of musicians who are geographically distributed. Using the proposed approach, it is shown that a multimicrophone audio recording can be encoded using only a single audio channel, with additional information for each spot microphone signal on the order of 5 kbps, for good-quality resynthesis. This is verified by employing both objective and subjective measures of performance. Copyright © 2008 Athanasios Mouchtaris et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
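A rough bitrate budget makes the "single audio channel plus about 5 kbps per spot signal" figure concrete. The sketch below is only arithmetic under stated assumptions: the 64 kbps rate for the fully coded channel and the count of 16 spot signals are hypothetical values not taken from the abstract, and the reference is assumed to be one of the spot signals and to need no side information of its own.

```python
def total_bitrate_kbps(num_spot_signals, reference_kbps, side_info_kbps):
    # One fully coded reference channel plus side information for each
    # of the remaining spot signals.
    if num_spot_signals < 1:
        raise ValueError("need at least one spot signal")
    return reference_kbps + (num_spot_signals - 1) * side_info_kbps


def independent_bitrate_kbps(num_spot_signals, channel_kbps):
    # Baseline: every spot signal coded separately at the full rate.
    return num_spot_signals * channel_kbps


NUM_SPOTS = 16         # hypothetical number of spot microphones
REFERENCE_KBPS = 64.0  # hypothetical rate of the fully coded channel
SIDE_INFO_KBPS = 5.0   # order of magnitude quoted in the abstract

model_rate = total_bitrate_kbps(NUM_SPOTS, REFERENCE_KBPS, SIDE_INFO_KBPS)
baseline_rate = independent_bitrate_kbps(NUM_SPOTS, REFERENCE_KBPS)
```

With these assumed numbers the model-based budget is 64 + 15 × 5 = 139 kbps, against 16 × 64 = 1024 kbps for independent coding; the saving grows with the number of spot signals because each added signal costs only its side information.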
MODELING AND CODING OF SPOT MICROPHONE SIGNALS FOR IMMERSIVE AUDIO BASED ON THE SINUSOIDAL MODEL
In this paper, our objective is to propose a coding method for low bitrate immersive audio applications. This translates into: (a) focusing on spot microphone signals for providing interactivity between the user and the environment, and (b) deriving a model which can take advantage of the similarities among the various spot signals of a given multichannel recording. Spot signals are the microphone recordings of a performance, before obtaining the multichannel mix. We propose a modified sinusoids plus noise model. The noise component for each spot signal is obtained by transforming the noise part of one of the signals (the reference), using the noise envelope of each of the remaining spot signals. Good-quality reproduction without loss of image width can be achieved with the proposed approach by encoding a single audio channel, with side information per spot signal on the order of 19 kbps.
5 Multichannel Audio Coding for Multimedia Services in Intelligent Environments
Summary. Audio is an integral component of multimedia services in intelligent environments. Use of multiple channels in audio capturing and rendering offers the advantage of recreating arbitrary acoustic environments, immersing the listener into the acoustic scene. On the other hand, multichannel audio contains a large amount of information, which is highly demanding to transmit, especially for real-time applications. For this reason, a variety of compression methods have been developed for multichannel audio content. In this chapter, we initially describe the currently popular methods for multichannel audio compression. Low-bitrate encoding methods for multichannel audio have also recently started to attract interest, mostly towards extending MP3 audio coding to multichannel audio recordings, and these methods are also examined here. For synthesizing a truly immersive intelligent audio environment, interactivity between the user(s) and the audio environment is essential. Towards this goal, we present recently proposed multichannel-audio-specific models, namely the source/filter and the sinusoidal models, which allow for flexible manipulation and high-quality low-bitrate encoding, tailored for applications such as remote mixing and distributed immersive performances. In this chapter, audio coding methods are described, the emphasis being on multichannel audio coding. Multichannel audio is an important component in most of today’s multimedia applications including entertainment (Digital

