From Remote Media Immersion to Distributed Immersive Performance
- in Proc. ACM SIGMM Workshop on Experiential Telepresence (ETP)
, 2003
Cited by 17 (2 self)
We present the architecture, technology and experimental applications of a real-time, multi-site, interactive and collaborative environment called Distributed Immersive Performance (DIP). The objective of DIP is to develop the technology for live, interactive musical performances in which the participants - subsets of musicians, the conductor and the audience - are in different physical locations and are interconnected by very high fidelity multichannel audio and video links. DIP is a specific realization of broader immersive technology - the creation of the complete aural and visual ambience that places a person or a group of people in a virtual space where they can experience events occurring at a remote site or communicate naturally regardless of their location. The DIP experimental system has interaction sites and servers in different locations on the USC campus and at several partners, including the New World Symphony of Miami Beach, FL. The sites have different types of equipment to test the effects of video and audio fidelity on the ease of use and functionality for different applications. Many sites have high-definition (HD) video or digital video (DV) quality images projected onto wide-screen wall displays completely integrated with an immersive audio reproduction system for a seamless, fully three-dimensional aural environment with the correct spatial sound localization for participants. The system is capable of storage and playback of the many streams of synchronized audio and video data (immersidata), and utilizes novel protocols for the low-latency, seamless, synchronized real-time delivery of immersidata over local area networks and wide-area networks such as Internet2. We discuss several recent interactive experiments using the system and many technical cha...
Fundamental and Technological Limitations of Immersive Audio Systems
- Proceedings of the IEEE
, 1998
Cited by 16 (6 self)
Numerous applications are currently envisioned for immersive audio systems. The principal function of such systems is to synthesize, manipulate, and render sound fields in real time. In this paper, we examine several fundamental and technological limitations that impede the development of seamless immersive audio systems. Such limitations stem from signal-processing requirements, acoustical considerations, human listening characteristics, and listener movement. We present a brief historical overview to outline the development of immersive audio technologies and discuss the performance and future research directions of immersive audio systems with respect to such limits. Last, we present a novel desktop audio system with integrated listener-tracking capability that circumvents several of the technological limitations faced by today’s digital audio workstations. Keywords: acoustic signal processing, audio systems, auditory system, multimedia systems, signal processing.
Multichannel Recursive-Least-Squares Algorithms and Fast-Transversal-Filter Algorithms for Active Noise Control and Sound Reproduction Systems
, 2000
Cited by 5 (2 self)
In the last ten years, there has been much research on active noise control (ANC) systems and transaural sound reproduction (TSR) systems. In those fields, multichannel FIR adaptive filters are extensively used. For the learning of FIR adaptive filters, recursive-least-squares (RLS) algorithms are known to produce a faster convergence speed than stochastic gradient descent techniques, such as the basic least-mean-squares (LMS) algorithm or even the fast convergence Newton-LMS, the gradient-adaptive-lattice (GAL) LMS and the discrete-cosine-transform (DCT) LMS algorithms. In this paper, multichannel RLS algorithms and multichannel fast-transversal-filter (FTF) algorithms are introduced, with the structures of some stochastic gradient descent algorithms used in ANC: the filtered-x LMS, the modified filtered-x LMS and the adjoint-LMS. The new algorithms can be used in ANC systems or for the deconvolution of sounds in TSR systems. Simulation results comparing the convergence speed, the numerical stability and the performance using noisy plant models for the different multichannel algorithms will be presented, showing the large gain of convergence speed that can be achieved by using some of the introduced algorithms.
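The convergence gap between RLS and LMS mentioned in this abstract can be illustrated with a minimal sketch. It is not the multichannel filtered-x structure the paper introduces: it is a plain single-channel system-identification toy, and the plant coefficients `true_w`, the AR(1) input model, the step size `mu`, the forgetting factor `lam` and the initial inverse-correlation scaling are all assumed values chosen for illustration.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def lms_step(w, x, d, mu):
    """One LMS update: step along the instantaneous gradient estimate."""
    e = d - dot(w, x)
    return [wi + mu * e * xi for wi, xi in zip(w, x)]


def rls_step(w, p, x, d, lam):
    """One exponentially weighted RLS update of the weights w and the
    inverse-correlation matrix p (forgetting factor lam)."""
    n = len(x)
    px = [dot(row, x) for row in p]                  # P x
    denom = lam + dot(x, px)                         # lam + x^T P x
    k = [v / denom for v in px]                      # gain vector
    e = d - dot(w, x)                                # a priori error
    w = [wi + ki * e for wi, ki in zip(w, k)]
    xp = [sum(x[i] * p[i][j] for i in range(n)) for j in range(n)]  # x^T P
    p = [[(p[i][j] - k[i] * xp[j]) / lam for j in range(n)] for i in range(n)]
    return w, p


def compare(n_samples=200, seed=0):
    """Identify a hypothetical 4-tap FIR plant with both algorithms and
    return the squared coefficient error of each after n_samples."""
    rng = random.Random(seed)
    true_w = [0.8, -0.4, 0.2, 0.1]                   # assumed plant, illustration only
    n = len(true_w)
    w_lms, w_rls = [0.0] * n, [0.0] * n
    p = [[100.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    x, u = [0.0] * n, 0.0
    for _ in range(n_samples):
        u = 0.9 * u + rng.gauss(0.0, 1.0)            # correlated (AR(1)) input
        x = [u] + x[:-1]                             # tapped delay line
        d = dot(true_w, x)                           # noise-free desired signal
        w_lms = lms_step(w_lms, x, d, mu=0.005)
        w_rls, p = rls_step(w_rls, p, x, d, lam=0.99)

    def misalignment(w):
        return sum((a - b) ** 2 for a, b in zip(w, true_w))

    return misalignment(w_lms), misalignment(w_rls)


lms_err, rls_err = compare()
print(f"LMS misalignment: {lms_err:.3e}")
print(f"RLS misalignment: {rls_err:.3e}")
```

The input is deliberately correlated, since a large eigenvalue spread in the input correlation matrix is the situation in which plain LMS slows down while RLS does not. The price is the per-sample update of the matrix `p`, which is the cost that fast-transversal-filter variants aim to reduce.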
Design of Cross-talk Cancellation Networks by using Fast Deconvolution
- Soc. Convention in Munich
, 1999
Cited by 4 (3 self)
Binaural material, such as a dummy-head recording, is generally intended for playback over headphones [1]. In order to achieve the equivalent effect when such material is played back over two loudspeakers, a cross-talk cancellation network must be used to compensate for the cross-talk (the sound that is reproduced at the right ear by the left loudspeaker, and vice versa) and the head-related transfer functions (HRTFs) associated with a real listener [2]-[8]. In practice, a cross-talk cancellation network can be implemented by a two-by-two matrix of digital filters. Unfortunately, though, efficient cross-talk cancellation at low frequencies is possible only if each element of the cross-talk cancellation network is capable of providing a significant boost of those frequencies [8]. This is because the difference between the direct path HRTF and the cross-talk path HRTF is very small at low frequencies, and so one ends up having to invert an almost singular two-by-two ...
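The low-frequency boost described in this abstract can be reproduced numerically at a single frequency. The sketch below is an illustration under stated assumptions, not the paper's design procedure: it assumes a symmetric listener, so the plant matrix has one direct-path value and one cross-talk value, the numeric values are hypothetical, and the Tikhonov-style regularization parameter `beta` is an assumed way of limiting the boost.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def symmetric_plant(direct, cross):
    """2x2 loudspeaker-to-ear matrix at one frequency for a symmetric setup."""
    return [[complex(direct), complex(cross)],
            [complex(cross), complex(direct)]]


def crosstalk_canceller(plant, beta=0.0):
    """Regularized inverse H = (C^H C + beta I)^-1 C^H of the 2x2 plant C.
    With beta = 0 this is the exact inverse of an invertible plant."""
    ch = [[plant[j][i].conjugate() for j in range(2)] for i in range(2)]  # C^H
    gram = matmul2(ch, plant)
    gram[0][0] += beta
    gram[1][1] += beta
    det = gram[0][0] * gram[1][1] - gram[0][1] * gram[1][0]
    gram_inv = [[gram[1][1] / det, -gram[0][1] / det],
                [-gram[1][0] / det, gram[0][0] / det]]
    return matmul2(gram_inv, ch)


def peak_gain(filters):
    """Largest magnitude among the four filter responses at this frequency."""
    return max(abs(h) for row in filters for h in row)


# Hypothetical values: at low frequencies the two paths are nearly equal,
# at higher frequencies the cross-talk path is weaker and phase-shifted.
low = symmetric_plant(1.0, 0.98)
high = symmetric_plant(1.0, 0.5j)

print("low frequency, exact inverse:", peak_gain(crosstalk_canceller(low)))
print("low frequency, beta = 0.01:  ", peak_gain(crosstalk_canceller(low, beta=0.01)))
print("high frequency, exact inverse:", peak_gain(crosstalk_canceller(high)))
```

When the direct and cross-talk entries are almost equal, the determinant of the plant is close to zero and the exact inverse needs a large gain. A small `beta` caps that gain at the cost of imperfect cancellation at those frequencies.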
Immersive Sound Rendering Using Laser-Based Tracking
- Proceedings of the 109th Audio Engineering Society (AES) Convention, preprint No. 5227
, 2000
Cited by 3 (0 self)
In this paper we describe the underlying concepts behind the spatial sound renderer built at the University of Southern California’s Immersive Audio Laboratory. In creating this sound rendering system, we faced three main challenges: first, rendering sound using head-related transfer functions (HRTFs); second, cancelling the crosstalk terms; and third, localizing the listener’s ears. For the spatial rendering of sound we use a two-layer method of modeling the HRTFs. The first layer accurately reproduces the ITDs and IADs, and the second layer reproduces the spectral characteristics of the HRTFs. A novel method for generating the required crosstalk cancellation filters as the listener moves was developed based on low-rank modeling. Using the Karhunen-Loeve expansion, we can interpolate among listener positions from a small number of HRTF measurements. Finally, we present a head detection algorithm for tracking the location of the listener’s ears in real time using a laser scanner.
Personal 3D audio system with loudspeakers
- IEEE International Workshop on Hot Topics in 3D, in conjunction with ICME
, 2010
Cited by 3 (3 self)
Traditional 3D audio systems often have a limited sweet spot within which the user can perceive 3D effects successfully. In this paper, we present a personal 3D audio system with loudspeakers that has unlimited sweet spots. The idea is to have a camera track the user’s head movement and recompute the crosstalk canceller filters accordingly. As far as the authors are aware, our system is the first non-intrusive 3D audio system that adapts to both the head position and orientation with six degrees of freedom. The effectiveness of the proposed system is demonstrated with subjective listening tests comparing our system against traditional non-adaptive systems.
Numerically stable fast convergence least-squares algorithms for multichannel active sound cancellation systems and sound deconvolution systems
, 2002
Cited by 3 (1 self)
In recent years, recursive least-squares (RLS) algorithms and fast-transversal-filters (FTF) algorithms have been introduced for multichannel active sound cancellation (ASC) systems and multichannel sound deconvolution (MSD) systems. It was reported that these algorithms can greatly improve the convergence speed of the ASC/MSD systems using adaptive FIR filters. However, numerical instability of the algorithms is an issue that needs to be resolved. In this paper, extensions of numerically stable realisations of RLS algorithms such as the inverse QR-RLS, the QR-decomposition least-squares-lattice (QRD-LSL) and the symmetry preserving RLS algorithms are introduced for the specific problem of multichannel ASC/MSD. Multichannel versions of some of these algorithms have previously been published for prediction or identification systems, but not for control systems. The case of underdetermined ASC/MSD systems (i.e. systems with more actuators than error sensors) is also considered, to show that in these cases it may be required to use constrained algorithms in order to have numerical stability. Constrained algorithms for multichannel ASC/MSD systems are therefore introduced for two types of constraints: minimisation of the actuator signals power and minimization of the adaptive filters square coefficients. Simulation results are shown to verify the numerical stability of the algorithms introduced in the paper.
Multichannel Affine and Fast Affine Projection Algorithms for Active Noise Control and Acoustic Equalization Systems
, 2003
Cited by 2 (1 self)
In the field of adaptive signal processing, it is well known that affine projection algorithms, or their low-computational implementations, fast affine projection algorithms, can produce a good tradeoff between convergence speed and computational complexity. Although these algorithms typically do not provide the same convergence speed as recursive-least-squares algorithms, they can provide a much improved convergence speed compared to stochastic gradient descent algorithms, without the high increase of the computational load or the instability often found in recursive-least-squares algorithms. In this paper, multichannel affine and fast affine projection algorithms are introduced for active noise control or acoustic equalization. Multichannel fast affine projection algorithms have been previously published for acoustic echo cancellation, but the problem of active noise control or acoustic equalization is a very different one, leading to different structures, as explained in the paper. The computational complexity of the new algorithms is evaluated, and it is shown through simulations that not only can the new algorithms provide the expected tradeoff between convergence performance and computational complexity, they can also provide the best convergence performance (even over recursive-least-squares algorithms) when nonideal noisy acoustic plant models are used in the adaptive systems.
3-D Audio With Dynamic Tracking For Multimedia Environments
Cited by 2 (0 self)
This paper deals with a 3-D audio system that has been developed for desktop multimedia environments. The system can place virtual sources at arbitrary azimuths and elevations around the listener's head using HRTF binaural synthesis. The setup considered is a listener seated in front of a computer with two loudspeakers placed at either side of the monitor. Transaural reproduction over the loudspeakers is used to render the sound field at the listener's ears. Furthermore, the system can cope with slight movements of the listener's head. Head position is monitored by means of a simple computer vision algorithm. Four head position coordinates (x,y,z,f) are continuously estimated to allow free movement of the listener. Cross-talk cancellation filters and virtual source locations are updated according to these head coordinates. 1. INTRODUCTION The evolution of multimedia technologies together with the increasing computational power of the personal computers ...
Enhancing Loudspeaker-based 3D Audio with Room Modeling
Cited by 1 (1 self)
For many years, spatial (3D) sound using headphones has been widely used in a number of applications. A rich spatial sensation is obtained by using head-related transfer functions (HRTFs) and playing the appropriate sound through headphones. In theory, loudspeaker audio systems would be capable of rendering 3D sound fields almost as rich as headphones, as long as the room impulse responses (RIRs) between the loudspeakers and the ears are known. In practice, however, obtaining these RIRs is hard, and the performance of loudspeaker-based systems is far from perfect. New hope has recently been raised by a system that tracks the user's head position and orientation and incorporates them into the RIR estimates in real time. That system made two simplifying assumptions: it used generic HRTFs, and it ignored room reverberation. In this paper we tackle the second problem: we incorporate a room reverberation estimate into the RIRs. Note that this is a nontrivial task: RIRs vary significantly with the listener's position, and even if one could measure them at a few points, they are notoriously hard to interpolate. Instead, we take an indirect approach: we model the room, and from that model we obtain an estimate of the main reflections. The position and characteristics of the walls do not vary with the user's movement, yet they allow an estimate of the RIR to be computed quickly for each new user position. Of course, the key question is whether the estimates are good enough. We show an improvement in localization perception of up to 32% (i.e., reducing the average error from 23.5° to 15.9°).
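One common way to estimate the main reflections from a room model is the first-order image-source construction for a rectangular room. The abstract does not say that this is the method the authors use, so the sketch below is an assumed stand-in. The room dimensions, positions, the single wall `reflectivity` value and the simple 1/distance spreading law are all hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def first_order_images(source, room):
    """Mirror the source across each of the six walls of a box-shaped room
    whose corners are (0, 0, 0) and room = (Lx, Ly, Lz)."""
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def reflection_taps(source, listener, room, reflectivity=0.8):
    """Return (delay in seconds, gain) pairs for the direct path and the six
    first-order reflections, sorted by arrival time."""
    taps = []
    paths = [(tuple(source), 1.0)]
    paths += [(img, reflectivity) for img in first_order_images(source, room)]
    for position, wall_gain in paths:
        distance = math.dist(position, listener)
        taps.append((distance / SPEED_OF_SOUND, wall_gain / distance))
    return sorted(taps)


# Hypothetical 5 m x 4 m x 3 m room with a loudspeaker and a tracked listener.
room = (5.0, 4.0, 3.0)
loudspeaker = (1.0, 1.0, 1.5)
listener = (3.0, 2.0, 1.5)

for delay, gain in reflection_taps(loudspeaker, listener, room):
    print(f"delay {delay * 1000:.2f} ms, gain {gain:.3f}")
```

Because the wall positions are fixed, only the distances from the image positions to the listener need to be recomputed when the tracked head position changes, which is why an estimate of this kind is cheap to refresh for each new position.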

