Results 1 - 10 of 164
Information Hiding -- A Survey
, 1999
Cited by 146 (0 self)
Information hiding techniques have recently become important in a number of application areas. Digital audio, video, and pictures are increasingly furnished with distinguishing but imperceptible marks, which may contain a hidden copyright notice or serial number or even help to prevent unauthorised copying directly. Military communications systems make increasing use of traffic security techniques which, rather than merely concealing the content of a message using encryption, seek to conceal its sender, its receiver or its very existence. Similar techniques are used in some mobile phone systems and schemes proposed for digital elections. Criminals try to use whatever traffic security properties are provided intentionally or otherwise in the available communications systems, and police forces try to restrict their use. However, many of the techniques proposed in this young and rapidly evolving field can trace their history back to antiquity; and many of them are surprisingly easy to circumvent. In this article, we try to give an overview of the field; of what we know, what works, what does not, and what are the interesting topics for research.
Speech Analysis
, 1998
Cited by 134 (0 self)
Contents
1 Introduction
  1.1 What is Speech Analysis?
    1.1.1 So what is an acoustic vector?
  1.2 Why Speech Analysis?
  1.3 The problems of speech analysis
  1.4 Standard references for this course
2 Background
  2.1 Sampling theory
    2.1.1 Sampling frequency
    2.1.2 Sampling resolution
  2.2 Linear filters
    2.2.1 Finite Impulse Response filters
    2.2.2 Infinite Impulse Response filters
  2.3 The source filter model of speech
3 Filter bank Analysis
  3.1 Spectrograms ...
An evaluation of earcons for use in auditory human-computer interfaces
- In Proceedings of ACM/IFIP INTERCHI'93
, 1993
Cited by 88 (37 self)
An evaluation of earcons was carried out to see whether they are an effective means of communicating information in sound. An initial experiment showed that earcons were better than unstructured bursts of sound and that musical timbres were more effective than simple tones. A second experiment was then carried out which improved upon some of the weaknesses shown up in Experiment 1 to give a significant improvement in recognition. From the results of these experiments some guidelines were drawn up for use in the creation of earcons. Earcons have been shown to be an effective method for communicating information in a human-computer interface.
Perceptual Coding of Digital Audio
- Proceedings of the IEEE
, 2000
Cited by 76 (0 self)
During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. The paper is organized as follows. First, psychoacoustic principles are described with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the Modified Discrete Cosine Transform (MDCT), a perfect reconstruction (PR) cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction (LP) parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of
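The perfect-reconstruction property the abstract attributes to the MDCT can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the textbook MDCT definition, a sine window (which satisfies the Princen-Bradley condition), a hop of N samples, and a 2/N synthesis scaling. The function names are hypothetical, and the direct O(N^2) sums are chosen for readability rather than speed.

```python
import math


def sine_window(N):
    # Sine window of length 2N; w[n]^2 + w[n+N]^2 == 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block, window):
    # 2N windowed input samples -> N transform coefficients.
    N = len(block) // 2
    return [
        sum(block[n] * window[n]
            * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]


def imdct(coeffs, window):
    # N coefficients -> 2N windowed, time-aliased output samples.
    N = len(coeffs)
    return [
        (2.0 / N) * window[n]
        * sum(coeffs[k]
              * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
              for k in range(N))
        for n in range(2 * N)
    ]


def analyze_synthesize(signal, N):
    # Overlap-add of 50%-overlapping frames; the aliasing introduced by
    # each frame cancels against its neighbour in the overlap region.
    window = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        frame = imdct(mdct(signal[start:start + 2 * N], window), window)
        for n, value in enumerate(frame):
            out[start + n] += value
    return out
```

Only samples covered by two overlapping frames are reconstructed; the first and last N samples of the signal see a single frame and therefore remain aliased in this sketch.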
A Review of The Cocktail Party Effect
- JOURNAL OF THE AMERICAN VOICE I/O SOCIETY
, 1992
Cited by 74 (3 self)
The "cocktail party effect"---the ability to focus one's listening attention on a single talker among a cacophony of conversations and background noise---has been recognized for some time. This specialized listening ability may be because of characteristics of the human speech production system, the auditory system, or high-level perceptual and language processing. This paper investigates the literature on what is known about the effect, from the original technical descriptions through current research in the areas of auditory streams and spatial display systems. The underlying goal of the paper is to analyze the components of this effect to uncover relevant attributes of the speech production and perception chain that could be exploited in future speech communication systems. The motivation is to build a system that can simultaneously present multiple streams of speech information such that a user can focus on one stream, yet easily shift attention to the others. A set of speech appli...
Sound-Source Recognition: A Theory and Computational Model
, 1999
Cited by 61 (0 self)
The ability of a normal human listener to recognize objects in the environment from only the sounds they produce is extraordinarily robust with regard to characteristics of the acoustic environment and of other competing sound sources. In contrast, computer systems designed to recognize sound sources function precariously, breaking down whenever the target sound is degraded by reverberation, noise, or competing sounds. Robust listening requires extensive contextual knowledge, but the potential contribution of sound-source recognition to the process of auditory scene analysis has largely been neglected by researchers building computational models of the scene analysis process. This thesis proposes a theory of sound-source recognition, casting recognition as a process of gathering information to enable the listener to make inferences about
A detailed investigation into the effectiveness of earcons
- In Proceedings of ICAD'92 (Santa Fe Institute, Santa Fe) Addison-Wesley
, 1992
Cited by 45 (16 self)
A detailed experimental evaluation of earcons was carried out to see whether they are an effective means of communicating information in sound. An initial experiment showed that earcons were better than unstructured bursts of sound and that musical timbres were more effective than simple tones. Musicians were shown to be no better than non-musicians when using musical timbres. A second experiment was then carried out which improved upon some of the weaknesses of the pitches and rhythms used in Experiment 1 to give a significant improvement in recognition. From the results some guidelines were drawn up for designers to use when creating earcons. These experiments have formally shown that earcons are an effective method for communicating complex information in sound.
Bark and ERB Bilinear Transforms
, 1999
Cited by 45 (2 self)
Use of a bilinear conformal map to achieve a frequency warping nearly identical to that of the Bark frequency scale is described. Because the map takes the unit circle to itself, its form is that of the transfer function of a first-order allpass filter. Since it is a first-order map, it preserves the model order of rational systems, making it a valuable frequency warping technique for use in audio filter design. A closed-form weighted-equation-error method is derived which computes the optimal mapping coefficient as a function of sampling rate, and the solution is shown to be generally indistinguishable from the optimal least-squares solution. The optimal Chebyshev mapping is also found to be essentially identical to the optimal least-squares solution. The expression...
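The statement that a first-order allpass map takes the unit circle to itself can be checked numerically with a short sketch. The sign convention of the map and the coefficient value used below are assumptions made for illustration; the paper's closed-form expression for the optimal coefficient is truncated above and is not reproduced here.

```python
import cmath
import math


def allpass_map(z, rho):
    # First-order allpass (bilinear) map; rho is a real coefficient
    # with |rho| < 1. The sign convention here is an assumption.
    return (z + rho) / (1 + rho * z)


def warp_frequency(omega, rho):
    # Phase of allpass_map(exp(j*omega), rho) in closed form:
    # radian frequency omega in [0, pi] maps to a warped frequency
    # that is also in [0, pi].
    return omega - 2.0 * math.atan2(rho * math.sin(omega),
                                    1.0 + rho * math.cos(omega))


if __name__ == "__main__":
    rho = 0.5  # hypothetical value, not the paper's optimal coefficient
    for i in range(5):
        omega = math.pi * i / 4
        print(f"{omega:.4f} -> {warp_frequency(omega, rho):.4f}")
```

Because the map is first order, substituting it for z in a rational transfer function leaves the numerator and denominator orders unchanged, which is the order-preserving property the abstract mentions.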
Perceptual audio rendering of complex virtual environments
- ACM Transactions on Graphics (SIGGRAPH Conf. Proceedings)
, 2004
Cited by 44 (15 self)
Figure 1: Left, an overview of a test virtual environment, containing 174 sound sources. All vehicles are moving. Mid-left, the magenta dots indicate the locations of the sound sources while the red sphere represents the listener. Notice that the train and the river are extended sources modeled by collections of point sources. Mid-right, ray-paths from the sources to the listener. Paths in red correspond to the perceptually masked sound sources. Right, the blue boxes are clusters of sound sources with the representatives of each cluster in grey. Combination of auditory culling and spatial clustering allows us to render such complex audio-visual scenes in real-time.
We propose a real-time 3D audio rendering pipeline for complex virtual scenes containing hundreds of moving sound sources. The approach, based on auditory culling and spatial level-of-detail, can handle more than ten times the number of sources commonly available on consumer 3D audio hardware, with minimal decrease in audio quality. The method performs well for both indoor and outdoor environments. It leverages the limited capabilities of audio hardware for many applications, including interactive architectural acoustics simulations and automatic 3D voice management for video games. Our approach dynamically eliminates inaudible sources and groups the remaining audible sources into a budget number of clusters. Each cluster is represented by one impostor sound source, positioned using perceptual criteria. Spatial audio processing is then performed only on the impostor sound sources rather than on every original source thus greatly reducing the computational cost. A pilot validation study shows that degradation in audio quality, as well as localization impairment, are limited and do not seem to vary significantly with the cluster budget. We conclude that our real-time perceptual audio rendering pipeline can generate spatialized audio for complex auditory environments without introducing disturbing changes in the resulting perceived soundfield.
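The cull-then-cluster idea in this abstract can be sketched roughly as follows. This is not the authors' pipeline: a fixed power threshold stands in for perceptual masking, greedy farthest-point seeding stands in for their clustering, and a power-weighted centroid stands in for the perceptual positioning criteria. `Source`, `cull`, and `cluster` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Source:
    position: tuple  # (x, y, z) coordinates
    power: float     # linear signal power at the listener


def cull(sources, threshold):
    # Assumption: a source counts as inaudible when its power falls
    # below a fixed threshold (a stand-in for masking analysis).
    return [s for s in sources if s.power >= threshold]


def cluster(sources, budget):
    # Group sources into at most `budget` clusters and return one
    # impostor source per non-empty cluster.
    if not sources or budget < 1:
        return []
    # Seed with the loudest source, then repeatedly add the source
    # farthest from all existing seeds.
    seeds = [max(sources, key=lambda s: s.power)]
    while len(seeds) < min(budget, len(sources)):
        seeds.append(max(
            sources,
            key=lambda s: min(math.dist(s.position, c.position)
                              for c in seeds)))
    # Assign every source to its nearest seed.
    groups = [[] for _ in seeds]
    for s in sources:
        nearest = min(range(len(seeds)),
                      key=lambda i: math.dist(s.position, seeds[i].position))
        groups[nearest].append(s)
    # One impostor per group, placed at the power-weighted centroid
    # and carrying the summed power of its members.
    impostors = []
    for group in groups:
        total = sum(s.power for s in group)
        if total > 0:
            centroid = tuple(
                sum(s.power * s.position[d] for s in group) / total
                for d in range(3))
            impostors.append(Source(centroid, total))
    return impostors
```

Spatial processing would then run once per impostor instead of once per original source, which is where the cost saving described in the abstract comes from.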
Magnitude estimation of conceptual data dimensions for use in sonification
- Journal of Experimental Psychology: Applied
, 2002

