Results 1 - 10 of 18
On combining classifiers
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
Cited by 749 (21 self)
We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions, the sum rule, outperforms other classifier combination schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.
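To make the comparison of combination rules concrete, the sketch below applies several fixed rules to per-classifier posterior estimates. It is a minimal illustration, not the paper's implementation: it assumes equal class priors (so each rule reduces to an argmax over a simple reduction of the posteriors), and the function name `combine` and the numbers in `example` are hypothetical.

```python
from math import prod
from statistics import median

# Fixed combination rules, each applied across classifiers for one class.
REDUCERS = {"sum": sum, "product": prod, "max": max, "min": min, "median": median}


def combine(posteriors, rule="sum"):
    """Return the index of the winning class.

    posteriors[i][j] is classifier i's posterior estimate for class j.
    Assumption: equal class priors, so priors drop out of every rule.
    """
    reduce_fn = REDUCERS[rule]
    n_classes = len(posteriors[0])
    scores = [reduce_fn([p[j] for p in posteriors]) for j in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)


# Hypothetical estimates: two classifiers favour class 0, while a third
# assigns class 0 a near-zero posterior (for example, an estimation error).
example = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.01, 0.99],
]

sum_winner = combine(example, "sum")          # sums: 1.71 vs 1.29
product_winner = combine(example, "product")  # products: 0.0072 vs 0.0198
```

With these made-up numbers the sum rule selects class 0 and the product rule selects class 1: a single near-zero estimate can veto a class under the product rule, while the sum rule only shifts the total slightly. This is one informal way to picture the sensitivity to estimation errors that the abstract mentions.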
Face Recognition: A Literature Survey
2000
Cited by 570 (19 self)
... This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition,
Combining Evidence in Personal Identity Verification Systems
Pattern Recognition Letters, 1997
Cited by 42 (0 self)
A methodology for fusing multiple instances of biometric data to improve the performance of a personal identity verification system is developed. The fusion problem is formulated in the framework of Bayesian estimation theory. The effect of different fusion strategies on the error probability is analysed theoretically. The proposed methodology is then demonstrated on the problem of personal identity verification using multiple facial images. Experimental studies on the M2VTS database confirm the predicted improvements in performance. A reduction in error rates of up to 40% is achieved. The performance gains are initially monotonic but tend to saturate after the first few observations have been integrated. It is also shown that fusion based on a rank-order statistic, i.e. the median, is robust to outliers.
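The robustness claim for median fusion can be illustrated with a small sketch. The match scores, the threshold, and the function name `verify` are hypothetical and are not taken from the paper or from the M2VTS experiments; the sketch only contrasts mean and median fusion of several per-image scores for one identity claim.

```python
from statistics import mean, median


def verify(scores, threshold, fuse=median):
    """Accept the identity claim if the fused score reaches the threshold.

    scores: per-observation match scores (higher means a better match).
    fuse:   function that reduces the scores to a single value.
    """
    return fuse(scores) >= threshold


# Hypothetical scores from five facial images of a genuine client.
# The last image is an outlier (for example, a badly localised face).
scores = [0.82, 0.79, 0.85, 0.80, -3.0]
threshold = 0.5  # assumed operating point

accepted_by_median = verify(scores, threshold, fuse=median)  # median = 0.80
accepted_by_mean = verify(scores, threshold, fuse=mean)      # mean is about 0.05
```

Here a single corrupted observation drags the mean below the threshold and causes a false rejection, while the median is unaffected.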
Acoustic-labial speaker verification
Audio and Video based Person Authentication - AVBPA97, volume LNCS-1206, 1997
"... defined), by Ben Gold and Nelson Morgan. ..."
An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition
Advances in Neural Information Processing Systems, NIPS 15, 2003
Cited by 28 (12 self)
This paper presents a novel Hidden Markov Model architecture to model the joint probability of pairs of asynchronous sequences describing the same event. It is based on two other Markovian models, namely Asynchronous Input/Output Hidden Markov Models and Pair Hidden Markov Models. An EM algorithm to train the model is presented, together with a Viterbi decoder that can be used to obtain both the optimal state sequence and the alignment between the two sequences. The model has been tested on an audio-visual speech recognition task using the M2VTS database and yielded robust performance under various noise conditions.
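For readers unfamiliar with Viterbi decoding, the sketch below shows the standard algorithm for an ordinary single-stream discrete HMM. It is not the asynchronous model described above and does not recover any alignment between two streams; it only illustrates how an optimal state sequence is obtained by dynamic programming. The states, symbols, and probabilities are toy values chosen for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a discrete single-stream HMM."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for symbol in obs[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            p, r = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (p * emit_p[s][symbol], r)
        best.append(layer)

    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path


# Toy two-state model (all values are illustrative assumptions).
states = ("A", "B")
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
}

decoded = viterbi(("x", "y", "z"), states, start_p, trans_p, emit_p)
```

For this toy model the decoded path is A, A, B. A decoder for the asynchronous model would additionally have to track how far each of the two streams has advanced, which this sketch does not attempt.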
Automatic Person Verification Using Speech and Face Information
2003
Cited by 23 (7 self)
Identity verification systems are an important part of our everyday life. A typical example is the Automatic Teller Machine (ATM), which employs a simple identity verification scheme: the user is asked to enter their secret password after inserting their ATM card; if the password matches the one assigned to the card, the user is allowed access to their bank account. This scheme suffers from a major drawback: only the validity of the combination of a certain possession (the ATM card) and certain knowledge (the password) is verified. The ATM card can be lost or stolen, and the password can be compromised. Thus new verification methods have emerged, where the password has either been replaced by, or used in addition to, biometrics such as the person's speech, face image or fingerprints. Apart from the ATM example described above, biometrics can be applied to other areas, such as telephone and internet-based banking, airline reservations and check-in, as well as forensic work and law enforcement applications. Biometric systems
Fast Face Detection using MLP and FFT
Proc. Second International Conference on Audio and Video-based Biometric Person Authentication (AVBPA'99), 1999
Cited by 14 (0 self)
Neural networks have shown their reliability and robustness for the face detection task (Rowley, Baluja and Kanade, 1995). However, the time-consuming processing needed by neural networks has prevented them from being a practical tool. We propose a new technique that significantly reduces the time needed by a trained network (an MLP in our case) to detect a face in a large image. We reformulate neural activities in the hidden layer of the MLP in terms of filter convolution, enabling the use of the Fourier transform for an efficient computation of the neural activities. The method was applied to face detection in still images as well as on live video sequences.
1 Introduction
Face detection is a fundamental step before the recognition or identification procedure. Its reliability and time-response have a major influence on the performance and usability of a face recognition system. The large variability of human faces causes major difficulties in the design of a model that can encompasse...
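The idea of computing hidden-layer activities through the Fourier transform can be sketched in one dimension. Sliding a hidden unit's weight vector over every window of an input is a cross-correlation, which equals a convolution with the reversed weights and can therefore be computed as a pointwise product in the frequency domain. The sketch below is a 1-D analogue under stated assumptions, not the paper's 2-D method: it covers only the linear pre-activation of a single hidden unit (bias and nonlinearity would be applied afterwards), and all function names and values are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def sliding_dot_direct(signal, weights):
    """Dot product of the weights with every window of the signal."""
    m = len(weights)
    return [sum(w * signal[i + j] for j, w in enumerate(weights))
            for i in range(len(signal) - m + 1)]


def sliding_dot_fft(signal, weights):
    """Same result as sliding_dot_direct, computed in the frequency domain."""
    m = len(weights)
    size = 1
    while size < len(signal) + m - 1:  # zero-pad to avoid circular wrap-around
        size *= 2
    a = fft([complex(v) for v in signal] + [0j] * (size - len(signal)))
    # Cross-correlation equals convolution with the reversed weight vector.
    b = fft([complex(v) for v in reversed(weights)] + [0j] * (size - m))
    conv = fft([p * q for p, q in zip(a, b)], inverse=True)
    # Entry i + m - 1 of the linear convolution is the window starting at i.
    return [conv[i + m - 1].real / size for i in range(len(signal) - m + 1)]


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy 1-D "image"
weights = [1.0, 0.0, -1.0]               # toy weights of one hidden unit

direct = sliding_dot_direct(signal, weights)
via_fft = sliding_dot_fft(signal, weights)
```

Both functions return the same pre-activations up to floating-point error. The direct version costs on the order of N·M multiplications for a signal of length N and M weights, while the transform-based version costs on the order of N log N, which is where the speed-up for large inputs comes from.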
Multimodal speech processing using asynchronous hidden markov models
Information Fusion, 2004
Cited by 12 (1 self)
This paper argues that for some multimodal tasks involving more than one stream of data representing the same sequence of events, it can be beneficial to desynchronize the streams in order to maximize their joint likelihood. We thus present a novel Hidden Markov Model architecture to model the joint probability of pairs of asynchronous sequences describing the same sequence of events. An Expectation-Maximization algorithm to train the model is presented, together with a Viterbi decoding algorithm that can be used to obtain both the optimal state sequence and the alignment between the two sequences. The model was tested on two audio-visual speech processing tasks, namely speech recognition and text-dependent speaker verification, both using the M2VTS database. Robust performance under various noise conditions was obtained in both cases.
Key words: speech recognition, speaker verification, multimodal fusion, asynchronous fusion, joint EM estimation, HMM
Fast Object Detection using MLP and FFT
1997
Cited by 5 (0 self)
We propose a new technique that significantly reduces the time needed by a trained network (an MLP in our case) to detect a face in a large image. We reformulate neural activities in the hidden layer of the MLP in terms of filter convolution, enabling the use of the Fourier transform for an efficient computation of the neural activities. A formal proof and a complexity analysis are presented. Finally, some examples illustrate the approach.
IDIAP-RR 97-11
1 Introduction
Face detection is the fundamental step before the recognition or identification procedure. Its reliability and time-response have a major influence on the performance and usability of the whole face recognition system. The large variability of human faces causes major difficulties in the design of a model that could encompass all possible faces [2]. Appearance-based approaches as well as learning-based approaches seem to be better suited for such a task. A set of representative faces is necessary to find the implicit m...
The VidTIMIT Database
2002
Cited by 5 (1 self)
This communication describes the multi-modal VidTIMIT database, which can be useful for research involving mono- or multi-modal speech recognition or person authentication. It comprises video and corresponding audio recordings of 43 volunteers reciting short sentences selected from the NTIMIT corpus [8].

