Deformable Markov model templates for time-series pattern matching
- In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000
Cited by 47 (3 self)
Lexical Modeling in a Speaker Independent Speech Understanding System
- 1993
Cited by 39 (8 self)
Over the past 40 years, significant progress has been made in the fields of speech recognition and speech understanding. Current state-of-the-art speech recognition systems are capable of achieving word-level accuracies of 90% to 95% on continuous speech recognition tasks using 5000 words. Even larger systems, capable of recognizing 20,000 words, are just now being developed. Speech understanding systems have recently been developed that perform fairly well within a restricted domain. While the size and performance of modern speech recognition and understanding systems are impressive, it is evident to anyone who has used these systems that the technology is primitive compared to our own human ability to understand speech. Some of the difficulties hampering progress in the fields of speech recognition and understanding stem from the many sources of variation that occur during human communication. One such source of variation is the different ways that words can be pronounced. There are many causes of pronunciation variation, such as the phonetic environment in which the word occurs, the dialect of the speaker, ...
On-line Handwritten Signature Verification using Hidden Markov Model Features
- In Proc. of ICDAR, 1997
Cited by 12 (0 self)
A method for the automatic verification of on-line handwritten signatures using both global and local features is described. The global and local features capture various aspects of signature shape and dynamics of signature production. We demonstrate that with the addition to the global features of a local feature based on the signature likelihood obtained from Hidden Markov Models (HMM), the performance of signature verification improves significantly. The current version of the program has 2.5% equal error rate. At the 1% false rejection (FR) point, the addition of the local information to the algorithm with only global features reduced the false acceptance (FA) rate from 13% to 5%.
1 Introduction
Signature verification is a common behavioral biometric to identify human beings for purposes of establishing their authority to complete an automated transaction, gaining control of a computer, or gaining physical entry to a protected area. Signatures are particularly useful for identif...
Modeling Duration In A Hidden Markov Model With The Exponential Family
- In Proc. ICASSP, 1993
Cited by 10 (3 self)
Explicit duration modeling has been shown to increase the effectiveness of hidden Markov models in automatic speech recognition. Ferguson found the optimum parameters of the duration model for the case where duration is assumed to be distributed according to a non-parametric probability mass function. Levinson determined the best gamma density to model duration. In this paper, duration is assumed to be modeled by some probability mass function in the exponential family. An iterative procedure for determining the maximum likelihood parameters is presented. Also given is a method for choosing an appropriate member from the exponential family.
1 Introduction
The performance of speech recognizers can be improved by accurately modeling the duration of short speech events. Ferguson [1] extended hidden Markov model (HMM) theory to include a non-parametric probability mass function for the duration of each state. The total number of duration parameters for a speech model is typically very...
Document Classification using Layout Analysis
- In DEXA Workshop, 1999
Cited by 9 (1 self)
This paper describes methods for document image classification at the spatial layout level. The goal is to develop fast algorithms for initial document type classification without OCR, which can then be verified using more elaborate methods based on more detailed geometric and syntactic models. A novel feature set called interval encoding is introduced to capture elements of spatial layout. This feature set encodes region layout information in fixed-length vectors by capturing structural characteristics of the image. We demonstrate the usefulness of these features, derived from interval coding, in a hidden Markov model based page layout classification system that is trainable and extendible.
A coupled duration-focused architecture for realtime music to score alignment
- IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 9 (0 self)
The capacity for real-time synchronization and coordination is a common ability among trained musicians performing a music score, and it presents an interesting challenge for machine intelligence. Compared to speech recognition, which has influenced many music information retrieval systems, music's temporal dynamics and complexity pose challenging problems to common approximations regarding time modeling of data streams. In this paper, we propose a design for a real-time music-to-score alignment system. Given a live recording of a musician playing a music score, the system is capable of following the musician in real time within the score and decoding the tempo (or pace) of the performance. The proposed design features two coupled audio and tempo agents within a unique probabilistic inference framework that adaptively updates its parameters based on the real-time context. Online decoding is achieved through the collaboration of the coupled agents in a Hidden Hybrid Markov/semi-Markov framework, where prediction feedback of one agent affects the behavior of the other. We perform evaluations for both real-time alignment and the proposed temporal model. An implementation of the presented system has been widely used in real concert situations worldwide, and readers are encouraged to access the actual system and experiment with the results.
Index Terms: automatic musical accompaniment, hidden hybrid Markov/semi-Markov models, computer music.
A Parallel Implementation of a Hidden Markov Model with Duration Modeling for Speech Recognition
- 1995
Cited by 7 (0 self)
Hidden Markov models (HMMs) are currently the most successful paradigm for speech recognition. Although explicit duration continuous HMMs more accurately model speech than HMMs with implicit duration modeling, the cost of accurate duration modeling is often considered prohibitive. This paper describes a parallel implementation of an HMM with explicit duration modeling for spoken language recognition on the MasPar MP-1. The MP-1 is a fine-grained SIMD architecture with 16384 processing elements (PEs) arranged in a 128x128 mesh. By exploiting the massive parallelism of explicit duration HMMs, development and testing is practical even for large amounts of data. The result of this work is a parallel speech recognizer that can train a phone recognizer in real time. We present several extensions that include context dependent modeling, word recognition, and implicit duration HMMs.
1 Introduction
While hidden Markov models (HMMs) have been a popular and effective method of recognizing spoken...
Prosody Dependent Speech Recognition With Explicit Duration Modelling At . . .
- In Proc. EUROSPEECH'03, 2003
Cited by 5 (3 self)
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. The prosody attribute that we investigate in this study is the duration lengthening effect of the speech segments in the vicinity of intonational phrase boundaries. An Explicit Duration Hidden Markov Model (EDHMM) is implemented to provide an accurate phoneme duration model. This study is conducted on the Boston University Radio News Corpus, with prosodic boundaries marked using the ToBI labelling system. We found that lengthening of the phrase-final rhymes can be reliably modelled by the EDHMM, which significantly improves the prosody dependent acoustic modelling. Conversely, no systematic duration variation is found at phrase-initial position. With prosody dependence implemented in the acoustic model, pronunciation model and language model, both word recognition accuracy and boundary recognition accuracy are improved by 1% over systems without prosody dependence.
An MCMC Sampling Approach to Estimation of Nonstationary Hidden Markov Models
- IEEE Trans. Signal Processing, 2002
Cited by 5 (0 self)
Hidden Markov models (HMMs) represent a very important tool for analysis of signals and systems. In the past two decades, HMMs have attracted the attention of various research communities, including the ones in statistics, engineering, and mathematics. Their extensive use in signal processing and, in particular, speech processing is well documented. A major weakness of conventional HMMs is their inflexibility in modeling state durations. This weakness can be avoided by adopting a more complicated class of HMMs known as nonstationary HMMs. In this paper, we analyze nonstationary HMMs whose state transition probabilities are functions of time that indirectly model state durations by a given probability mass function and whose observation spaces are discrete. The objective of our work is to estimate all the unknowns of a nonstationary HMM, which include its parameters and the state sequence. To that end, we construct a Markov chain Monte Carlo (MCMC) sampling scheme, where sampling from all the posterior probability distributions is very easy. The proposed MCMC sampling scheme has been tested in extensive computer simulations on finite discrete-valued observed data, and some of the simulation results are presented in the paper.
Index Terms: Gibbs sampling, hidden Markov models, Markov chain Monte Carlo, nonstationary.
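The duration weakness mentioned in this abstract can be illustrated with a small sketch. In a conventional HMM, a state with self-transition probability a is occupied for exactly d consecutive steps with probability a^(d-1) * (1 - a), a geometric distribution that always peaks at d = 1. The function names and the value a = 0.9 below are illustrative assumptions and are not taken from the paper.

```python
def implicit_duration_pmf(self_transition: float, max_duration: int) -> list[float]:
    """Duration pmf implied by a conventional HMM state.

    Returns P(d) for d = 1..max_duration, where
    P(d) = a**(d - 1) * (1 - a) and a is the self-transition probability.
    """
    if not 0.0 <= self_transition < 1.0:
        raise ValueError("self_transition must be in [0, 1)")
    a = self_transition
    return [a ** (d - 1) * (1.0 - a) for d in range(1, max_duration + 1)]


def expected_duration(self_transition: float) -> float:
    """Mean of the geometric duration distribution, 1 / (1 - a)."""
    return 1.0 / (1.0 - self_transition)


# Hypothetical state with self-transition probability 0.9.
pmf = implicit_duration_pmf(0.9, 50)
mean_duration = expected_duration(0.9)
```

Whatever value of a is chosen, the most likely duration stays at one step, which is one way to see why explicit or nonstationary duration models are attractive when typical durations are longer.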
Comparison and Classification of Documents Based on Layout Similarity
- Information Retrieval, 2000
Cited by 5 (0 self)
This paper describes features and methods for document image comparison and classification at the spatial layout level. The methods are useful for visual similarity based document retrieval as well as fast algorithms for initial document type classification without OCR. A novel feature set called interval encoding is introduced to capture elements of spatial layout. This feature set encodes region layout information in fixed-length vectors by capturing structural characteristics of the image. These fixed-length vectors are then compared to each other through a Manhattan distance computation for fast page layout comparison. The paper describes experiments and results to rank-order a set of document pages in terms of their layout similarity to a test document. We also demonstrate the usefulness of the features derived from interval coding in a hidden Markov model based page layout classification system that is trainable and extendible. The methods described in the paper can be ...
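The comparison step described in this abstract, ranking pages by the Manhattan distance between fixed-length layout vectors, can be sketched as follows. The interval encoding itself is not reproduced here: the vectors, page names, and function names are made-up placeholders used only to show the ranking, not the authors' implementation.

```python
def manhattan_distance(a: list[float], b: list[float]) -> float:
    """Sum of absolute coordinate differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_layout_similarity(query: list[float], pages: dict[str, list[float]]) -> list[str]:
    """Return page names ordered from most to least similar to the query vector."""
    return sorted(pages, key=lambda name: manhattan_distance(query, pages[name]))


# Hypothetical fixed-length layout vectors for three pages.
pages = {
    "letter": [0, 3, 5, 2],
    "form": [4, 4, 0, 1],
    "memo": [1, 2, 4, 2],
}
query = [1, 3, 5, 2]
ranking = rank_by_layout_similarity(query, pages)
```

Because the distance is a plain sum of absolute differences, each comparison costs time linear in the vector length, which is consistent with the abstract's emphasis on fast layout comparison.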

