Speech recognition techniques for a sign language recognition system
- In Interspeech 2007 - Eurospeech
, 2007
Cited by 12 (9 self)
One of the most significant differences between automatic sign language recognition (ASLR) and automatic speech recognition (ASR) lies in the computer vision problems that ASLR must solve, whereas the corresponding problems in speech signal processing have been solved through intensive research over the last 30 years. We present an approach that starts from a large-vocabulary speech recognition system in order to profit from the insights obtained in ASR research. The resulting system recognizes sentences of continuous sign language independently of the signer. The features are obtained from standard video cameras without any special data acquisition devices. In particular, we focus on feature and model combination techniques applied in ASR, and on the use of pronunciation and language models (LM) in sign language. These techniques can be used for all kinds of sign language recognition systems, and for many video analysis problems where the temporal context is important, e.g. action or gesture recognition. On a publicly available benchmark database consisting of 201 sentences and 3 signers, we achieve a word error rate (WER) of 17%.
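The abstract above reports its result as a word error rate. As a minimal sketch of how WER is conventionally computed (token-level Levenshtein distance divided by the reference length), and not the scoring tool used by the authors, the function name `word_error_rate` and the gloss tokens in the usage note are illustrative assumptions:

```python
def word_error_rate(reference, hypothesis):
    """Token-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]  # deleting all i reference tokens
        for j, hyp_tok in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (ref_tok != hyp_tok)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)
```

For example, `word_error_rate("JOHN BUY HOUSE".split(), "JOHN BUY CAR".split())` counts one substitution against three reference tokens. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.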
Whitespace models for offline Arabic handwriting recognition
- In International Conference on Pattern Recognition
, 2008
Cited by 9 (8 self)
We propose to explicitly model white-spaces for Arabic handwriting recognition within different writing variants. Position-dependent character shapes in Arabic handwriting allow for large white-spaces between characters even within words. Here, a separate character model for white-spaces is proposed, in combination with a lexicon using different writing variants and character model length adaptation. Current handwriting recognition systems either model the white-spaces implicitly within the character models, which can degrade those models, or try to explicitly segment the Arabic words into pieces, which is prone to segmentation errors. Several white-space modeling approaches are analyzed on the well-known IFN/ENIT database and outperform the best reported error rates.
Classification via minimum incremental coding length (MICL)
, 2007
Cited by 5 (3 self)
We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. Theoretical results provide new insights into relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classification criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as handwritten digits and human faces, without requiring domain-specific information.
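The decision rule described above (pick the class whose coding length grows least when the test sample is added) can be sketched as follows. The abstract does not state the coding-length function, so this sketch assumes a Gaussian lossy coding length of the form L(X) = (m+n)/2 · log2 det(I + (n/ε²)Σ) + (n/2) · log2(1 + μᵀμ/ε²), for m samples of dimension n, plus a label cost of −log2 of the class frequency. The names `coding_length`, `micl_classify` and `eps` are hypothetical.

```python
import math

def log2_det_spd(a):
    """log2 of the determinant of a symmetric positive definite matrix (Cholesky)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                total += 2.0 * math.log2(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return total

def coding_length(samples, eps):
    """Assumed Gaussian lossy coding length (in bits) up to distortion eps."""
    m, n = len(samples), len(samples[0])
    mu = [sum(x[d] for x in samples) / m for d in range(n)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in samples) / m
            for j in range(n)] for i in range(n)]
    scale = n / eps ** 2
    a = [[(1.0 if i == j else 0.0) + scale * cov[i][j] for j in range(n)]
         for i in range(n)]
    mean_bits = (n / 2.0) * math.log2(1.0 + sum(v * v for v in mu) / eps ** 2)
    return (m + n) / 2.0 * log2_det_spd(a) + mean_bits

def micl_classify(x, classes, eps):
    """Return the label whose training set needs the fewest extra bits to absorb x."""
    total = sum(len(s) for s in classes.values())
    best_label, best_bits = None, float("inf")
    for label, samples in classes.items():
        extra = coding_length(samples + [x], eps) - coding_length(samples, eps)
        extra -= math.log2(len(samples) / total)  # bits to code the label itself
        if extra < best_bits:
            best_label, best_bits = label, extra
    return best_label
```

Here `classes` maps each label to a list of equal-length feature vectors. The distortion `eps` acts as a regularizer: the identity term keeps the determinant well defined even when a class has fewer samples than dimensions.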
Technology
Cited by 1 (0 self)
With advances in computing technology, the field of Artificial Neural Networks, which replicates the logic of the brain, gained vast interest during the 1980s. With growing interest in the unexplored field of Neural Networks and the rapid growth of related technology, tremendous work was undertaken on the development of intelligent machines and robotics. Most earlier research focused on transferring skill from humans to machines. This paper presents the Neural Network as a learner in the first step and as a teacher in the second step. It demonstrates a methodology for transferring skills from an expert to a less skilled apprentice, using a Neural Network as the intermediate transfer medium. The network is trained with randomly selected input and output datasets generated during the expert’s performance of the task.
Speeding up IDM without degradation of retrieval quality
Cited by 1 (0 self)
The Image Distortion Model (IDM) has shown good retrieval quality in the medical automatic annotation task of previous ImageCLEF workshops. However, one of its limitations is its computational complexity and the resulting long retrieval times. We applied several optimizations, in particular an early termination strategy for the individual distance computations, optimized data structures for the intermediate results, and the proper use of multithreading on state-of-the-art hardware. With these extensions, we were able to reduce the run time of the IDM P2DHMDM to 1.5 seconds per query. Moreover, we extended the possible displacements to an area of 7x7 pixels, using a local context of either 5x5 or 7x7 pixels. We also introduced a classifier that exploits the hierarchical structure of the IRMA code. The results of the extended IDM P2DHMDM were submitted to this year’s medical automatic annotation task and achieved ranks 19 to 25 out of 68. More importantly, the techniques used are not limited to IDM but are also applicable to other expensive distance measures.
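The early termination strategy mentioned above can be illustrated with a nearest-neighbor search that abandons a distance computation as soon as the partial sum can no longer beat the best candidate found so far. This is a minimal sketch under stated assumptions: a plain squared Euclidean distance stands in for the IDM distance, and the function names are hypothetical rather than taken from the paper.

```python
def squared_distance_with_cutoff(a, b, cutoff):
    """Sum of squared differences; returns None once the partial sum reaches cutoff."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total >= cutoff:
            return None  # this candidate cannot beat the current best
    return total

def nearest_neighbor(query, references):
    """Return (index, distance) of the closest reference, pruning hopeless candidates."""
    best_index, best_distance = -1, float("inf")
    for index, ref in enumerate(references):
        d = squared_distance_with_cutoff(query, ref, best_distance)
        if d is not None:
            best_index, best_distance = index, d
    return best_index, best_distance
```

The pruning is exact for any distance that is a sum of non-negative terms, so the returned neighbor is the same as in an exhaustive search; only the amount of work changes. The savings grow when a good candidate is found early, since the cutoff then becomes tight for all later comparisons.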
Part-Based Recognition of Handwritten Characters
Cited by 1 (0 self)
In the part-based recognition method proposed in this paper, a handwritten character image is represented by just a set of local parts. Then, each local part of the input pattern is recognized by a nearest-neighbor
Simple Method for High-Performance Digit Recognition Based on Sparse Coding
Cited by 1 (0 self)
We propose a method of feature extraction for digit recognition that is inspired by vision research: a sparse-coding strategy and a local maximum operation. We show that our method, despite its simplicity, yields state-of-the-art classification results on a highly competitive digit-recognition benchmark. We first employ the unsupervised Sparsenet algorithm to learn a basis for representing patches of handwritten digit images. We then use this basis to extract local coefficients. In a second step, we apply a local maximum operation in order to implement local shift invariance. Finally, we train a Support-Vector-Machine on the resulting feature vectors and obtain state-of-the-art classification performance in the digit recognition task defined by the MNIST benchmark. We compare the classification performance obtained with sparse coding, Gabor wavelets, and principal component analysis. We conclude that learning a sparse representation of local image patches, combined with a local maximum operation for feature extraction, can significantly improve recognition performance.
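The local maximum operation described above can be sketched as a maximum over non-overlapping windows of a 2-D coefficient map. The abstract does not give the window size, the stride, or whether absolute values are taken, so the window layout below and the name `local_max_pool` are assumptions made for illustration.

```python
def local_max_pool(feature_map, window):
    """Maximum over non-overlapping window x window blocks of a 2-D list."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows, window):
        pooled_row = []
        for c in range(0, cols, window):
            # Blocks at the border are truncated rather than padded.
            block = [feature_map[i][j]
                     for i in range(r, min(r + window, rows))
                     for j in range(c, min(c + window, cols))]
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled
```

A response that moves by one pixel but stays within the same window produces an identical pooled output, which is the sense in which the operation gives local shift invariance. The pooled maps, one per basis function, would then be flattened into the feature vector passed to the classifier.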
Deformation-Aware Log-Linear Models
In this paper, we present a novel deformation-aware discriminative model for handwritten digit recognition. Unlike previous approaches, our model directly considers image deformations and allows discriminative training of all parameters, including those accounting for non-linear transformations of the image. This is achieved by extending a log-linear framework to incorporate a latent deformation variable. The resulting model has an order of magnitude fewer parameters than competing approaches to handling image deformations. We tune and evaluate our approach on the USPS task and show its generalization capabilities by applying the tuned model to the MNIST task. We gain interesting insights and achieve highly competitive results on both tasks.
Isolated Handwritten Farsi Numerals Recognition Using Sparse and Over-Complete Representations
- In 10th International Conference on Document Analysis and Recognition
, 2009
A new isolated handwritten Farsi numeral recognition algorithm is proposed in this paper, which exploits the sparse and over-complete structure of handwritten Farsi numeral data. In this research, the sparse structure is represented as an over-complete dictionary, which is learned by the K-SVD algorithm. The atoms of this dictionary are used to initialize the first layer of a Convolutional Neural Network (CNN), which is then trained to perform the classification task. Data distortion techniques are also applied to improve the generalization capability of the trained classifier. Experiments show that good results are achieved on the CENPARMI handwritten Farsi numeral database.

