Results 1 - 10 of 18
Data Clustering: A Review
- ACM COMPUTING SURVEYS, 1999
Cited by 912 (9 self)
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a combinatorially difficult problem, and differences in assumptions and contexts among communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with the goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Artificial Neural Networks: A Tutorial
- IEEE Computer, 1996
Cited by 47 (2 self)
Numerous efforts have been made in developing "intelligent" programs based on von Neumann's centralized architecture. However, these efforts have not been very successful in building general-purpose intelligent systems. Inspired by biological neural networks, researchers in a number of scientific disciplines are designing artificial neural networks (ANNs) to solve a variety of problems in decision making, optimization, prediction, and control. Artificial neural networks can be viewed as parallel and distributed processing systems which consist of a huge number of simple and massively connected processors. Interest in the field of ANNs has been resurging for several years. This article is intended as a tutorial for readers with little or no knowledge of ANNs, enabling them to understand the remaining articles of this special issue. We discuss the motivations behind developing ANNs, basic network models, and two main issues in designing ANNs: network archite...
An HMM-Based Legal Amount Field OCR System for Checks
- IEEE International Conference on Systems, Man and Cybernetics, Vancouver BC, 1995
Cited by 10 (5 self)
The system described in this paper applies Hidden Markov technology to the task of recognizing the handwritten legal amount on personal checks. We argue that the most significant source of error in handwriting recognition is the segmentation process. In traditional handwriting OCR systems, recognition is performed at the character level, using the output of an independent segmentation step. Using a series of fixed-step vertical slices from the image, the HMM system described in this paper avoids making segmentation decisions early in the recognition process.
0 Introduction
The current generation of Optical Character Recognition (OCR) systems can be characterized as a pipeline composed of Preprocessing, Segmentation, Classification, and Identification stages. None of these stages is immune to error. Preprocessing may fail to remove existing noise; it may also remove portions of the image or add noise by some other mechanism. Segmentation may fail to establish a boundary where there sh...
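The fixed-step vertical slicing mentioned in this abstract can be illustrated with a minimal sketch. The function names `vertical_slices` and `ink_count`, the slice width, the step size, and the ink-count feature are all assumptions chosen for illustration; the abstract does not say which features the authors feed to the HMM.

```python
from typing import List

Image = List[List[int]]  # rows of 0/1 pixels, 1 = ink


def vertical_slices(image: Image, width: int, step: int) -> List[Image]:
    """Cut the image into slices `width` columns wide, starting every `step` columns."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    n_cols = len(image[0]) if image else 0
    slices = []
    # Only full-width slices are produced; a trailing partial slice is dropped.
    for start in range(0, n_cols - width + 1, step):
        slices.append([row[start:start + width] for row in image])
    return slices


def ink_count(slice_: Image) -> int:
    """Hypothetical per-slice feature: the number of ink pixels in the slice."""
    return sum(sum(row) for row in slice_)


image = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1],
]
# One observation per slice, in left-to-right order, with no character boundaries decided.
observations = [ink_count(s) for s in vertical_slices(image, width=2, step=2)]
```

The resulting left-to-right observation sequence is the kind of input an HMM could consume without a prior character segmentation step; a step smaller than the width would give overlapping slices.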
An Experimental HMM-Based Postal OCR System
- 1997
Cited by 6 (1 self)
It is almost universally accepted in speech recognition that phone- or word-level segmentation prior to recognition is neither feasible nor desirable, and in the dynamic (pen-based) handwriting recognition domain the success of segmentation-free techniques points to the same conclusion. But in image-based handwriting recognition, this conclusion is far from being firmly established, and the results presented in this paper show that systems employing character-level presegmentation can be more effective, even within the same HMM paradigm, than systems relying on sliding window feature extraction. We describe two variants of a Hidden Markov system recognizing handwritten addresses on US mail, one with presegmentation and one without, and report results on the CEDAR data set.
1. INTRODUCTION
Any approach to speech and handwriting recognition must take into account that the signal is composed of a succession of alphabetic units (phonemes or graphemes). In the early work on speech recog...
An Efficient Feature Extraction and Dimensionality Reduction Scheme for Isolated Greek Handwritten Character Recognition
Cited by 5 (4 self)
In this paper, we present an off-line methodology for isolated Greek handwritten character recognition based on efficient feature extraction followed by a suitable feature vector dimensionality reduction scheme. Extracted features are based on (i) horizontal and vertical zones, (ii) the projections of the character profiles, (iii) distances from the character boundaries and (iv) profiles from the character edges. The combination of these types of features leads to a 325-dimensional feature vector. In a subsequent step, a dimensionality reduction technique is applied, which reduces the feature space to only the features pertinent to discriminating characters within the given set of letters. In this paper, we also present a new Greek handwritten database of 36,960 characters that we created in order to measure the performance of the proposed methodology.
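Two of the feature families listed in this abstract, zones and projections, can be sketched as follows. This is an illustrative approximation only: the grid size, the function names `zone_densities` and `projections`, and the use of plain row and column ink counts in place of the paper's profile-based projections are assumptions, and the sketch does not reproduce the 325-dimensional vector.

```python
from typing import List

Image = List[List[int]]  # non-empty rows of 0/1 pixels, 1 = ink


def zone_densities(image: Image, zone_rows: int, zone_cols: int) -> List[float]:
    """Fraction of ink pixels in each cell of a zone_rows x zone_cols grid, row-major."""
    height, width = len(image), len(image[0])
    features = []
    for zr in range(zone_rows):
        top, bottom = zr * height // zone_rows, (zr + 1) * height // zone_rows
        for zc in range(zone_cols):
            left, right = zc * width // zone_cols, (zc + 1) * width // zone_cols
            area = (bottom - top) * (right - left)
            ink = sum(image[r][c] for r in range(top, bottom) for c in range(left, right))
            features.append(ink / area if area else 0.0)
    return features


def projections(image: Image) -> List[int]:
    """Ink count per row followed by ink count per column."""
    rows = [sum(row) for row in image]
    cols = [sum(col) for col in zip(*image)]
    return rows + cols


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
zones = zone_densities(image, zone_rows=2, zone_cols=2)
feature_vector = zones + projections(image)
```

Concatenating several such feature families is what produces a long vector, which in turn motivates the dimensionality reduction step the abstract describes.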
A Comparison of Hidden Markov Model Features for the Recognition of Cursive Handwriting
- 1996
Cited by 4 (1 self)
Due to the difficulty of character segmentation in cursive handwriting recognition, much recent research has turned to segmentation-free approaches to word recognition. While techniques of feature extraction for presegmented characters have been thoroughly explored in the literature, evaluations of features for use with segmentation-during-recognition techniques remain sparse. The main purpose of this thesis is to provide a comparison of a number of feature extraction techniques applied to the domain of legal amount recognition in bank checks. An experimental system using Hidden Markov Models and a horizontally sliding window is described. Results are presented for the recognition of the entire legal field using a variety of features. Of the experiments presented here, the best results were obtained by concatenating the feature vectors from the present, previous, and next window...
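The context concatenation described at the end of this abstract can be sketched in a few lines. The function name `add_context`, the ordering (previous, present, next), and the handling of the first and last windows are assumptions; the abstract does not specify them.

```python
from typing import List, Sequence


def add_context(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate each window's feature vector with those of its neighbours.

    Edge windows reuse themselves as the missing neighbour (an assumption;
    zero padding would be another reasonable choice).
    """
    out = []
    last = len(frames) - 1
    for i, frame in enumerate(frames):
        prev_frame = frames[max(i - 1, 0)]
        next_frame = frames[min(i + 1, last)]
        out.append(list(prev_frame) + list(frame) + list(next_frame))
    return out


# Three sliding-window positions with two hypothetical features each.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
stacked = add_context(frames)
```

Each output vector is three times as long as the input vector, so the HMM observes local context at the cost of a larger observation dimension.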
Handwritten Digit Recognition with a Novel Vision Model that Extracts Linearly Separable Features
- 2000
Cited by 4 (0 self)
We use well-established results in biological vision to construct a novel vision model for handwritten digit recognition. We show empirically that the features extracted by our model are linearly separable over a large training set (MNIST). Using only a linear classifier on these features, our model is relatively simple yet outperforms other models on the same data set.
Practicing Vision: Integration, Evaluation and Applications
- Pattern Recognition, 1997
Cited by 3 (1 self)
Computer vision has emerged as a challenging and important area of research, both as an engineering and a scientific discipline. The growing importance of computer vision is evident from the fact that it was identified as one of the "Grand Challenges" and also from its prominent role in the National Information Infrastructure. While the design of a general-purpose vision system continues to be elusive, machine vision systems are being used successfully in specific application domains. Building a practical vision system requires a careful selection of appropriate sensors, extraction and integration of information from available cues in the sensed data, and evaluation of system robustness and performance. We discuss and demonstrate the advantages of (i) multi-sensor fusion, (ii) combination of features and classifiers, (iii) integration of visual modules, and (iv) goal-directed evaluation of vision algorithms. The requirements of several prominent real-world applications such as biometry, do...
Hierarchical Classification of Handwritten Characters based on Novel Structural Features
Cited by 1 (1 self)
In this paper, we present a methodology for off-line handwritten character/digit recognition. The proposed methodology relies on a new feature extraction technique based on recursive subdivisions of the image as well as on calculation of the centre of mass of each sub-image. Feature extraction is followed by a hierarchical classification scheme based on the level of granularity of the feature extraction method. Pairs of classes with high values in the confusion matrix are merged at a certain level, and features of higher granularity are employed to distinguish them. A handwritten character database as well as a handwritten digit database is used in order to demonstrate the efficiency of the proposed technique.
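One way to read the recursive subdivision idea is sketched below: compute the centre of mass of the ink in a region, record it, split the region into four sub-images at that point, and repeat to a chosen depth. The function names, the integer rounding, the fallback for empty regions, and the four-way split are assumptions; the abstract does not give the exact subdivision rule.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of 0/1 pixels, 1 = ink


def centre_of_mass(image: Image, top: int, bottom: int, left: int, right: int) -> Tuple[int, int]:
    """Integer centre of mass of ink in rows [top, bottom) and columns [left, right).

    Falls back to the geometric centre when the region contains no ink.
    """
    total = row_sum = col_sum = 0
    for r in range(top, bottom):
        for c in range(left, right):
            if image[r][c]:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return (top + bottom) // 2, (left + right) // 2
    return row_sum // total, col_sum // total


def subdivision_features(image: Image, top: int, bottom: int, left: int, right: int,
                         depth: int) -> List[Tuple[int, int]]:
    """Record the region's centre of mass, then recurse into the four sub-images it defines."""
    if depth == 0:
        return []
    r0, c0 = centre_of_mass(image, top, bottom, left, right)
    features = [(r0, c0)]
    for t, b in ((top, r0), (r0, bottom)):
        for l, r in ((left, c0), (c0, right)):
            features += subdivision_features(image, t, b, l, r, depth - 1)
    return features


image = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
features = subdivision_features(image, 0, 4, 0, 4, depth=2)
```

In this sketch `depth` plays the role of the granularity level: a shallow depth gives a short, coarse descriptor, and a deeper recursion gives the finer features that could be reserved for separating easily confused classes.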
An Adaptive Character Recognizer for Telugu Scripts Using Multiresolution Analysis and Associative memory
- 2002
Cited by 1 (0 self)
viable and a robust character recognizer for Telugu texts. We aim at designing a recognizer which exploits the inherent characteristics of the Telugu script. Our proposed method uses wavelet multiresolution analysis for the purpose of extracting features and an associative memory model to accomplish the recognition tasks. Our system learns the style and font from the document itself and then recognizes the remaining characters in the document. The major contributions of the present study can be outlined as follows. It is a robust OCR system for printed Telugu text. It avoids the feature extraction process and exploits the inherent characteristics of the Telugu characters through a clever selection of the wavelet basis function, which extracts the invariant features of the characters. It has a Hopfield-based Dynamic Neural Network (DNN) for the purpose of learning and recognition. This is important because it overcomes the inherent difficulties of memory limitation and spurious states in the Hopfield network. The DNN has been demonstrated to be efficient for associative memory recall. Although it is normally not suitable for image processing applications, the multiresolution analysis reduces the sizes of the images to make the DNN applicable to the present domain. Our experiments show extremely promising results.
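For readers unfamiliar with associative memory recall, the following is a minimal sketch of a classical Hopfield network with Hebbian training. It is not the Dynamic Neural Network described in the abstract, whose details are not given here; the function names, the tiny bipolar patterns, and the update schedule are assumptions for illustration.

```python
from typing import List


def train_hopfield(patterns: List[List[int]]) -> List[List[int]]:
    """Hebbian weights for bipolar (+1/-1) patterns, with a zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights: List[List[int]], state: List[int], max_sweeps: int = 10) -> List[int]:
    """Update units one at a time until no unit changes or max_sweeps is reached."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(weights[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two orthogonal stored patterns, standing in for (much larger) character images.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([p1, p2])

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # p1 with its first unit flipped
restored = recall(weights, noisy)
```

The weight matrix grows quadratically with the number of pixels, which is consistent with the abstract's point that reducing image size through multiresolution analysis makes such a network practical.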

