## Hidden Markov Models in Text Recognition (1995)

Venue: International Journal of Pattern Recognition and Artificial Intelligence

Citations: 6 (0 self)

### BibTeX

@ARTICLE{Anigbogu95hiddenmarkov,
  author  = {J.C. Anigbogu and A. Belaïd},
  title   = {Hidden Markov Models in Text Recognition},
  journal = {International Journal of Pattern Recognition and Artificial Intelligence},
  year    = {1995},
  volume  = {9},
  pages   = {95--8}
}

### Abstract

A multi-level multifont character recognition system is presented. The system proceeds by first delimiting the context of the characters. As a way of enhancing system performance, typographical information is extracted and used for font identification before actual character recognition is performed. This has the advantage of more reliable character identification as well as text reproduction in its original form. The font identification is based on decision trees in which the characters are automatically arranged into confusion classes according to the physical characteristics of fonts. The character recognizers are built around first- and second-order hidden Markov models (HMM) as well as Euclidean distance measures. The HMMs use the Viterbi and Extended Viterbi algorithms, to which enhancements were made. Also present is a majority-vote system that polls the other systems for "advice" before deciding on the identity of a character. Among other things, this last system is shown to give bett...

### Citations

942 | An introduction to hidden Markov models - Rabiner, Juang - 1986 |

401 | A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition - Rabiner - 1989 |

147 | The forward-backward algorithm - Jr - 1996 |

Context: "... order HMM. Particularly, we shall show the relationships between recognition results obtained, the model parameters such as the number of states, and the training process. The Viterbi algorithm (VA) (4) was used for the recognition process, with a decision-tree pre-classifier restricting the number of candidate characters. We improved on the results by making decisions based on weighted output proba..."
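
The Viterbi algorithm named in this excerpt can be sketched in a few lines. This is a generic first-order implementation, not the paper's enhanced variant; the two-state, three-symbol model below is invented purely for illustration.

```python
def viterbi(obs, pi, A, B):
    """Return the most probable state path for an observation sequence,
    given initial probabilities pi, transitions A, and emissions B."""
    n_states = len(pi)
    # delta[s] = probability of the best path ending in state s so far
    delta = [pi[s] * B[s][obs[0]] for s in range(n_states)]
    psi = []  # back-pointers, one list per time step
    for o in obs[1:]:
        step, new_delta = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: delta[i] * A[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        delta, psi = new_delta, psi + [step]
    # Backtrack from the best final state
    path = [max(range(n_states), key=lambda s: delta[s])]
    for step in reversed(psi):
        path.append(step[path[-1]])
    return list(reversed(path))

pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
print(viterbi([0, 1, 2], pi, A, B))  # [0, 0, 1]
```

A real recognizer would work in log-probabilities to avoid underflow on long sequences; this sketch omits that for brevity.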

92 | Computer recognition of unconstrained handwritten numerals - Suen, Nadal, et al. - 1992 |

Context: "... Multiple Recognizers. From observations, we remarked that the errors committed by the different recognizers were not always on the same character. This has already been noted in Ho (40) and Suen et al (41), where the former applied multiple classifiers to degraded printed text and the latter to handwritten numerals, both with success. If it does improve the recognition rate of degraded text, it should ..."
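
A minimal sketch of the majority-vote idea this excerpt describes: several recognizers are polled for one character and the most frequent answer wins. The tie-breaking rule below (fall back to the first recognizer's answer) is an assumption for illustration, not the paper's rule.

```python
from collections import Counter

def majority_vote(candidates):
    """candidates: one proposed character per recognizer.
    Returns the majority answer; on a tie, trusts the first recognizer."""
    counts = Counter(candidates)
    winner, n = counts.most_common(1)[0]
    tied = [c for c, k in counts.items() if k == n]
    return winner if len(tied) == 1 else candidates[0]

print(majority_vote(["e", "c", "e"]))  # e
```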

85 | On the recognition of printed characters of any font and size - Kahan, Pavlidis, et al. - 1987 |

Context: "... ed for all the fonts for which prototypes exist. The correct font is the one that generates a global minimum. We thus have

$$\mathrm{CorrectFont} = \arg\min_{1 \le j \le F} \left[ \sum_{i=1}^{N} \min_{\alpha \in \{0\ldots 9,\; a\ldots z,\; A\ldots Z\}} \lVert V_i - P_{\alpha j} \rVert \right] \tag{3}$$

where N is the number of characters in the paragraph, F is the number of fonts for which prototypes exist, V_i is the feature vector for the i-th character in the paragraph, and P_{αj} i..."
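
Eqn. 3 above can be read as a nearest-prototype search: for each candidate font, sum each character's distance to its closest prototype in that font, and keep the font with the smallest total. A sketch under that reading, with invented 2-D toy feature vectors (the paper's real feature space is higher-dimensional):

```python
import math

def identify_font(paragraph_vectors, prototypes):
    """paragraph_vectors: one feature vector per character in the paragraph.
    prototypes: {font_name: {char: prototype_vector}}.
    Returns the font minimizing the summed nearest-prototype distance."""
    best_font, best_cost = None, float("inf")
    for font, protos in prototypes.items():
        cost = sum(min(math.dist(v, p) for p in protos.values())
                   for v in paragraph_vectors)
        if cost < best_cost:
            best_font, best_cost = font, cost
    return best_font

fonts = {"times": {"a": (0.0, 0.0), "b": (1.0, 0.0)},
         "helv":  {"a": (5.0, 5.0), "b": (6.0, 5.0)}}
print(identify_font([(0.1, 0.0), (0.9, 0.1)], fonts))  # times
```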

84 | Letter recognition using holland-style adaptive classifiers - Frey, Slate - 1991 |

76 | The skew angle of printed documents - Baird - 1987 |

Context: "..., the pre-treatment, the character and word recognition levels. As a side note, we shall not be presenting the aspects of the system dealing with skew correction, having used a technique due to Baird (31), and the segmentation into blocks (Recursive X-Y Cuts due to Nagy et al (32)) and connected components (due to Capson (33)). At the first level, the dominant font is determined on a paragraph-by-para..."

65 | A 100-Font Classifier - Baird, Fossey - 1991 |

Context: "... tion. The shape discrimination criterion is quite simple. Let A_1 and A_2 be two such shape matrices of size m × n. Then

$$\text{Similarity} = 1 - \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} \mathrm{XOR}\bigl(A_1(i,j),\, A_2(i,j)\bigr)}{(m-2)\,n} \tag{1}$$

This is adjusted so that when two shapes are identical we obtain Similarity = 1. A circle and a line are two completely dissimilar shapes. A line corresponds to a shape matrix with two rows of 1's an..."
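
Eqn. 1 reduces to counting the differing cells of two binary shape matrices and normalizing by (m − 2)n. A direct sketch:

```python
def shape_similarity(A1, A2):
    """Similarity of two m x n binary shape matrices, per eqn. 1:
    1 - (number of differing cells) / ((m - 2) * n)."""
    m, n = len(A1), len(A1[0])
    diff = sum(A1[i][j] != A2[i][j] for i in range(m) for j in range(n))
    return 1 - diff / ((m - 2) * n)

shape = [[1, 1, 1],
         [0, 1, 0],
         [0, 1, 0],
         [1, 1, 1]]
print(shape_similarity(shape, shape))  # 1.0
```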

62 | 2-D Shape Classification Using Hidden Markov Model - He, Kundu - 1991 |

Context: "... st important aspect of HMM is that it can be applied to practically any type of signal, as shown by the diversity of applications cited. It has even been applied to the discrimination of planar shapes (25). The only thing common to these applications is the existence of an ordered list of features. One might then think that, to use this type of recognition method, we only need a vocabulary of stable f..."

54 | Stochastic modeling for automatic speech understanding, in Speech Recognition - Baker - 1975 |

Context: "... esented by V_k is observed while the model is in state j. Modifying eqn. 7 to take into account the inter-dependence between the number of states and letter distribution yields

$$b_t(k) = \frac{\gamma_t(k)}{\eta_T} \tag{8}$$

where b_t(k) is the probability of the letter or symbol V_k appearing in state t, γ_t(k) is the total number of times the symbol V_k is observed in state or letter-position t, and η_T is the total ..."

40 | High performance connected digit recognition using hidden Markov models - Rabiner, Wilpon, et al. - 1989 |

Context: "... del that suits the character in one font might not favor it in another. The findings here agree with what obtains in speech analysis, where the optimal number of states varies from speaker to speaker (15; 16). One can safely conclude then that the optimal number of states is dependent on the font. Table 4 shows the performances if we accept the fact that the correct character is either of the first two s..."

36 | On the application of vector quantization and hidden Markov models to speaker-independent, isolated word recognition - Rabiner, Levinson, et al. - 1983 |

Context: "... reformulate only the calculation of A. Character-position-dependent probabilities that constitute B are unchanged. Thus, for triple-letter transitions we have

$$a^T_{ijk} = \frac{\nu(i,j,k)}{\sum_{\forall V_k} \nu(i,j,k)} \tag{10}$$

where ν(i, j, k) is the number of transitions from letter V_i at t−2 to letter V_j at t−1 and to state k at t, and Σ_{∀V_k} ν(i, j, k) is the total number of transitions from letter V_i ..."
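
Eqn. 10 is a plain relative-frequency estimate over letter trigrams. A sketch that builds it from a word list (the word list here is illustrative, not the paper's corpus):

```python
from collections import defaultdict

def trigram_transitions(words):
    """Estimate second-order letter-transition probabilities:
    a[(i, j, k)] = count(i at t-2, j at t-1, k at t) / sum over k of count(i, j, k)."""
    nu = defaultdict(int)
    for w in words:
        for i, j, k in zip(w, w[1:], w[2:]):
            nu[(i, j, k)] += 1
    # Denominator of eqn. 10: total transitions out of each digram (i, j)
    totals = defaultdict(int)
    for (i, j, _), c in nu.items():
        totals[(i, j)] += c
    return {ijk: c / totals[ijk[:2]] for ijk, c in nu.items()}

a = trigram_transitions(["the", "then", "them"])
print(a[("t", "h", "e")])  # 1.0: 'e' always follows the digram 'th' here
```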

33 | A theory of multiple classifier systems and its application to visual word recognition - Ho - 1992 |

Context: "... Majority-Vote from Multiple Recognizers. From observations, we remarked that the errors committed by the different recognizers were not always on the same character. This has already been noted in Ho (40) and Suen et al (41), where the former applied multiple classifiers to degraded printed text and the latter to handwritten numerals, both with success. If it does improve the recognition rate of degra..."

29 | Digital typography: An introduction to type and composition for computer system design - Rubinstein - 1988 |

Context: "... inters to names for the eventual reconstruction of the original text. Figure 5 should be placed here. 2.4.1 Font-tree Construction. Given that font information is inherent in the constituent characters (7), constructing prototypes for characters is thus a reasonable approach. During an a priori learning phase, and with some of the features earlier enunciated, prototypes are constructed for the charact..."

29 | Document analysis with an expert system - Nagy, Seth, et al. - 1985 |

Context: "... note, we shall not be presenting the aspects of the system dealing with skew correction, having used a technique due to Baird (31), and the segmentation into blocks (Recursive X-Y Cuts due to Nagy et al (32)) and connected components (due to Capson (33)). At the first level, the dominant font is determined on a paragraph-by-paragraph basis, the result of which is passed on to the second level where the..."

28 | Recognition of handwritten word: first and second order HMM based approach - Kundu, He, et al. - 1989 |

Context: "... has known an increasingly wide-spread usage in the past two decades with a lot of applications in speech processing (8–16). A number of applications have also appeared in word-level recognition (17; 21) and character recognition (22; 23). The most important aspect of HMM is that it can be applied to practically any type of signal, as shown by the diversity of applications cited. It has even been appl..."

27 | The viterbi algorithm as an aid in text recognition - Neuhoff - 1975 |

26 | Description and discrimination of planar shapes using shape matrices - Goshtasby - 1985 |

Context: "... made are not heuristically limited. As can be seen later, if this is handled properly, an enormous gain is assured during recognition, as features are extracted only for this reduced subset. Goshtasby (5) suggests coding the forms using shape matrices and then using binary operators to compare them. Since shape matrices are binary in nature, their comparison would involve only an eXclusive-OR (XOR) operati..."

26 | Word-level recognition of cursive script - Farag - 1979 |

20 | Optical Character Recognition - A Survey - Impedovo, Ottaviano, et al. - 1991 |

18 | Mathematical Foundations of Hidden Markov Models - Rabiner - 1988 |

Context: "... d-order properties. The left-to-right parallel methods were applied at the word recognition level, again using both orders, with the further assumption that the models were non-stationary. As Rabiner (12) points out, there are a lot of factors that influence the quality of results obtained. We shall show the influence of the number of states on the result. It is also a well-known fact that the trainin..."

18 | An integrated algorithm for text recognition: comparison with a cascaded algorithm - Hull, Srihari, et al. - 1983 |

Context: "... hm. As fig. 1 shows, the user can update a working dictionary as the recognition proceeds. 4.2 First Order HMM. 4.2.1 Model Construction. First-order HMM has limited correcting capability, as Hull et al (26) point out. Kundu (18) has proved that for the English language, the optimal model is at least of order two. However, it is a useful method if the parameters are well chosen and if it is also complem..."

16 | Feature identification for hybrid structural/statistical pattern classification - Baird - 1988 |

11 | Recognition of Handwritten Word - Kundu, Bahl - 1989 |

9 | Contextual word recognition using probabilistic relaxation labeling - Goshtasby, Ehrich - 1988 |

Context: "... eir difference by XOR is thus (m−2)n, from where, by eqn. 1, a similarity of 0 is obtained. Other shapes will produce similarities between 0.0 and 1.0. Although this has been applied to character recognition (6), the principal fault is that at low point sizes the shape matrices of characters like a, e and o do not differ much and produce high similarity scores between them. We opted for a similarity score ..."

9 | Application of hidden Markov models to multifont text recognition - Anigbogu, Belaïd, et al. - 1991 |

Context: "... has known an increasingly wide-spread usage in the past two decades with a lot of applications in speech processing (8–16). A number of applications have also appeared in word-level recognition (17; 21) and character recognition (22; 23). The most important aspect of HMM is that it can be applied to practically any type of signal, as shown by the diversity of applications cited. It has even been appl..."

9 | On partitioning a dictionary for visual text recognition - Sinha - 1990 |

Context: "... the output is never guaranteed to be a valid word, or this is sent out as output. 4.1 Dictionary Verification. The dictionary is partitioned into classes using a technique similar to the envelope method (30). The words are first classified by length and then regrouped according to the positions of the constituent characters with respect to the x-height and the baseline. Characters that have ascenders ar..."
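
The partitioning described here, first by word length and then by the ascender/descender profile of each letter position, can be sketched as a key function. The ascender and descender letter sets below are a simplification for illustration:

```python
def envelope_key(word, ascenders=frozenset("bdfhklt"), descenders=frozenset("gjpqy")):
    """Partition key in the spirit of the envelope method: word length plus,
    per position, whether the letter rises above the x-height (A), drops
    below the baseline (D), or stays within the x-height (x)."""
    profile = "".join(
        "A" if c in ascenders else "D" if c in descenders else "x"
        for c in word.lower())
    return (len(word), profile)

print(envelope_key("dog"))  # (3, 'AxD')
```

Words sharing a key land in the same dictionary class, so a noisy OCR word only needs to be compared against candidates with a matching envelope.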

8 | A high accuracy algorithm for the recognition of hand written numerals - Baptista, Kulkarni - 1998 |

7 | Recognition of Multifont Text Using Markov Models - Anigbogu, Belaïd, et al. - 1991 |

Context: "... ead usage in the past two decades with a lot of applications in speech processing (8–16). A number of applications have also appeared in word-level recognition (17; 21) and character recognition (22; 23). The most important aspect of HMM is that it can be applied to practically any type of signal, as shown by the diversity of applications cited. It has even been applied to the discrimination of plana..."

5 | On optimal order in modeling sequence of letters in words of common language as a Markov chain - Kundu, He - 1991 |

Context: "... te that there is no good theoretical way to choose the number of states, as they are not physically related to any observable symbol, and also with respect to the font. 3.3 Second Order HMM. Kundu et al (17; 18) and Liu (16) contend, and rightly so (see section 4.3.1), that second-order models use more information than first-order models, and can thus outperform the latter. This assertion might be true at th..."

5 | Extended Viterbi algorithm for second order hidden Markov process - He - 1988 |

Context: "...

$$\sum_{t=1}^{T-2}\sum_{k=1}^{N} \eta_{t+1}(i,j,k), \qquad \text{where } \eta_{t+1}(i,j,k) = \frac{\alpha_{t+1}(i,j)\, a_{ijk}\, b_k(O_{t+2})\, \beta_{t+2}(j,k)}{P(O \mid \lambda)}$$

3.3.2 Recognition. For recognition, we applied the extended Viterbi algorithm as found in He (24). As with first-order models, we skipped the tracking of the sequence of states. The results under the two recognition modes are shown in tables 7 through 9. On comparing tables 3 and 7, we can see t..."

5 | The sensitivity of the modified Viterbi algorithm to the source statistics - Shinghal, Toussaint - 1980 |

Context: "... atrix B was replaced with character-position-dependent probabilities. Similar techniques have been applied to English text by Sinha and Prasada (29), and by Kundu et al (17). Shinghal and Toussaint (28) have also studied this phenomenon with regard to the source statistics. There were 22 models for the English language (lengths 2–22 and 28; no words of lengths 23–27 were available) and 23 models for French (2..."

5 | An Improved Algorithm for the Sequential Extraction of Boundaries from a Raster Scan - Capson - 1984 |

Context: "... the system dealing with skew correction, having used a technique due to Baird (31), and the segmentation into blocks (Recursive X-Y Cuts due to Nagy et al (32)) and connected components (due to Capson (33)). At the first level, the dominant font is determined on a paragraph-by-paragraph basis, the result of which is passed on to the second level where the system then has the choice of using either of ..."

4 | Multifont Character Recognition For Typeset Documents - Shlien - 1988 |

Context: "... e α and β are stretch factors which generally ensure differences of ±5 pixels. The correlation coefficient or similarity, S, is then given by

$$S = 1 - \frac{\mathrm{XOR}(I_A, I_B)}{\max(I_A, I_B)} \tag{2}$$

It can be seen that if the two images I_A and I_B are identical, their XOR is 0 and we obtain S = 1. On the other hand, if the images are completely dissimilar, their XOR would yield more black pixels ..."

3 | Visual Text Recognition Through - Sinha, Prasada - 1988 |

Context: "... ram frequencies of letter pairs, while the traditional matrix B was replaced with character-position-dependent probabilities. Similar techniques have been applied to English text by Sinha and Prasada (29), and by Kundu et al (17). Shinghal and Toussaint (28) have also studied this phenomenon with regard to the source statistics. There were 22 models for the English language (lengths 2–22 and 28; no words of l..."

2 | Speaker independent connected digit recognition using hidden Markov models - Mari, Roucos - 1985 |

Context: "... rved in state t summed over all samples of size T. That is, $\eta_T = \sum_{\forall V_i} \gamma_t(i)$. In the same manner, the calculation of A = a_{ij} is constrained by N = T. That is,

$$a^T_{ij} = \frac{\nu(i,j)}{\sum_{\forall V_k} \nu(i,k)} \tag{9}$$

where ν(i, j) is the number of transitions from letter V_i to letter V_j and Σ_{∀V_k} ν(i, k) is the total number of transitions from letter V_i. We assume that letter-pair frequencies or digrams de..."

2 | A Speech Recognition Method Based on Feature Distributions - Liu, Chiou, et al. - 1991 |

Context: "... del that suits the character in one font might not favor it in another. The findings here agree with what obtains in speech analysis, where the optimal number of states varies from speaker to speaker (15; 16). One can safely conclude then that the optimal number of states is dependent on the font. Table 4 shows the performances if we accept the fact that the correct character is either of the first two s..."

2 | Pattern Matching by Gestalt - Ratcliff - 1988 |

Context: "... ve nature of the possibilities from a standard dictionary, the OCR output is used to constrain the result. The matching between the input and dictionary words is achieved using the Ratcliff-Obershelp (27) pattern matching algorithm. As fig. 1 shows, the user can update a working dictionary as the recognition proceeds. 4.2 First Order HMM. 4.2.1 Model Construction. First-order HMM has limited correcting ..."
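
Python's difflib documents its SequenceMatcher as being based on an algorithm resembling the Ratcliff-Obershelp gestalt matching mentioned in this excerpt, so a dictionary-matching step in its spirit can be sketched as:

```python
import difflib

def best_dictionary_match(ocr_word, dictionary):
    """Score each dictionary candidate against the raw OCR word with
    Ratcliff/Obershelp-style gestalt matching and return the best match."""
    scored = [(difflib.SequenceMatcher(None, ocr_word, w).ratio(), w)
              for w in dictionary]
    return max(scored)[1]

print(best_dictionary_match("recogn1tion",
                            ["recognition", "recreation", "region"]))
```

This is only a sketch of the verification step; the paper additionally restricts candidates via the envelope-based dictionary partitioning described above.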

2 | Applications of multi-dimensional search to structural feature identification - Baird - 1986 |

1 | Speech Recognition Based on Second Order HMM - Kriouile, Mari, et al. - 1991 |

Context: "... ed the sequences of features as second-order Markovian functions. 3.3.1 Model Training. For training, we used the extended version of the Baum-Welch forward-backward algorithm as presented by Kriouile (14). These extensions for the forward variable α and backward variable β are given below. 1. Initialization: $\alpha_2(i,j) = \pi_i\, b_i(O_1)\, a_{ij}\, b_j(O_2)$, for $1 \le i, j \le N$. 2. Induction: for $2 \le t \le T-1$, $\alpha_t$ ..."

1 | Printed Character Recognition Using Markov Models, Onzième Colloque GRETSI - Kordi, Xydeas, et al. - 1987 |

Context: "... ead usage in the past two decades with a lot of applications in speech processing (8–16). A number of applications have also appeared in word-level recognition (17; 21) and character recognition (22; 23). The most important aspect of HMM is that it can be applied to practically any type of signal, as shown by the diversity of applications cited. It has even been applied to the discrimination of plana..."

1 | Pattern Recognition Principles - Tou, Gonzalez - 1974 |

Context: "... ifferent leaves due to bitmap image defects. In order to reduce the set of samples (up to 240) of a given character in a leaf node to a single representative (prototype), we use the K-means algorithm (39). The representative of the samples of each character being a sequence of features, the distance used to make these groupings into classes is the Euclidean distance in ℝ¹³ space. For each leaf, we i..."
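
The prototype-reduction step described here is ordinary K-means in Euclidean space. A self-contained sketch, with 2-D toy points standing in for the paper's 13-dimensional feature vectors:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means: assign each point to its nearest center, then move
    each center to the mean of its cluster, repeated for a fixed number
    of iterations. Points are tuples of equal dimension."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Empty clusters keep their previous center
        centers = [
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else centers[c]
            for c, cl in enumerate(clusters)]
    return centers

samples = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5),
           (10.0, 10.0), (10.5, 10.0), (10.0, 10.5)]
print(sorted(kmeans(samples, 2)))
```

Each resulting center plays the role of a character prototype for its leaf node; a production version would iterate to convergence rather than a fixed iteration count.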
