Learning and Upgrading Rules for an OCR System Using Genetic Programming [11 citations, 2 self]
Abstract:
Rule-based systems used for Optical Character Recognition (OCR) are notoriously difficult to write, maintain, and upgrade. This paper describes a method for using Genetic Programming (GP) to evolve and upgrade rules for an OCR system. The language of the evolved programs was designed so that hand-coded rules written by humans can be included in the initial population in order to upgrade the system for a new font. The system was successful at learning rules for large character sets consisting of multiple fonts and sizes, with very good generalization to test sets. In addition, the method was found to be successful at updating hand-coded rules written in C for new fonts. This research demonstrates the successful application of GP to a difficult, noisy, real-world problem.

1. Introduction

Rule-based systems used in OCR are difficult and time-consuming to write, maintain, and upgrade. There is a rule set for each character that is supposedly true only for that character. Thus, any changes in a rule set m...

