Statistics and Graphotactical Rules in Finding OCR-errors
2000
Abstract
This thesis describes two experiments in finding errors in optically scanned Swedish text without relying on a lexicon. First, statistics were used to find unexpectedly frequent trigrams, and correction rules were created for these cases. The rules were then tested, and their output was compared to a hand-corrected version of the test text. Second, Bengt Sigurd's model of Swedish phonotactics was used to detect words with a phonotactically illegal beginning or end.
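The abstract does not say how "unexpectedly frequent" was measured, so the following is only a minimal sketch of one plausible reading: compare the relative frequency of each character trigram in the OCR output with its relative frequency in a clean reference text, and flag trigrams that are strongly over-represented. The function names, the add-one smoothing, and the `min_ratio` and `min_count` thresholds are all assumptions made for illustration, not details taken from the thesis.

```python
from collections import Counter


def trigram_counts(text):
    """Count character trigrams inside each whitespace-separated word."""
    counts = Counter()
    for word in text.lower().split():
        for i in range(len(word) - 2):
            counts[word[i:i + 3]] += 1
    return counts


def suspicious_trigrams(ocr_text, reference_text, min_ratio=5.0, min_count=2):
    """Flag trigrams that are over-represented in the OCR text.

    Hypothetical criterion (not from the thesis): a trigram is flagged
    when its relative frequency in the OCR text is at least `min_ratio`
    times its add-one-smoothed relative frequency in the reference text.
    """
    ocr = trigram_counts(ocr_text)
    ref = trigram_counts(reference_text)
    ocr_total = sum(ocr.values()) or 1
    ref_total = sum(ref.values()) or 1
    vocab = len(set(ocr) | set(ref))

    flagged = []
    for trigram, n in ocr.items():
        if n < min_count:
            continue  # too rare in the OCR text to judge
        p_ocr = n / ocr_total
        # Add-one smoothing so unseen reference trigrams do not divide by zero.
        p_ref = (ref[trigram] + 1) / (ref_total + vocab)
        ratio = p_ocr / p_ref
        if ratio >= min_ratio:
            flagged.append((trigram, n, ratio))
    # Most over-represented trigrams first.
    return sorted(flagged, key=lambda item: -item[2])
```

Trigrams flagged this way would only be candidates for inspection. Turning them into correction rules (for example, rewriting "rn" as "m" in a given context) is a separate, manual step in the workflow the abstract describes.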

