December 01, 2016
A simple, effective post-processing OCR improvement.
- Waldstein R.
High-quality optical character recognition (OCR) is very important for search and retrieval of scanned materials. This paper presents a simple, low-cost post-processing method that significantly improves OCR output. In particular, it improves the text, and thus the retrieval results, by repairing words that the OCR has broken up by incorrectly inserting spaces between their letters.
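
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible way to repair space-split words, not the paper's actual method. It assumes a caller-supplied word list (`vocabulary`) and joins a run of adjacent fragments when their concatenation is a known word and at least one fragment is not a word on its own. The function name `repair_split_words` and the `max_fragments` limit are hypothetical, chosen for illustration.

```python
def repair_split_words(text, vocabulary, max_fragments=4):
    """Rejoin words that OCR split with spurious spaces.

    Illustrative assumption: `vocabulary` is a set of lowercase known words.
    A run of 2..max_fragments adjacent tokens is merged when the joined
    string is in the vocabulary and not every fragment is itself a word.
    """
    tokens = text.split()
    repaired = []
    i = 0
    while i < len(tokens):
        merged = None
        # Try the longest run of fragments first, down to pairs.
        for n in range(min(max_fragments, len(tokens) - i), 1, -1):
            fragments = tokens[i:i + n]
            candidate = "".join(fragments)
            all_real_words = all(f.lower() in vocabulary for f in fragments)
            if candidate.lower() in vocabulary and not all_real_words:
                merged = candidate
                i += n
                break
        if merged is None:
            # No valid merge starts here; keep the token unchanged.
            merged = tokens[i]
            i += 1
        repaired.append(merged)
    return " ".join(repaired)


vocabulary = {"optical", "character", "recognition", "is", "important",
              "in", "put", "input"}
fixed = repair_split_words("opt ical char acter recognition is imp ortant",
                           vocabulary)
print(fixed)
```

The check that not every fragment is already a word keeps legitimate sequences such as "in put" from being collapsed into "input". This sketch ignores punctuation attached to tokens and normalizes all whitespace to single spaces, both of which a real pipeline would need to handle.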