Output details
11 - Computer Science and Informatics
University of Salford
Word-Based Adaptive OCR for Historical Books
<17>This output is one of the major results of the multi-million IMPACT research project, actively involving industry and academia, in improving OCR performance for large-scale digitization of historical documents. For books (the majority of world-library holdings), the proposed OCR architecture supports a recognition system that can train itself as it progresses through the pages of a book. This is an important requirement for large-scale digitization, where human input is impractical and very costly, and where material is printed using a variety of archaic conventions and fonts. Experiments with material from major European libraries demonstrate a significant improvement in recognition rate using this approach.