Full Text Available

Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance

Thesis (MEng)--Stellenbosch University, 2023.

Bibliographic Details
Main Author:	Kleinhans, Ryno
Other Authors:	Nel, Gerrit Stephanus
Format:	Thesis
Language:	en_ZA
Published:	Stellenbosch : Stellenbosch University 2023
Subjects:	Optical character recognition Document imaging systems Computer vision

_version_	1867613900621479936
access_status_str	Open Access
author	Kleinhans, Ryno
author2	Nel, Gerrit Stephanus
author_browse	Kleinhans, Ryno Nel, Gerrit Stephanus
author_facet	Nel, Gerrit Stephanus Kleinhans, Ryno
author_sort	Kleinhans, Ryno
collection	Thesis
dc_rights_str_mv	Stellenbosch University
description	Thesis (MEng)--Stellenbosch University, 2023.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/127090
institution	Stellenbosch University (South Africa)
language	en_ZA en_ZA
last_indexed	2026-06-10T12:43:29.841Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2023
publishDateRange	2023
publishDateSort	2023
publisher	Stellenbosch : Stellenbosch University
publisherStr	Stellenbosch : Stellenbosch University
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/127090 Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance Kleinhans, Ryno Nel, Gerrit Stephanus Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. Optical character recognition Document imaging systems Computer vision Thesis (MEng)--Stellenbosch University, 2023. ENGLISH ABSTRACT: A characteristic trait of the age of digitalisation is the ubiquitous transition from paper-reliant and manual-based business processes to completely digital, computer-assisted and automated versions thereof. Although many industries have already commenced with this transition away from paper documents, several real-world information chains are still intertwined with downstream paper-based systems. Some of these systems might require several decades to transition into a fully digital version thereof. Consequently, in order to fully automate these processes, the paper-based documents ought to be digitised. Computerised approaches, e.g. optical character recognition engines, have achieved notable success in accurately extracting and transforming pixel-based information into machine-encoded information. The algorithmic performance of these engines is, however, reliant on the quality of the captured document images. Although there are a plethora of image enhancement techniques designed to increase image quality, the implementation of some of these techniques involves a large degree of dependency on human cognition as each document image requires a unique set of preprocessing steps. Accordingly, the application of data-driven approaches from the realm of machine learning — more specifically, deep learning — certainly warrants consideration within the presented context. In this thesis, a generic framework for intelligent document image enhancement for improved optical character recognition is proposed.
The focus of the framework is placed on facilitating the text extraction procedure of document images by automating the preprocessing stage by means of intelligently identifying the best combination of document image enhancement techniques to implement in respect of individual (document) images. Powerful approaches from the domain of computer vision, together with the implementation of transfer learning, are considered. An instantiation of this framework is, first, implemented on a benchmark document analysis data set. Subsequently, the framework is applied to a real-world case study in the South African banking sector in order to illustrate the practical workability of the framework. During both instantiations, the models developed by means of the framework are shown to improve the optical character recognition accuracy of the document images. AFRIKAANS OPSOMMING: ’n Kenmerkende eienskap van die era van digitalisering is die alomteenwoordige oorgang van papier-afhanklike en handgebaseerde besigheidsprosesse na volledig digitale, rekenaargesteunde en outomatiese weergawes daarvan. Alhoewel baie nywerhede reeds begin het met hierdie oorgang weg van papierdokumente, is verskeie werklike inligtingskettings steeds verweef met stroomaf papiergebaseerde stelsels. Sommige van hierdie stelsels kan ’n paar dekades benodig om oor te skakel na ’n volledig digitale weergawe daarvan. Gevolglik, om hierdie prosesse ten volle te outomatiseer, behoort die papiergebaseerde dokumente gedigitaliseer te word. Gerekenariseerde benaderings, e.g. optiese karakterherkenningsenjins, het noemenswaardige sukses behaal om pixel-gebaseerde inligting akkuraat in masjien-gekodeerde inligting te onttrek. Die werkverrigting van hierdie enjins is egter afhanklik van die kwaliteit van die vasgelêde dokumentbeelde.
Alhoewel daar ’n oorvloed van beeldverbeteringstegnieke is wat ontwerp is om beeldkwaliteit te verhoog, behels die implementering van sommige van hierdie tegnieke ’n groot mate van afhanklikheid van menslike insig aangesien elke dokumentbeeld ’n unieke stel voorverwerkingstappe vereis. Gevolglik regverdig die toepassing van data-gedrewe benaderings uit die gebied van masjienleer — meer spesifiek, diep leer — beslis oorweging binne die voorgestelde konteks. In hierdie tesis word ’n generiese raamwerk vir intelligente dokumentbeeldverbetering vir verbeterde optiese karakterherkenning voorgestel. Die fokus van die raamwerk word geplaas op die fasilitering van die teksonttrekkingsprosedure van dokumentbeelde deur die voorverwerkingstadium te outomatiseer deur middel van intelligente identifisering van watter kombinasie van dokumentbeeldverbeteringstegnieke om ten opsigte van individuele beelde te implementeer. Kragtige benaderings vanuit die rigting van rekenaarvisie, tesame met die implementering van oordragleer, word oorweeg. ’n Instansiasie van hierdie raamwerk word eerstens geïmplementeer op ’n maatstafdokumentanalise datastel. Daarna word die raamwerk toegepas op ’n werklike gevallestudie in die Suid-Afrikaanse banksektor om die praktiese werkbaarheid van die raamwerk te illustreer. Tydens beide instansiasies word die modelle wat deur middel van die raamwerk ontwikkel is, gewys om die optiese karakterherkenning akkuraatheid van die dokumentbeelde te verbeter. Masters 2023-02-06T14:37:46Z 2023-05-18T07:03:49Z 2023-02-06T14:37:46Z 2023-05-18T07:03:49Z 2023-03 Thesis http://hdl.handle.net/10019.1/127090 en_ZA en_ZA Stellenbosch University xx, 178 pages : illustrations. application/pdf Stellenbosch : Stellenbosch University
spellingShingle	Optical character recognition Document imaging systems Computer vision Kleinhans, Ryno Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance
title	Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance
title_full	Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance
title_fullStr	Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance
title_full_unstemmed	Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance
title_short	Towards a framework for intelligent document image enhancement in pursuit of improved OCR performance
title_sort	towards a framework for intelligent document image enhancement in pursuit of improved ocr performance
topic	Optical character recognition Document imaging systems Computer vision
url	http://hdl.handle.net/10019.1/127090
work_keys_str_mv	AT kleinhansryno towardsaframeworkforintelligentdocumentimageenhancementinpursuitofimprovedocrperformance
