A workflow for geocoding South African addresses

There are many industries that have long been utilizing Geographical Information Systems (GIS) for spatial analysis. In many parts of the world, it has gained less popularity because of inaccurate geocoding methods and a lack of data standardization. Commercial services can also be expensive and as...

Bibliographic Details
Main Author: Van Rensburg, Alexandria
Other Authors: Berman, Sonia
Format: Thesis
Language: English
Published: Department of Computer Science 2016
Subjects: Computer Science
_version_ 1867614205476077568
access_status_str Open Access
author Van Rensburg, Alexandria
author2 Berman, Sonia
author_browse Berman, Sonia
Van Rensburg, Alexandria
author_facet Berman, Sonia
Van Rensburg, Alexandria
author_sort Van Rensburg, Alexandria
collection Thesis
description There are many industries that have long been utilizing Geographical Information Systems (GIS) for spatial analysis. In many parts of the world, it has gained less popularity because of inaccurate geocoding methods and a lack of data standardization. Commercial services can also be expensive and, as such, smaller businesses have been reluctant to make a financial commitment to spatial analytics. This thesis discusses the challenges specific to South Africa as well as the challenges inherent in bad address data. The main goal of this research is to highlight the potential error rates of geocoded user-captured address data and to provide a workflow that can be followed to reduce the error rate without intensive manual data cleansing. We developed a six-step workflow and software package to prepare address data for spatial analysis and determine the potential error rate. We used three methods of geocoding: a gazetteer postal code file, a free web API and an international commercial product. To protect the privacy of the clients and the businesses, addresses were aggregated with precision to a postcode or suburb centroid. Geocoding results were analysed before and after each step. Two businesses were analysed: a mid- to large-scale business with a large structured client address database and a small private business with a 20-year-old unstructured client address database. The companies are from two completely different industries, the larger being in the financial industry and the smaller company an independent magazine in publishing.
format Thesis
id oai:open.uct.ac.za:11427/16198
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:48:20.718Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2016
publishDateRange 2016
publishDateSort 2016
publisher Department of Computer Science
publisherStr Department of Computer Science
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/16198 A workflow for geocoding South African addresses Van Rensburg, Alexandria Berman, Sonia Computer Science There are many industries that have long been utilizing Geographical Information Systems (GIS) for spatial analysis. In many parts of the world, it has gained less popularity because of inaccurate geocoding methods and a lack of data standardization. Commercial services can also be expensive and as such, smaller businesses have been reluctant to make a financial commitment to spatial analytics. This thesis discusses the challenges specific to South Africa as well as the challenges inherent in bad address data. The main goal of this research is to highlight the potential error rates of geocoded user-captured address data and to provide a workflow that can be followed to reduce the error rate without intensive manual data cleansing. We developed a six step workflow and software package to prepare address data for spatial analysis and determine the potential error rate. We used three methods of geocoding: a gazetteer postal code file, a free web API and an international commercial product. To protect the privacy of the clients and the businesses, addresses were aggregated with precision to a postcode or suburb centroid. Geocoding results were analysed before and after each step. Two businesses were analysed, a mid-large scale business with a large structured client address database and a small private business with a 20 year old unstructured client address database. The companies are from two completely different industries, the larger being in the financial industry and the smaller company an independent magazine in publishing. 2016-01-02T05:21:50Z 2016-01-02T05:21:50Z 2015 Master Thesis Masters MPhil http://hdl.handle.net/11427/16198 eng application/pdf Department of Computer Science Faculty of Science University of Cape Town
spellingShingle Computer Science
Van Rensburg, Alexandria
A workflow for geocoding South African addresses
thesis_degree_str Master's
title A workflow for geocoding South African addresses
title_full A workflow for geocoding South African addresses
title_fullStr A workflow for geocoding South African addresses
title_full_unstemmed A workflow for geocoding South African addresses
title_short A workflow for geocoding South African addresses
title_sort workflow for geocoding south african addresses
topic Computer Science
url http://hdl.handle.net/11427/16198
work_keys_str_mv AT vanrensburgalexandria aworkflowforgeocodingsouthafricanaddresses
AT vanrensburgalexandria workflowforgeocodingsouthafricanaddresses