Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa

Data has become an essential commodity in this day and age. Organisations want to share the massive amounts of data that they collect as a way to leverage and grow their businesses. On the other hand, the need to maintain privacy is critical in order to avoid the release of sensitive information. Th...

Bibliographic Details
Main Author: Chetty, Nirvashnee
Other Authors: Hutchison, Andrew
Format: Thesis
Language: English
Published: University of Cape Town 2020
Subjects: Computer Science; Data Anonymisation
_version_ 1867613312093519872
access_status_str Open Access
author Chetty, Nirvashnee
author2 Hutchison, Andrew
author_browse Chetty, Nirvashnee
Hutchison, Andrew
author_facet Hutchison, Andrew
Chetty, Nirvashnee
author_sort Chetty, Nirvashnee
collection Thesis
description Data has become an essential commodity in this day and age. Organisations want to share the massive amounts of data that they collect as a way to leverage and grow their businesses. On the other hand, the need to maintain privacy is critical in order to avoid the release of sensitive information. This has been shown to be a constant challenge, namely the trade-off between preserving privacy and data utility [1]. This study performs an evaluation of privacy models together with their relevant tools and techniques to ascertain whether data can be anonymised in such a way that it can be in compliance with the Protection of Personal Information (POPI) Act and preserve the privacy of individuals. The results of this research should provide a practical solution for organisations in South Africa to adequately anonymise customer data to ensure POPI Act compliance with the use of a software tool. An experimental environment was set up with the ARX de-identification tool as the tool of choice to implement the privacy models. Two privacy models, namely k-anonymity and l-diversity, were tested on a publicly available data set. Data quality models as well as privacy risk measures were implemented. The results of the study showed that when taking both data utility and privacy risks into consideration, neither privacy model was the clear winner. The k-anonymity privacy model was a better choice for data utility, whereas the l-diversity privacy model was a better choice for privacy preservation by reducing re-identification risks. Therefore, in relation to the aim of the study, which is to compare the results of data anonymisation to ensure that data privacy needs are met more than data utility, the result showed that the l-diversity privacy model was the preferred model.
Finally, considering that the POPI Act is still awaiting the final step to be promulgated, there is time to conduct further experiments in the various ways to practically implement and apply data anonymisation techniques in the day-to-day processing of data and information in South Africa.
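The two privacy models named in the description can be illustrated with a minimal sketch. This is not the ARX implementation and not the thesis's experimental setup; it only checks, on a small made-up table, how k-anonymity and the simplest ("distinct") variant of l-diversity are measured. The column names (`age`, `zip`, `diagnosis`), the sample rows, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical, already-generalised records (not data from the thesis).
ROWS = [
    {"age": "30-39", "zip": "80**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "80**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "80**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "77**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "77**", "diagnosis": "flu"},
]
QUASI_IDENTIFIERS = ["age", "zip"]  # assumed quasi-identifying columns


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row)
    return classes


def k_anonymity(rows, quasi_identifiers):
    """Return k: the size of the smallest equivalence class."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len(group) for group in classes.values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Return l for distinct l-diversity: the smallest number of
    distinct sensitive values found in any equivalence class."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in classes.values())


def max_reidentification_risk(rows, quasi_identifiers):
    """Worst-case risk under a simple model: 1 / smallest class size."""
    return 1 / k_anonymity(rows, quasi_identifiers)


if __name__ == "__main__":
    print("k =", k_anonymity(ROWS, QUASI_IDENTIFIERS))
    print("l =", l_diversity(ROWS, QUASI_IDENTIFIERS, "diagnosis"))
    print("max risk =", max_reidentification_risk(ROWS, QUASI_IDENTIFIERS))
```

A table can satisfy k-anonymity while every record in a class shares the same sensitive value, which is the gap l-diversity addresses. Requiring a higher l typically forces more generalisation, which is consistent with the utility versus privacy trade-off the abstract reports.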
format Thesis
id oai:open.uct.ac.za:11427/32448
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:34:08.683Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2020
publishDateRange 2020
publishDateSort 2020
publisher University of Cape Town
publisherStr University of Cape Town
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/32448 Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa Chetty, Nirvashnee Hutchison, Andrew Computer Science Data Anonymisation Data has become an essential commodity in this day and age. Organisations want to share the massive amounts of data that they collect as a way to leverage and grow their businesses. On the other hand, the need to maintain privacy is critical in order to avoid the release of sensitive information. This has been shown to be a constant challenge, namely the trade-off between preserving privacy and data utility [1]. This study performs an evaluation of privacy models together with their relevant tools and techniques to ascertain whether data can be anonymised in such a way that it can be in compliance with the Protection of Personal Information (POPI) Act and preserve the privacy of individuals. The results of this research should provide a practical solution for organisations in South Africa to adequately anonymise customer data to ensure POPI Act compliance with the use of a software tool. An experimental environment was set up with the ARX de-identification tool as the tool of choice to implement the privacy models. Two privacy models, namely k-anonymity and l-diversity, were tested on a publicly available data set. Data quality models as well as privacy risk measures were implemented. The results of the study showed that when taking both data utility and privacy risks into consideration, neither privacy model was the clear winner. The k-anonymity privacy model was a better choice for data utility, whereas the l-diversity privacy model was a better choice for privacy preservation by reducing re-identification risks.
Therefore, in relation to the aim of the study which is to compare the results of data anonymisation to ensure that data privacy needs are met more than data utility, the result showed that the l-diversity privacy model was the preferred model. Finally, considering that the POPI Act is still awaiting the final step to be promulgated, there is time to conduct further experiments in the various ways to practically implement and apply data anonymisation techniques in the day-to-day processing of data and information in South Africa. 2020-12-30T10:17:54Z 2020-12-30T10:17:54Z 2020 Master Thesis Masters MSc http://hdl.handle.net/11427/32448 eng application/pdf University of Cape Town Department of Computer Science Faculty of Science
spellingShingle Computer Science
Data Anonymisation
Chetty, Nirvashnee
Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa
thesis_degree_str Master's
title Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa
title_full Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa
title_fullStr Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa
title_full_unstemmed Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa
title_short Privacy preserving data anonymisation: an experimental examination of customer data for POPI compliance in South Africa
title_sort privacy preserving data anonymisation an experimental examination of customer data for popi compliance in south africa
topic Computer Science
Data Anonymisation
url http://hdl.handle.net/11427/32448
work_keys_str_mv AT chettynirvashnee privacypreservingdataanonymisationanexperimentalexaminationofcustomerdataforpopicomplianceinsouthafrica