A study of fairness in machine learning in the presence of missing values

Thesis (MCom)--Stellenbosch University, 2023.

Bibliographic Details
Main Author: Bhatti, Aeysha Aziz
Other Authors: Sandrock, Trudy
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2023
Subjects: Machine learning; Big data; Machine learning -- Mathematical models; Computer algorithms; UCTD
_version_ 1867613767866515456
access_status_str Open Access
author Bhatti, Aeysha Aziz
author2 Sandrock, Trudy
author_browse Bhatti, Aeysha Aziz
Sandrock, Trudy
author_facet Sandrock, Trudy
Bhatti, Aeysha Aziz
author_sort Bhatti, Aeysha Aziz
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MCom)--Stellenbosch University, 2023.
format Thesis
id oai:scholar.sun.ac.za:10019.1/128499
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:41:23.238Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2023
publishDateRange 2023
publishDateSort 2023
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/128499
A study of fairness in machine learning in the presence of missing values
Bhatti, Aeysha Aziz
Sandrock, Trudy
Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.
Machine learning
Big data
Machine learning -- Mathematical models
Computer algorithms
UCTD
Thesis (MCom)--Stellenbosch University, 2023.
ENGLISH SUMMARY: Fairness of Machine Learning algorithms is a topic that is receiving increasing attention, as more and more algorithms permeate the day-to-day aspects of our lives. One way in which bias can manifest in a data source is through missing values. If data are missing, these data are often assumed to be missing completely randomly, but usually this is not the case. In reality, the propensity of data being missing is often tied to socio-economic status or demographic characteristics of individuals. There is very limited research into how missing values and missing value handling methods can impact the fairness of an algorithm. In this research, we conduct a systematic study starting from the foundational questions of how the data are missing, how the missing data are dealt with and how this impacts fairness, based on the outcome of a few different types of machine learning algorithms. Most researchers, when dealing with missing data, either apply listwise deletion or tend to use the simpler methods of imputation versus the more complex ones. We study the impact of these simpler methods on the fairness of algorithms. Our results show that the missing data mechanism and missing data handling procedure can impact the fairness of an algorithm, and that under certain conditions the simpler imputation methods can sometimes be beneficial in decreasing discrimination.
AFRIKAANS SUMMARY: The fairness of machine learning algorithms is a topic that is receiving increasing attention, as more and more algorithms permeate every aspect of our daily lives. One way in which bias can manifest in a data source is through missing values. When data are missing, it is often assumed that they are missing completely at random, but this is usually not the case. In reality, the propensity for data to be missing is often [...] rather than more complex methods when they are confronted with missing values. We investigate the impact of these simpler methods on the fairness of algorithms. Our results show that the underlying missing value mechanism and the procedure for handling missing values can influence the fairness of an algorithm, and that under certain conditions the simpler imputation methods can sometimes help to reduce discrimination.
Masters
2023-03-01T08:54:25Z 2023-08-30T13:07:36Z 2023-03-01T08:54:25Z 2023-08-31T09:18:53Z 2023-03-01T08:54:25Z 2023-08-31T09:18:53Z
2023-03
Thesis
https://scholar.sun.ac.za/handle/10019.1/128499
en_ZA
Stellenbosch University
application/pdf
xi, 125 pages : illustrations, includes annexures
application/pdf
Stellenbosch : Stellenbosch University
spellingShingle Machine learning
Big data
Machine learning -- Mathematical models
Computer algorithms
UCTD
Bhatti, Aeysha Aziz
A study of fairness in machine learning in the presence of missing values
title A study of fairness in machine learning in the presence of missing values
title_full A study of fairness in machine learning in the presence of missing values
title_fullStr A study of fairness in machine learning in the presence of missing values
title_full_unstemmed A study of fairness in machine learning in the presence of missing values
title_short A study of fairness in machine learning in the presence of missing values
title_sort study of fairness in machine learning in the presence of missing values
topic Machine learning
Big data
Machine learning -- Mathematical models
Computer algorithms
UCTD
url https://scholar.sun.ac.za/handle/10019.1/128499
work_keys_str_mv AT bhattiaeyshaaziz astudyoffairnessinmachinelearninginthepresenceofmissingvalues
AT bhattiaeyshaaziz studyoffairnessinmachinelearninginthepresenceofmissingvalues