Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data

Insurance underwriting can be time-consuming and costly for both insurers and customers. However, the insight gained is of critical importance in addressing the information asymmetry between insurers and customers in terms of establishing a customer's risk profile. Consequently, any test that assist...

Bibliographic Details
Main Author:	Perumal, Yevashan
Other Authors:	Britz, Stefan
Format:	Thesis
Language:	Eng
Published:	Department of Statistical Sciences 2024
Subjects:	Statistical Sciences

_version_	1867613182662541312
access_status_str	Open Access
author	Perumal, Yevashan
author2	Britz, Stefan
author_browse	Britz, Stefan Perumal, Yevashan
author_facet	Britz, Stefan Perumal, Yevashan
author_sort	Perumal, Yevashan
collection	Thesis
description	Insurance underwriting can be time-consuming and costly for both insurers and customers. However, the insight gained is of critical importance in addressing the information asymmetry between insurers and customers in terms of establishing a customer's risk profile. Consequently, any test that assists in providing a risk assessment is critical in allowing insurance companies to manage risk and price their products appropriately. Gamma-glutamyl Transferase (GGT) is an enzyme which has been used by insurers in underwriting medical tests as an indicator of potential adverse outcomes. However, due to complexities such as differing underwriting strategies, data collection and data storage issues, not every customer on an insurer's books will have a GGT value or even a complete data profile. This research investigates if statistical techniques such as imputation and supervised learning can be used in conjunction with available medical, demographic, underwriting and policy data to accurately predict GGT values. A combination of multivariate imputation by chained equations (MICE) and extreme gradient boosted trees (XGBoost) offers a 31% improvement in accuracy compared to a naïve prediction. However, there does appear to be a limit to the performance achieved from all implemented techniques with the analysed dataset, with various model combinations yielding root mean squared error (RMSE) values within a narrow range. In addition, when comparing the predictions from a separate, unlabelled dataset to actual data, it appears as though predictions from the models cannot be reliably deemed to be from the same distribution. This indicates that further research is required before insurers can reliably switch out blood-work based GGT results for those from a supervised learning model. Keywords: insurance, underwriting, gamma-glutamyl transferase, imputation, supervised learning
format	Thesis
id	oai:open.uct.ac.za:11427/39916
institution	University of Cape Town (South Africa)
language	Eng
last_indexed	2026-06-10T12:32:05.102Z
license_str	Not specified — see source repository
provenance_str_mv	Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate	2024
publishDateRange	2024
publishDateSort	2024
publisher	Department of Statistical Sciences
publisherStr	Department of Statistical Sciences
record_format	dspace
source_str	UCTD — University of Cape Town Open Access Repository
spelling	oai:open.uct.ac.za:11427/39916 Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data Perumal, Yevashan Britz, Stefan Statistical Sciences Insurance underwriting can be time-consuming and costly for both insurers and customers. However, the insight gained is of critical importance in addressing the information asymmetry between insurers and customers in terms of establishing a customer's risk profile. Consequently, any test that assists in providing a risk assessment is critical in allowing insurance companies to manage risk and price their products appropriately. Gamma-glutamyl Transferase (GGT) is an enzyme which has been used by insurers in underwriting medical tests as an indicator of potential adverse outcomes. However, due to complexities such as differing underwriting strategies, data collection and data storage issues, not every customer on an insurer's books will have a GGT value or even a complete data profile. This research investigates if statistical techniques such as imputation and supervised learning can be used in conjunction with available medical, demographic, underwriting and policy data to accurately predict GGT values. A combination of multivariate imputation by chained equations (MICE) and extreme gradient boosted trees (XGBoost) offers a 31% improvement in accuracy compared to a naïve prediction. However, there does appear to be a limit to the performance achieved from all implemented techniques with the analysed dataset, with various model combinations yielding root mean squared error (RMSE) values within a narrow range. In addition, when comparing the predictions from a separate, unlabelled dataset to actual data, it appears as though predictions from the models cannot be reliably deemed to be from the same distribution. This indicates that further research is required before insurers can reliably switch out blood-work based GGT results for those from a supervised learning model. 
Keywords: insurance, underwriting, gamma-glutamyl transferase, imputation, supervised learning 2024-06-19T07:22:12Z 2024-06-19T07:22:12Z 2023 2024-06-06T14:24:12Z Thesis / Dissertation Masters MSc http://hdl.handle.net/11427/39916 Eng application/pdf Department of Statistical Sciences Faculty of Science
spellingShingle	Statistical Sciences Perumal, Yevashan Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data
thesis_degree_str	Master's
title	Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data
title_full	Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data
title_fullStr	Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data
title_full_unstemmed	Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data
title_short	Applying imputation and statistical learning to predict gamma-glutamyl transferase in underwriting data
title_sort	applying imputation and statistical learning to predict gamma glutamyl transferase in underwriting data
topic	Statistical Sciences
url	http://hdl.handle.net/11427/39916
work_keys_str_mv	AT perumalyevashan applyingimputationandstatisticallearningtopredictgammaglutamyltransferaseinunderwritingdata

Similar Items