Case mix and coding error detection in Western Cape healthcare facilities

South Africa has a two-tier structure for the delivery of hospital and health care services: the public sector and the private sector. The private sector is known for having better service quality, cost, and data management. The Clinton Health Access Initiative (CHAI) has been supporting the first s...

Bibliographic Details
Main Author: Narayan, Saiheal
Other Authors: Ngwenya, Mzabalazo
Format: Thesis
Language: English
Published: Department of Statistical Sciences 2025
Subjects: Statistical Sciences
access_status_str Open Access
author Narayan, Saiheal
author2 Ngwenya, Mzabalazo
author_browse Narayan, Saiheal
Ngwenya, Mzabalazo
author_facet Ngwenya, Mzabalazo
Narayan, Saiheal
author_sort Narayan, Saiheal
collection Thesis
description South Africa has a two-tier structure for the delivery of hospital and health care services: the public sector and the private sector. The private sector is known for having better service quality, cost management, and data management. The Clinton Health Access Initiative (CHAI) has been supporting the first steps towards using Diagnosis Related Groups (DRGs) to categorise hospitalisation costs in public health facilities in South Africa. DRGs are widely used in the private sector for active cost management. Additionally, the on-site clinical coding audit report for the public hospitals managed by the Western Cape Department of Health raised an issue that must be addressed. This dissertation applies case mix adjustment to hospitals in the Western Cape based on DRG weights from the private sector. A DRG weight represents the average resources required to care for cases in that particular DRG, relative to the average resources used to treat cases in all DRGs. This is then compared to another metric that uses actual length of stay data from the public sector, which acts as a proxy for resource utilisation (Fetter, Shin, Freeman, Averill, and Thompson, 1980). The objective is to find out whether case mix helps to identify hospitals that take on highly resource-intensive procedures on average. Using case mix in the public sector could allow for optimised resourcing. The second part looks at generating classification models that will be used to flag diagnosis coding errors by healthcare staff in the Western Cape. Patient-level data were used, including length of stay, procedures, and cost centre. Methods used to classify diagnoses include neural networks, multinomial logistic regression, random forests, SMOTE (Synthetic Minority Over-sampling Technique), and finally an ensemble of the top three models using majority voting. These models are able to handle multiple response categories. The aim of the error detection model is to increase data quality in the public sector. The results showed that the DRG weights from the private sector might not be appropriate for the public health sector. Next, it was shown that the best predictive model for diagnosis was a random forest with an accuracy of 57% on the unseen test dataset. Lastly, through the explanatory analysis, this dissertation identified both qualitative and quantitative relationships in the data that could open up avenues for more research and development. These results can be used to help stakeholders make informed decisions and improve data quality in the public sector.
format Thesis
id oai:open.uct.ac.za:11427/41252
institution University of Cape Town (South Africa)
language Eng
last_indexed 2026-06-10T12:33:41.762Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2025
publisher Department of Statistical Sciences
publisherStr Department of Statistical Sciences
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/41252 Case mix and coding error detection in Western Cape healthcare facilities Narayan, Saiheal Ngwenya, Mzabalazo Silal Sheetal Statistical Sciences 2025-03-26T11:54:08Z 2025-03-26T11:54:08Z 2024 2025-03-26T11:13:42Z Thesis / Dissertation Masters MSc http://hdl.handle.net/11427/41252 Eng application/pdf Department of Statistical Sciences Faculty of Science
spellingShingle Statistical Sciences
Narayan, Saiheal
Case mix and coding error detection in Western Cape healthcare facilities
thesis_degree_str Master's
title Case mix and coding error detection in Western Cape healthcare facilities
topic Statistical Sciences
url http://hdl.handle.net/11427/41252