Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms.

Thesis (MSc)--Stellenbosch University, 2025.

Bibliographic Details
Main Author: Asare, Dorcas
Other Authors: Bah, B.
Format: Thesis
Language: English
Published: Stellenbosch : Stellenbosch University 2025
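Subjects: Hepatitis B virus -- Mortality -- Africa; Artificial intelligence -- Medical applications; Medical informatics; Machine learning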
_version_ 1867613858711994368
access_status_str Open Access
author Asare, Dorcas
author2 Bah, B.
author_browse Asare, Dorcas
Bah, B.
author_facet Bah, B.
Asare, Dorcas
author_sort Asare, Dorcas
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MSc)--Stellenbosch University, 2025.
format Thesis
id oai:scholar.sun.ac.za:10019.1/134498
institution Stellenbosch University (South Africa)
language English
last_indexed 2026-06-10T12:42:49.487Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2025
publishDateRange 2025
publishDateSort 2025
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/134498 Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms. Asare, Dorcas Bah, B. Ndow, G. Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Applied Mathematics Division. Hepatitis B virus -- Mortality -- Africa Artificial intelligence -- Medical applications Medical informatics Machine learning Thesis (MSc)--Stellenbosch University, 2025. Asare, D. 2025. Prediction of Hepatitis B in Adult Patients in The Gambia using Machine Learning Algorithms. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/47cfee6e-c38f-4854-90ed-25408027f782 ENGLISH ABSTRACT: Chronic hepatitis B (CHB) is a major global health problem, accounting for a significant proportion of premature death in Africa. The World Health Organisation has set ambitious targets to eliminate CHB as a public health threat by 2030. To achieve these elimination targets, we need to better understand the factors associated with disease progression and mortality among African adults living with CHB infection. Using the long-term prospective cohort of CHB adults set up by the PROLIFICA (Prevention of Liver Fibrosis and Cancer in Africa) program in The Gambia, my study aimed to predict HBV-related mortality among treated and treatment-naïve adults, and estimate and predict CHB disease progression among untreated adults with CHB infection. 
Using the longitudinal PROLIFICA data, which comprises comprehensive clinical and laboratory data collected annually from 2011 to 2023, I used machine learning algorithms (Decision Tree, K-Nearest Neighbors, Logistic Regression, Naive Bayes, Random Forest, and Support Vector Machines) to study the association between multiple features (biodata, epidemiological, clinical, virological, environmental) and key outcomes (mortality, disease progression), and to discover data patterns and biomarkers that can predict disease outcomes. The Random Forest (RF) and Naive Bayes (NB) models had the highest predictive score (0.95) for mortality among CHB-infected adults in The Gambia. The models found that a high baseline Fibroscan result (which was highly correlated with GGT and alkaline phosphatase), years of follow-up, albumin levels, presence of liver cirrhosis, and low platelet count were significant factors associated with mortality. The Logistic Regression (LR) model had the highest score (0.70) for predicting disease progression, with fibroscan results (highly correlated with GGT and alkaline phosphatase), years interval, haemoglobin levels, albumin levels, and living in the West Coast Region of The Gambia as the key associated features. Lastly, the Random Forest (RF) model had the highest score (0.94) in predicting treatment outcomes, with fibroscan results (MSTIFF), AST levels (highly correlated with ALT and total bilirubin), viral load greater than 2000 IU/mL, GGT levels, and elevated fasting blood sugar levels (glufast) as significant features associated with treatment outcome. This study provides reliable evidence that complements epidemiological studies to better understand the determinants of CHB outcomes in The Gambia. My findings support routine screening and early initiation of treatment to prevent poor clinical outcomes and mortality. 
The models and data patterns can be used in a clinical setting to better detect high-risk patients for prioritization. The effectiveness of these models should be tested in larger cohorts with more comprehensive data. AFRIKAANSE OPSOMMING: Chroniese hepatitis B (CHB) is ’n groot gesondheidsprobleem, wat verantwoordelik is vir ’n beduidende deel van voortydige sterftes in Afrika. Die Wêreldgesondheidsorganisasie het ambisieuse teikens gestel om CHB as ’n openbare gesondheidsbedreiging teen 2030 uit te skakel. Om hierdie uitskakelingsteikens te bereik, moet ons die faktore wat verband hou met siekteprogressie en mortaliteit onder Afrika-volwassenes wat met CHB-infeksie leef, beter verstaan. Deur gebruik te maak van die langtermyn-voornemende groep CHB-volwassenes wat deur die PROLIFICA (Prevention of Liver Fibrosis and Cancer in Africa)-program in Gambië opgestel is, het my studie ten doel gehad om HBV-verwante sterftes onder behandelde en behandelingsnaïewe volwassenes te voorspel, en om CHB-siektevordering onder onbehandelde volwassenes met CHB-infeksie te skat en te voorspel. Met behulp van die longitudinale PROLIFICA-data, wat bestaan uit omvattende kliniese en laboratoriumdata wat jaarliks van 2011 tot 2023 ingesamel is, het ek masjienleeralgoritmes (Besluitboom, K Naaste Bure, Logistiese Regressie, Naïewe Bayes, Random Forest en Ondersteuningsvektormasjiene) gebruik om die verband tussen veelvuldige kenmerke (biodata, epidemiologies, klinies, virologies, omgewings) en sleuteluitkomste (sterftes, siekteprogressie) te bestudeer, en om datapatrone en biomerkers te ontdek wat siekte-uitkomste kan voorspel. Die Random Forest (RF) en Naive Bayes (NB) modelle het die hoogste voorspellende telling van 0.95 gehad om mortaliteit onder CHB-besmette volwassenes in Gambië te voorspel. 
Die model het bevind dat hoë basislyn Fibroscan-resultaat (wat hoogs gekorreleer is met GGT en alkaliese fosfatase), jare van opvolg, albumienvlakke, teenwoordigheid van lewersirrose en lae bloedplaatjietelling betekenisvolle faktore was wat met mortaliteit verband hou. Die Logistic Regression (LR) model het die hoogste telling (0.70) gehad vir die voorspelling van siekteprogressie met beduidende faktore soos fibroscan-resultate (hoogs gekorreleer met GGT en alkaliese fosfatase), jare interval, hemoglobienvlakke, albumienvlakke, en woon in die Weskusstreek van Gambië as sleutelverwante kenmerke. Laastens het die Random Forest (RF) model die hoogste telling (0.94) gehad in die voorspelling van behandelingsuitkomste met fibroscan-resultate (MSTIFF), AST-vlakke (hoogs gekorreleer met ALT en totale bilirubien), virale lading groter as 2000 IU/mL, GGT-vlakke en verhoogde vastende bloedsuikervlakke (glufast) as beduidende kenmerke wat verband hou met swak behandelingsuitkomste. Hierdie studie verskaf betroubare bewyse wat epidemiologiese studies aanvul om die determinante van CHB-uitkomste in Gambië beter te verstaan. My bevindinge ondersteun roetine-sifting en vroeë aanvang van behandeling om swak kliniese uitkomste en mortaliteit te voorkom. Die modelle en datapatrone kan in ’n kliniese omgewing gebruik word om hoërisikopasiënte beter op te spoor vir prioritisering. Die doeltreffendheid van hierdie modelle moet in groter kohorte met meer omvattende data getoets word. Masters 2025-12-11T09:38:35Z 2025-12-11T09:38:35Z 2025-12 Thesis https://scholar.sun.ac.za/handle/10019.1/134498 en Stellenbosch University 131 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle Hepatitis B virus -- Mortality -- Africa
Artificial intelligence -- Medical applications
Medical informatics
Machine learning
Asare, Dorcas
Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms.
title Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms.
title_full Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms.
title_fullStr Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms.
title_full_unstemmed Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms.
title_short Prediction of hepatitis B in adult patients in the Gambia using machine learning algorithms.
title_sort prediction of hepatitis b in adult patients in the gambia using machine learning algorithms
topic Hepatitis B virus -- Mortality -- Africa
Artificial intelligence -- Medical applications
Medical informatics
Machine learning
url https://scholar.sun.ac.za/handle/10019.1/134498
work_keys_str_mv AT asaredorcas predictionofhepatitisbinadultpatientsinthegambiausingmachinelearningalgorithms