Visualising interpretability in random forests

Thesis (MCom)--Stellenbosch University, 2025.

Bibliographic Details
Main Author:	Manefeldt, Peter Cornelius
Other Authors:	Lamont, M. M. C.
Format:	Thesis
Published:	Stellenbosch : Stellenbosch University, 2025
Subjects:	Machine learning > Models; Biplots; Geometric data analysis; Correspondence analysis; UCTD

access_status_str	Open Access
collection	Thesis
dc_rights_str_mv	Stellenbosch University
description	Thesis (MCom)--Stellenbosch University, 2025.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/132647
institution	Stellenbosch University (South Africa)
last_indexed	2026-06-10T12:43:16.997Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2025
publisher	Stellenbosch : Stellenbosch University
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/132647 Visualising interpretability in random forests Manefeldt, Peter Cornelius Lamont, M. M. C. Lubbe, S. Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistical and Actuarial Science. Machine learning -- Models Biplots Geometric data analysis Correspondence analysis UCTD Thesis (MCom)--Stellenbosch University, 2025. Manefeldt, P. C. 2025. Visualising Interpretability in Random Forests. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/57483c04-5512-4dfb-b1b0-590be397e355 ENGLISH SUMMARY: Models built with predictive accuracy as their aim are often viewed as black-box models. So-called “black-box” models can capture highly complex nonlinear relationships with high-order interactions, but lack interpretability. The Random Forest is one such model, with decision boundaries described by its thousands of trees. Random Forests have been shown to incur a low generalisation error while needing little to no tuning by the user. Random Forest proximities and out-of-bag (OOB) Random Forest proximities act as two distinct similarity measures between observations. Multidimensional scaling (MDS) seeks a low-dimensional approximation of pairwise similarities that provides a visual representation of the similarity between observations. By applying MDS to the Random Forest proximity measure, Random Forest proximity plots are constructed. The Random Forest proximity plot shows how the observations in the sample are related from the model’s perspective, thus allowing us to “see through the eyes” of the black box. The MDS method under consideration is classical scaling, as this provides a transformation that can be used to embed new/hypothetical cases on the proximity plot. How would the model have viewed a given observation if one of its covariates were different?
We can answer this counterfactual question by embedding counterfactual observations into the proximity plot. These embedded counterfactual cases can be used to create trajectory axes. Case-based trajectory axes embedded in the proximity plot result in a Random Forest proximity biplot. As a special case of nonlinear biplots, this enables the exploration of the relationships uncovered by the model. Additionally, adding predictive axes creates a biplot that relates the model’s view of the observations back to the original variables. Further procedures use α-bags to visualise the sampling variability in the MDS procedure as well as the stability of the Random Forest proximity. AFRIKAANS SUMMARY (translated): Models with prediction accuracy as their main goal are often seen as “black-box” models. So-called “black-box” models are able to fit highly complex nonlinear relationships with high-order interactions, but suffer from a lack of interpretability. The Random Forest is one such model, with decision boundaries described by its thousands of trees. It has been shown that the Random Forest achieves a low generalisation error rate while requiring very little to no optimisation by the user. Random Forest proximity and out-of-bag (OOB) Random Forest proximity act as two distinct similarity measures between observations. Multidimensional scaling (MDS) seeks a low-dimensional approximation of pairwise similarities that can be represented visually. By applying MDS to the Random Forest proximity measure, the Random Forest proximity plot is created. The Random Forest proximity plot offers a “view through the eyes” of the black box, by showing the relationship between the observations in the sample from the model’s perspective. The MDS method used is classical scaling, since it provides a transformation that can be used to place new/hypothetical cases on the proximity plot.
How would the model view a given observation if one of its covariates were different? We can answer this counterfactual question by placing counterfactual observations in the proximity plot. These embedded counterfactual cases can be used to create trajectory axes. Case-based trajectory axes embedded in the proximity plot lead to a Random Forest biplot. As a special case of nonlinear biplots, this makes it possible to explore the relationships uncovered by the model. By adding prediction axes, a biplot is created that links the model’s view of the observations back to the original variables. Further procedures are added for visualising sampling variability and instability of the Random Forest proximity by means of α-bags. Masters 2025-06-12T08:55:18Z 2025-06-12T08:55:18Z 2025-03 Thesis https://scholar.sun.ac.za/handle/10019.1/132647 Stellenbosch University xvii, 125 pages : illustrations, includes annexures application/pdf Stellenbosch : Stellenbosch University
url	https://scholar.sun.ac.za/handle/10019.1/132647
work_keys_str_mv	AT manefeldtpetercornelius visualisinginterpretabilityinrandomforests
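
The construction described in the summary (Random Forest proximities, classical scaling, then embedding of a counterfactual case) can be sketched in Python with NumPy. This is an illustrative sketch under stated assumptions, not the thesis code: the function names, the toy `leaves` matrix and the counterfactual case are hypothetical; taking 1 - proximity as the squared distance is an assumed, though common, choice; and Gower's add-a-point formula is assumed as the transformation used to embed a new case. In practice the leaf indices would come from a fitted forest, for example from scikit-learn's `RandomForestClassifier.apply`.

```python
import numpy as np

def proximity_matrix(leaves):
    """Share of trees in which each pair of observations falls in the same terminal node."""
    leaves = np.asarray(leaves)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]   # shape (n, n, n_trees)
    return same_leaf.mean(axis=2)

def classical_scaling(sq_dist, n_dims=2):
    """Classical scaling of squared distances; returns coordinates, eigenvalues, diag(B)."""
    n = sq_dist.shape[0]
    centre = np.eye(n) - np.full((n, n), 1.0 / n)
    b = -0.5 * centre @ sq_dist @ centre          # double-centred inner-product matrix
    eigvals, eigvecs = np.linalg.eigh(b)          # eigh returns ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_dims]      # indices of the largest eigenvalues
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]
    coords = eigvecs * np.sqrt(eigvals)           # assumes the kept eigenvalues are positive
    return coords, eigvals, np.diag(b)

def embed_case(new_sq_dist, coords, eigvals, b_diag):
    """Gower's add-a-point formula: coordinates of a case from its squared distances to the sample."""
    return 0.5 * (coords.T @ (b_diag - new_sq_dist)) / eigvals

# Toy leaf indices (hypothetical): 6 observations (rows) by 4 trees (columns).
leaves = np.array([
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 2],
    [1, 2, 0, 2],
    [1, 2, 0, 3],
])

prox = proximity_matrix(leaves)
coords, eigvals, b_diag = classical_scaling(1.0 - prox)    # assumed: d^2 = 1 - proximity

# Hypothetical counterfactual case: the terminal node it reaches in each tree.
counterfactual_leaves = np.array([0, 1, 1, 2])
prox_to_sample = (leaves == counterfactual_leaves).mean(axis=1)
counterfactual_xy = embed_case(1.0 - prox_to_sample, coords, eigvals, b_diag)
print(counterfactual_xy)
```

Passing an existing observation's own squared distances to `embed_case` returns that observation's plotted coordinates, which gives a quick consistency check on the embedding. Repeating the embedding while one covariate is varied over a grid would trace out a trajectory of the kind the summary describes.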
