A genetic algorithm based model tree forest

Thesis (MEng)--Stellenbosch University, 2023.

Bibliographic Details
Main Author: Van der Merwe, Werner
Other Authors: Engelbrecht, A. P.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2023
Subjects: Ensemble learning (Machine learning); Genetic algorithms; Decision trees
_version_ 1867614022871810048
access_status_str Open Access
author Van der Merwe, Werner
author2 Engelbrecht, A. P.
author_browse Engelbrecht, A. P.
Van der Merwe, Werner
author_facet Engelbrecht, A. P.
Van der Merwe, Werner
author_sort Van der Merwe, Werner
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MEng)--Stellenbosch University, 2023.
format Thesis
id oai:scholar.sun.ac.za:10019.1/126939
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:45:26.037Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2023
publishDateRange 2023
publishDateSort 2023
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/126939 A genetic algorithm based model tree forest Van der Merwe, Werner Engelbrecht, A. P. Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. Ensemble learning (Machine learning) Genetic algorithms Decision trees Thesis (MEng)--Stellenbosch University, 2023. ENGLISH ABSTRACT: This thesis presents an ensemble approach that reduces the high variance error exhibited by model trees that comprise multivariate nonlinear models and increases the overall robustness of model trees. The ensemble approach is conceptualised, tuned for, and evaluated against competing regression ensemble models on ten separate benchmarking datasets. The ensemble, referred to as the model tree forest (MTF), incorporates a hybrid genetic algorithm approach to construct structurally optimal polynomial expressions (GASOPE) within the leaf nodes of greedily induced model trees that form the base learners of the ensemble. Bootstrap aggregation, together with the implementation of randomised feature space splits during tree induction, sufficiently decorrelates the base learners within the ensemble. Thereby, the variance error of MTF is reduced compared to that of a single model tree, whilst the favourable low bias error of model trees is retained. The multivariate nonlinear models that predict the output enable MTF to produce approximations of highly nonlinear data. The addition of ensembling methods passively combats overfitting brought forth by the increased model complexity, compared to the previous implementation of GASOPE within a tree structure, which exhibits overfitting in specific cases. MTF produced a similar predictive accuracy to the random forest (RF) method and outperformed an artificial feed-forward neural network ensemble, an ensemble of M5 model trees, and ensembled support vector regression models. However, the computational cost of the MTF induction algorithm is up to four orders of magnitude greater than that of RF.
AFRIKAANS OPSOMMING (English translation): This thesis proposes an ensemble method that reduces the high variance error of model trees with higher-order output functions. The ensemble method also increases the robustness of the model trees. This ensemble method is referred to as the "model tree forest" (MTF). MTF is fully described, tuned, and evaluated against other regression models on ten different datasets. MTF involves a genetic algorithm that grows optimal polynomial functions within the leaf nodes of the greedily induced model trees. The use of "bootstrap aggregation", together with randomly determined splits during the induction of a single model tree, ensures that the variance error of MTF is lowered. At the same time, the bias error of the model trees, which is already very low, is kept low. The polynomial functions that determine the predictive accuracy of MTF allow the combined model to approximate highly nonlinear data well. The polynomial functions also help to prevent the model trees from overfitting the dataset, which would otherwise be the case because of the high complexity that model trees possess. MTF delivered results on par with a random forest, and better results than an ensemble of neural networks, an ensemble of M5 model trees, and an ensemble of support vector regression models. The time cost of inducing an MTF model is up to four orders of magnitude greater than that of a random forest. Masters 2023-01-23T08:07:20Z 2023-05-18T06:56:43Z 2023-01-23T08:07:20Z 2023-05-18T06:56:43Z 2023-03 Thesis http://hdl.handle.net/10019.1/126939 en_ZA en_ZA Stellenbosch University xiv, 117 pages : illustrations. application/pdf Stellenbosch : Stellenbosch University
spellingShingle Ensemble learning (Machine learning)
Genetic algorithms
Decision trees
Van der Merwe, Werner
A genetic algorithm based model tree forest
title A genetic algorithm based model tree forest
title_full A genetic algorithm based model tree forest
title_fullStr A genetic algorithm based model tree forest
title_full_unstemmed A genetic algorithm based model tree forest
title_short A genetic algorithm based model tree forest
title_sort genetic algorithm based model tree forest
topic Ensemble learning (Machine learning)
Genetic algorithms
Decision trees
url http://hdl.handle.net/10019.1/126939
work_keys_str_mv AT vandermerwewerner ageneticalgorithmbasedmodeltreeforest
AT vandermerwewerner geneticalgorithmbasedmodeltreeforest