Landscape aware algorithm selection for feature selection

Thesis (PhD)--Stellenbosch University, 2023.

Bibliographic Details
Main Author: Mostert, Werner
Other Authors: Engelbrecht, Andries Petrus
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2023
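Subjects: Machine learning; Algorithms; Baseline fitness improvement; Feature selection; Instance space analysis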
_version_ 1867614005807284224
access_status_str Open Access
author Mostert, Werner
author2 Engelbrecht, Andries Petrus
author_browse Engelbrecht, Andries Petrus
Mostert, Werner
author_facet Engelbrecht, Andries Petrus
Mostert, Werner
author_sort Mostert, Werner
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (PhD)--Stellenbosch University, 2023.
format Thesis
id oai:scholar.sun.ac.za:10019.1/128812
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:45:09.986Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2023
publishDateRange 2023
publishDateSort 2023
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/128812 Landscape aware algorithm selection for feature selection Mostert, Werner Engelbrecht, Andries Petrus Malan, Katherine Mary Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Computer Science Division. Machine learning Algorithms Baseline fitness improvement Feature selection Instance space analysis Thesis (PhD)--Stellenbosch University, 2023. ENGLISH ABSTRACT: Feature selection is commonly applied as a pre-processing technique for machine learning to reduce the dimensionality of a problem by removing redundant and irrelevant features. Another desirable outcome of feature selection lies in the potential performance improvement of predictive models. The development of new feature selection algorithms is common within the field; however, relatively little research has historically been done to better understand the feature selection problem from a theoretical perspective. Researchers and practitioners in the field often rely on a trial-and-error strategy to decide which feature selection algorithm to use for a specific instance of a machine learning problem. This thesis contributes towards a better understanding of the complex feature selection problem by investigating the link between feature selection problem characteristics and the performance of feature selection algorithms. A variety of fitness landscape analysis techniques are used to gain insights into the structure of the feature selection fitness landscape. Performance complementarity for feature selection algorithms is empirically shown, emphasising the potential value of automated algorithm selection for feature selection algorithms. Towards the realisation of a landscape aware algorithm selector for feature selection, a novel performance metric for feature selection algorithms is presented. 
The baseline fitness improvement (BFI) performance metric is unbiased and can be used for comparative analysis across feature selection problem instances. The insights obtained via landscape analysis are used with other meta-features of datasets and the BFI performance measure to develop a new landscape aware algorithm selector for feature selection. The landscape aware algorithm selector provides a human-interpretable predictive model of the best feature selection algorithm for a specific dataset and classification problem. AFRIKAANS SUMMARY (translated): Feature selection is commonly applied as a pre-processing technique for machine learning to reduce the dimensionality of a problem by removing redundant and irrelevant features. Another desirable outcome of feature selection lies in the potential performance improvement of predictive models. The development of new feature selection algorithms is common within the field, but relatively little research has historically been done to better understand the feature selection problem from a theoretical perspective. Researchers and practitioners in the field often rely on a trial-and-error strategy to decide which feature selection algorithm to use for a specific instance of a problem. This thesis contributes to a better understanding of the complex feature selection problem by investigating the link between feature selection problem characteristics and the performance of feature selection algorithms. A variety of fitness landscape analysis techniques are used to gain insights into the structure of the feature selection fitness landscape. Performance complementarity for feature selection algorithms is shown empirically, emphasising the value of automated algorithm selection for feature selection algorithms. Towards the realisation of a landscape aware algorithm selector for feature selection, a new performance metric for feature selection algorithms is presented. 
The baseline fitness improvement (BFI) performance metric is unbiased and can be used for comparative analysis across feature selection problems. The insights obtained through landscape analysis are used together with the BFI performance metric to develop a new landscape aware algorithm selector for feature selection. The landscape aware algorithm selector provides previously unavailable guidance to researchers and practitioners for choosing a feature selection algorithm to deploy. Doctoral 2023-10-23T05:37:56Z 2024-01-08T12:01:50Z 2023-10-23T05:37:56Z 2024-01-08T12:01:50Z 2023-10 Thesis https://scholar.sun.ac.za/handle/10019.1/128812 en_ZA en_ZA Stellenbosch University xix, 160 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle Machine learning
Algorithms
Baseline fitness improvement
Feature selection
Instance space analysis
Mostert, Werner
Landscape aware algorithm selection for feature selection
title Landscape aware algorithm selection for feature selection
title_full Landscape aware algorithm selection for feature selection
title_fullStr Landscape aware algorithm selection for feature selection
title_full_unstemmed Landscape aware algorithm selection for feature selection
title_short Landscape aware algorithm selection for feature selection
title_sort landscape aware algorithm selection for feature selection
topic Machine learning
Algorithms
Baseline fitness improvement
Feature selection
Instance space analysis
url https://scholar.sun.ac.za/handle/10019.1/128812
work_keys_str_mv AT mostertwerner landscapeawarealgorithmselectionforfeatureselection