A hyperheuristic approach towards the training of artificial neural networks

Thesis (PhD)--Stellenbosch University, 2021.

Bibliographic Details
Main Author: Nel, Gerrit Stephanus
Other Authors: Van Vuuren, J. H.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2021
_version_ 1867614025591816192
access_status_str Open Access
author Nel, Gerrit Stephanus
author2 Van Vuuren, J. H.
author_browse Nel, Gerrit Stephanus
Van Vuuren, J. H.
author_facet Van Vuuren, J. H.
Nel, Gerrit Stephanus
author_sort Nel, Gerrit Stephanus
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (PhD)--Stellenbosch University, 2021.
format Thesis
id oai:scholar.sun.ac.za:10019.1/109792
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:45:28.762Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2021
publishDateRange 2021
publishDateSort 2021
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/109792 A hyperheuristic approach towards the training of artificial neural networks Nel, Gerrit Stephanus Van Vuuren, J. H. Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. Multi-objective optimisation UCTD Artificial neural networks Heuristic algorithms Machine learning Evolutionary computation Thesis (PhD)--Stellenbosch University, 2021. ENGLISH ABSTRACT: In 2015, approximately 2.5 × 10^18 bytes of data were generated on a daily basis. The enormity and nature of these data have laid bare the inadequacies of standard data analytic approaches. Researchers and practitioners have long been unequipped with the necessary means to extract insight from the vast amounts of data at their disposal - until now, that is. Recent advances within the domain of artificial intelligence have ushered in a new era, providing the essential connective tissue between data and analysis. These advances can be attributed to instrumental research conducted within the field of machine learning, research that has provided algorithms with the inherent ability to learn. A groundbreaking algorithm at the forefront of the current machine learning impetus is the artificial neural network. Artificial neural networks are computational models inspired by biological neural networks. This process of neurological emulation enables artificial neural networks to gain an ability intrinsic to their muse - i.e. to learn from experience. A characteristic that distinguishes this algorithm from other machine learning algorithms is the efficiency and effectiveness with which it can recognise complex patterns and abstractions within data. The process according to which this algorithm recognises patterns from data is called training and is arguably its most intriguing facet. Conventionally, the method of gradient descent (or steepest descent) is employed to find good network parameter values. 
A limitation is, however, imposed on the level of abstraction at which optimisation can thus transpire. A gradient-free approach offers a good alternative. More specifically, the research field of metaheuristics provides powerful optimisation techniques that are applicable in the context of training artificial neural networks. A metaheuristic optimisation approach allows for far greater freedom during artificial neural network training - the network weights, its structure, and its activation functions can be optimised concurrently. This versatility of metaheuristics, as well as their proven capability in many optimisation contexts, serves as justification for why they feature centrally in this dissertation. A challenge to all optimisation approaches, however, relates to the decision of which algorithm to employ for this purpose. Fortunately, the relatively new and promising field of hyperheuristics provides the necessary means to circumvent this challenge - a hyperheuristic is essentially a heuristic that chooses heuristics. The hyperheuristic considered in this dissertation is called the AMALGAM method. AMALGAM is a powerful and robust optimisation approach that delivers significant performance improvements (approaching a factor of ten), whilst enhancing the level of general applicability over various benchmark problems. This hyperheuristic has not been applied in the literature to the optimisation problem of training artificial neural networks in respect of their network weights, network structure, and activation functions concurrently. An AMALGAM-based hyperheuristic training algorithm is therefore proposed in this dissertation. The novelty of the problem under investigation, however, necessitates a new mathematical learning model. In addition, novel modifications in respect of AMALGAM are made so as to enable its use in neural network training. 
A bi-objective hyperheuristic training algorithm is designed, in which the main objective represents a novel network performance measure while a secondary so-called helper objective is incorporated to guide the search process. A test suite, comprising several data sets, is created in order to evaluate the efficacy of the proposed training algorithm. Three extensive parameter evaluations are performed so as to gain insight into algorithmic performance under different conditions. An in-depth algorithmic performance comparison is also performed during which the performance achieved by the proposed hyperheuristic training algorithm is compared with those of its constituent sub-algorithms. The robustness of the proposed approach is also validated by means of a meta-generalisation analysis. A comparison between the hyperheuristic training algorithm and powerful gradient-based training algorithms is performed which is supplemented by an investigation into the potential consolidation of the hyperheuristic approach with the best gradient-based algorithm. An in-depth investigation is launched into the temporal dynamics of the hyperheuristic's sub-algorithms with a view to gaining new insight into this novel approach towards training artificial neural networks and to predicting algorithmic performance. A demonstration of how the working of the hyperheuristic can be improved by means of the prediction model is also provided. The structural attributes related to favourable networks produced by the hyperheuristic are analysed with a view to gaining new insight into the working of the hyperheuristic. AFRIKAANSE OPSOMMING: In 2015 is daar ongeveer 2.5 × 10^18 grepe data op 'n daaglikse basis gegenereer. Die omvang en aard van hierdie data het die tekortkominge van standaard data-analitiese benaderings blootgelê. Navorsers en praktisyns het lank nie oor die nodige middele beskik om insig uit die groot hoeveelhede data tot hulle beskikking, te verkry nie - tot nou toe. 
As gevolg van onlangse vordering binne die vakgebied van kunsmatige intelligensie het 'n nuwe era aangebreek wat die nodige bindweefsel tussen data en analise verskaf. Hierdie vordering kan toegeskryf word aan instrumentele navorsing in die gebied van masjienleer, navorsing waarin algoritmes wat die inherente vermoë het om te leer, die lig gesien het. 'n Baanbrekende algoritme aan die voorpunt van die huidige masjienleer-momentum is die kunsmatige neurale netwerk. Kunsmatige neurale netwerke is berekeningsmodelle wat deur biologiese neurale netwerke geïnspireer is. Hierdie proses van neurologiese nabootsing stel kunsmatige neurale netwerke in staat om 'n vermoë te ontwikkel wat eie is aan hul muse - naamlik om uit ervaring te leer. 'n Eienskap wat hierdie algoritme van ander masjienleeralgoritmes onderskei, is die doeltreffendheid en effektiwiteit waarmee dit komplekse patrone en abstraksies binne data kan herken. Die proses waarvolgens hierdie algoritme patrone uit data herken, word leer genoem en is waarskynlik die mees interessante faset daarvan. Gewoonlik word die gradiënt-dalingsmetode (of die steilste-hellingmetode) gebruik om goeie netwerkparameterwaardes te vind. Die vlak van abstraksie waarby optimering sodoende kan plaasvind, is egter beperk. 'n Gradiënt-vrye benadering, daarenteen, bied 'n goeie alternatief. Meer spesifiek verskaf die navorsingsveld van metaheuristieke kragtige optimeringstegnieke wat in die konteks van kunsmatige neurale netwerkleer toepaslik is. 'n Metaheuristiese optimeringsbenadering maak voorsiening vir veel groter vryheid tydens kunsmatige neurale netwerk-leer - die netwerkgewigte, die netwerkstruktuur en die aktiveringsfunksies van die netwerk kan gelyktydig só geoptimeer word. Hierdie veelsydigheid van metaheuristieke, sowel as hul bewese vermoë in verskeie optimeringskontekste, dien as motivering vir hul kern-oorweging in hierdie proefskrif. 
'n Uitdaging vir alle optimeringsbenaderings het egter betrekking op die besluit oor watter metaheuristiek om vir hierdie doel in te span. Gelukkig bied die relatief nuwe en belowende studieveld van hiperheuristieke die nodige middele om hierdie uitdaging te oorkom - 'n hiperheuristiek is in wese 'n heuristiek wat heuristieke kies. Die hiperheuristiek wat in hierdie proefskrif oorweeg word, word die AMALGAM-metode genoem. AMALGAM is 'n kragtige en robuuste optimeringsbenadering wat beduidende prestasieverbeterings (met 'n faktor van tot tien) bied, terwyl die vlak van algemene toepaslikheid oor verskeie toetsprobleme verbeter. Hierdie hiperheuristiek is nog nie in die literatuur op die optimeringsprobleem van kunsmatige neurale netwerk-leer toegepas waarin netwerkgewigte, netwerkstruktuur en aktiveringsfunksies gelyktydig bepaal word nie. 'n AMALGAM-gebaseerde hiperheuristiese leeralgoritme word dus in hierdie proefskrif daargestel. Die oorspronklikheid van die probleem wat ondersoek word, vereis egter dat 'n nuwe wiskundige leermodel geformuleer word. Daarbenewens word nuwe veranderinge aan AMALGAM voorgestel sodat die algoritme vir neurale netwerk-leer ingespan kan word. 'n Tweedoelige hiperheuristiese leeralgoritme word ontwerp waarin die hoofdoel 'n netwerkprestasiemaatstaf verteenwoordig terwyl 'n sekondêre, sogenaamde hulpdoel daarop gemik is om die optimeringsoekproses te lei. 'n Versameling toetsprobleme, bestaande uit verskeie datastelle, word geskep om die doeltreffendheid van die voorgestelde leeralgoritme te evalueer. Drie omvattende parameterevaluerings word uitgevoer om sodoende insig te verkry in algoritmiese prestasie onder verskillende omstandighede. Daar word ook 'n diepgaande algoritmiese prestasievergelyking uitgevoer waartydens die prestasie wat deur die voorgestelde hiperheuristiese leeralgoritme bereik word, vergelyk word met dié van sy deelalgoritmes. 
Die robuustheid van die voorgestelde benadering word ook deur middel van 'n meta-veralgemeningsanalise gevalideer. 'n Vergelyking tussen die hiperheuristiese leeralgoritme en kragtige gradiëntgebaseerde leeralgoritmes word verder uitgevoer en aangevul deur 'n ondersoek na die moontlike konsolidering van die hiperheuristiese benadering met die beste gradiëntgebaseerde algoritme. 'n In-diepte ondersoek na die temporale dinamika van die hiperheuristiek se deelalgoritmes word geloots om insig in hierdie nuwe benadering tot kunsmatige neurale netwerk-leer te verkry en om algoritmiese prestasie te voorspel. 'n Demonstrasie van hoe die werking van die hiperheuristiek deur middel van 'n voorspellingsmodel verbeter kan word, word ook gelewer. Die strukturele kenmerke wat verband hou met gunstige netwerke wat deur die hiperheuristiek gegenereer word, word geanaliseer met die oog op nuwe insig in die werking van die hiperheuristiek. Doctoral 2021-01-22T07:37:46Z 2021-04-21T14:26:25Z 2021-01-22T07:37:46Z 2021-04-21T14:26:25Z 2021-03 Thesis http://hdl.handle.net/10019.1/109792 en_ZA Stellenbosch University 287 pages application/pdf Stellenbosch : Stellenbosch University
spellingShingle Multi-objective optimisation
UCTD
Artificial neural networks
Heuristic algorithms
Machine learning
Evolutionary computation
Nel, Gerrit Stephanus
A hyperheuristic approach towards the training of artificial neural networks
title A hyperheuristic approach towards the training of artificial neural networks
title_full A hyperheuristic approach towards the training of artificial neural networks
title_fullStr A hyperheuristic approach towards the training of artificial neural networks
title_full_unstemmed A hyperheuristic approach towards the training of artificial neural networks
title_short A hyperheuristic approach towards the training of artificial neural networks
title_sort hyperheuristic approach towards the training of artificial neural networks
topic Multi-objective optimisation
UCTD
Artificial neural networks
Heuristic algorithms
Machine learning
Evolutionary computation
url http://hdl.handle.net/10019.1/109792
work_keys_str_mv AT nelgerritstephanus ahyperheuristicapproachtowardsthetrainingofartificialneuralnetworks
AT nelgerritstephanus hyperheuristicapproachtowardsthetrainingofartificialneuralnetworks