Full Text Available

Access Repository

Note: Clicking the button above will open the full text document at the original institutional repository in a new window.

Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants

Mouton, A. 2025. Multi-Fidelity Parameter Tuning of Parallel Monte-Carlo Tree Search Variants. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/e4cb5098-5f59-4c10-9f1d-aa69428ba144

Saved in:

Bibliographic Details
Main Author:	Mouton, Alexander
Other Authors:	Inggs, C. P.
Format:	Thesis
Language:	English
Published:	Stellenbosch : Stellenbosch University 2025
Subjects:	Monte Carlo method Machine learning Simulated annealing (Mathematics) Neural networks (Computer science) UCTD
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1867614107289518080
access_status_str	Open Access
author	Mouton, Alexander
author2	Inggs, C. P.
author_browse	Inggs, C. P. Mouton, Alexander
author_facet	Inggs, C. P. Mouton, Alexander
author_sort	Mouton, Alexander
collection	Thesis
dc_rights_str_mv	Stellenbosch University
description	Mouton, A. 2025. Multi-Fidelity Parameter Tuning of Parallel Monte-Carlo Tree Search Variants. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/e4cb5098-5f59-4c10-9f1d-aa69428ba144
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/132582
institution	Stellenbosch University (South Africa)
language	English
last_indexed	2026-06-10T12:46:46.943Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2025
publishDateRange	2025
publishDateSort	2025
publisher	Stellenbosch : Stellenbosch University
publisherStr	Stellenbosch : Stellenbosch University
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/132582 Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants Mouton, Alexander Inggs, C. P. Kroon, R. S. Stellenbosch University. Faculty of Science. Dept. of Computer Science. Monte Carlo method Machine learning Simulated annealing (Mathematics) Neural networks (Computer science) UCTD Mouton, A. 2025. Multi-Fidelity Parameter Tuning of Parallel Monte-Carlo Tree Search Variants. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/e4cb5098-5f59-4c10-9f1d-aa69428ba144 Thesis (MSc)--Stellenbosch University, 2025. ENGLISH ABSTRACT: In this work, we develop a hyperparameter optimisation algorithm that aims to effectively leverage observations from lower‐cost systems to optimise a computationally more expensive target system, where these less expensive systems are thought of as approximating the target system. We investigate the application of our cost‐aware multi‐fidelity Bayesian optimisation (MFBO algorithm for the hyperparameter optimisation of various parallel Monte‐Carlo tree search (MCTS agents. Hyperparameter tuning for AI systems is often computationally expensive, particularly when evaluations are time‐consuming and result in approximate (or noisy outcomes. This is no different for AI game agents, which provide an ideal testbed for multi‐fidelity optimisation research. Factors such as game board size, allocated time per move, and the number of games played for evaluation can be adjusted to create approximate evaluations of a target agent’s performance. When determining parameters to evaluate, our algorithm selects values for each of these factors by trading off computational cost against potential information gain about the target system. In contrast to this, existing literature optimises AI game agents with these factors kept constant — to our knowledge, no previous work has applied multi‐fidelity optimisation to AI game agents. Furthermore, the vast majority of existing multi‐fidelity optimisation research only considers querying from a limited number of lower‐cost systems and incorporates a single cost‐influencing factor when constructing these systems. We evaluated our proposed algorithm by optimising six parallel MCTS agent variants on the games of Go and Othello, with each agent incorporating up to two MCTS enhancements from the literature, including custom adaptations that address key shortcomings identified in our research. Additionally, we explored the potential of extending our algorithm to multi‐target optimisation, specifying four different numbers of threads allocated to the search agent. Another benefit of our approach is that we can obtain reasonable hyperparameters conditioned on other (non‐target) board sizes or numbers of threads, for example, without additional computation, which can be valuable in settings such as ours. Our results show the potential of multi‐fidelity approaches such as this, but highlight that future research is required. Significant challenges identified are effectively dealing with large cost disparities in the contexts available for evaluation, as well as cases where inexpensive contexts are not representative of the target system. The conclusions drawn from our results highlight promising avenues for future work with significant potential for enhancing our approach and outcomes. AFRIKAANSE OPSOMMING: In hierdiewerk ontwikkel ons ’n hiperparameter‐optimeringsalgoritmewat poog om waarnemings van laer‐koste stelsels effektiefte benut om ’n rekenkun‐ dig duurder teikenstelsel te optimeer, waar hierdie goedkoper stelsels beskou word as benaderings van die teikenstelsel. Ons ondersoek die toepassing van ons koste‐bewuste multi‐ϐideliteit Bayesiese optimeringsalgoritme vir die op‐ timering van hiperparameters van verskeie parallelle Monte‐Carlo boomsoek‐ togagente. Hiperparameter‐verstelling vir kunsmatige intelligensie (KI stelsels is dik‐ wels berekeningsintensief, veralwanneer evaluasies tydrowend is en benaderde of wisselende resultate lewer. Dieselfde geld vir KI‐spelagente, wat ’n ideale toetsomgewing bied vir die evaluasie van multi‐ϐideliteit optimeringstegnieke. Faktore soos spelbordgrootte,toegekende tyd per skuif en die aantal kerewat’n spel per evaluasie gespeel word, kan aangepas word om benaderings van ’n tei‐ kenagent se prestasie te skep. Wanneer parameters vir evaluasie bepaal word, kies ons algoritme waardes vir elk van hierdie faktore deur die rekenaarkoste en potensiële inligtingswins oor die teikenstelsel teenoor mekaar op te weeg. In teenstelling hiermee, optimeer ander werk in die literatuur KI‐spelagente terwyl hierdie faktore konstant gehou word — sover ons kennis strek, het geen vorige werk multi‐ϐideliteit optimering op KI‐spelagente toegepas nie. Verder beperk die oorgrote meerderheid van bestaande multi‐ϐideliteit optimerings‐navorsing hul omvang tot die evaluasie van ’n beperkte aantal laer‐koste stel‐ sels, waar hierdie stelsels slegs ’n enkele faktor wat die koste beïnvloed, insluit. Ons het ons voorgestelde algoritme geëvalueer deur ses parallelle Monte‐ Carlo boomsoektogvariante op Go en Othello te optimeer. Elke agent het tot twee Monte‐Carlo boomsoektogverbeterings uit die literatuur geïntegreer, in‐ sluitend bykomende aanpassings wat gemaak is om belangrike tekortkominge wat ons tydens ons navorsing geïdentiϐiseer het, aan te spreek. Boonop het ons die potensiaal om ons algoritme na multi‐teiken optimering uit te brei onder‐ soek. Hiervoor het ons vier verskillende liggewig prosesse as teikens aan elke soektogagent toegeken. Nog ’n voordeel van ons benadering is dat ons “ver‐ niet” redelike hiperparameters kan verkry vir ’n agent met ’n ander teikenkon‐ ϐigurasie, byvoorbeeld vir ’n gegewe bordgrootte of aantal liggewig prosesse. Hierdie eienskap kan waardevol wees in toepassings wat soortgelyk is aan die KI‐agente wat hier gebruik is. Ons resultate toon die potensiaal van multi‐ϐideliteitsbenaderings soos hier‐ die, maar beklemtoon dat toekomstige navorsing nodig is om beduidende uit‐ dagings aan te spreek. Hierdie uitdagins behels gevalle waar daar groot koste‐ verskille is in die kontekste beskikbaar vir evaluasie, asook wanneer goedkoop kontekste nie verteenwoordigend is van die teiken KI‐stelsel nie. Die gevolg‐ trekkings wat ons na analise van ons resultate gemaak het, beklemtoon belo‐ wende areas vir verdere navorsing met aansienlike potensiaal om op ons be‐ nadering en uitkomste te verbeter. Masters 2025-06-11T09:54:49Z 2025-06-11T09:54:49Z 2025-03 Thesis https://scholar.sun.ac.za/handle/10019.1/132582 en Stellenbosch University xvi, 139 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle	Monte Carlo method Machine learning Simulated annealing (Mathematics) Neural networks (Computer science) UCTD Mouton, Alexander Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants
title	Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants
title_full	Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants
title_fullStr	Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants
title_full_unstemmed	Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants
title_short	Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants
title_sort	multi fidelity parameter tuning of parallel monte carlo tree search variants
topic	Monte Carlo method Machine learning Simulated annealing (Mathematics) Neural networks (Computer science) UCTD
url	https://scholar.sun.ac.za/handle/10019.1/132582
work_keys_str_mv	AT moutonalexander multifidelityparametertuningofparallelmontecarlotreesearchvariants

Full Text Available

Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants

Similar Items