Full Text Available
Note: Clicking the button above will open the full text document at the original institutional repository in a new window.
Mouton, A. 2025. Multi-Fidelity Parameter Tuning of Parallel Monte-Carlo Tree Search Variants. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/e4cb5098-5f59-4c10-9f1d-aa69428ba144
| Main Author: | |
|---|---|
| Other Authors: | |
| Format: | Thesis |
| Language: | English |
| Published: |
Stellenbosch : Stellenbosch University
2025
|
| Subjects: | |
| Tags: |
No Tags, Be the first to tag this record!
|
| _version_ | 1867614107289518080 |
|---|---|
| access_status_str | Open Access |
| author | Mouton, Alexander |
| author2 | Inggs, C. P. |
| author_browse | Inggs, C. P. Mouton, Alexander |
| author_facet | Inggs, C. P. Mouton, Alexander |
| author_sort | Mouton, Alexander |
| collection | Thesis |
| dc_rights_str_mv | Stellenbosch University |
| description | Mouton, A. 2025. Multi-Fidelity Parameter Tuning of Parallel Monte-Carlo Tree Search Variants. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/e4cb5098-5f59-4c10-9f1d-aa69428ba144 |
| format | Thesis |
| id | oai:scholar.sun.ac.za:10019.1/132582 |
| institution | Stellenbosch University (South Africa) |
| language | English |
| last_indexed | 2026-06-10T12:46:46.943Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository |
| publishDate | 2025 |
| publishDateRange | 2025 |
| publishDateSort | 2025 |
| publisher | Stellenbosch : Stellenbosch University |
| publisherStr | Stellenbosch : Stellenbosch University |
| record_format | dspace |
| source_str | SUNScholar — Stellenbosch University Repository |
| spelling | oai:scholar.sun.ac.za:10019.1/132582 Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants Mouton, Alexander Inggs, C. P. Kroon, R. S. Stellenbosch University. Faculty of Science. Dept. of Computer Science. Monte Carlo method Machine learning Simulated annealing (Mathematics) Neural networks (Computer science) UCTD Mouton, A. 2025. Multi-Fidelity Parameter Tuning of Parallel Monte-Carlo Tree Search Variants. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/e4cb5098-5f59-4c10-9f1d-aa69428ba144 Thesis (MSc)--Stellenbosch University, 2025. ENGLISH ABSTRACT: In this work, we develop a hyperparameter optimisation algorithm that aims to effectively leverage observations from lower‐cost systems to optimise a computationally more expensive target system, where these less expensive systems are thought of as approximating the target system. We investigate the application of our cost‐aware multi‐fidelity Bayesian optimisation (MFBO algorithm for the hyperparameter optimisation of various parallel Monte‐Carlo tree search (MCTS agents. Hyperparameter tuning for AI systems is often computationally expensive, particularly when evaluations are time‐consuming and result in approximate (or noisy outcomes. This is no different for AI game agents, which provide an ideal testbed for multi‐fidelity optimisation research. Factors such as game board size, allocated time per move, and the number of games played for evaluation can be adjusted to create approximate evaluations of a target agent’s performance. When determining parameters to evaluate, our algorithm selects values for each of these factors by trading off computational cost against potential information gain about the target system. In contrast to this, existing literature optimises AI game agents with these factors kept constant — to our knowledge, no previous work has applied multi‐fidelity optimisation to AI game agents. Furthermore, the vast majority of existing multi‐fidelity optimisation research only considers querying from a limited number of lower‐cost systems and incorporates a single cost‐influencing factor when constructing these systems. We evaluated our proposed algorithm by optimising six parallel MCTS agent variants on the games of Go and Othello, with each agent incorporating up to two MCTS enhancements from the literature, including custom adaptations that address key shortcomings identified in our research. Additionally, we explored the potential of extending our algorithm to multi‐target optimisation, specifying four different numbers of threads allocated to the search agent. Another benefit of our approach is that we can obtain reasonable hyperparameters conditioned on other (non‐target) board sizes or numbers of threads, for example, without additional computation, which can be valuable in settings such as ours. Our results show the potential of multi‐fidelity approaches such as this, but highlight that future research is required. Significant challenges identified are effectively dealing with large cost disparities in the contexts available for evaluation, as well as cases where inexpensive contexts are not representative of the target system. The conclusions drawn from our results highlight promising avenues for future work with significant potential for enhancing our approach and outcomes. AFRIKAANSE OPSOMMING: In hierdiewerk ontwikkel ons ’n hiperparameter‐optimeringsalgoritmewat poog om waarnemings van laer‐koste stelsels effektiefte benut om ’n rekenkun‐ dig duurder teikenstelsel te optimeer, waar hierdie goedkoper stelsels beskou word as benaderings van die teikenstelsel. Ons ondersoek die toepassing van ons koste‐bewuste multi‐ϐideliteit Bayesiese optimeringsalgoritme vir die op‐ timering van hiperparameters van verskeie parallelle Monte‐Carlo boomsoek‐ togagente. Hiperparameter‐verstelling vir kunsmatige intelligensie (KI stelsels is dik‐ wels berekeningsintensief, veralwanneer evaluasies tydrowend is en benaderde of wisselende resultate lewer. Dieselfde geld vir KI‐spelagente, wat ’n ideale toetsomgewing bied vir die evaluasie van multi‐ϐideliteit optimeringstegnieke. Faktore soos spelbordgrootte,toegekende tyd per skuif en die aantal kerewat’n spel per evaluasie gespeel word, kan aangepas word om benaderings van ’n tei‐ kenagent se prestasie te skep. Wanneer parameters vir evaluasie bepaal word, kies ons algoritme waardes vir elk van hierdie faktore deur die rekenaarkoste en potensiële inligtingswins oor die teikenstelsel teenoor mekaar op te weeg. In teenstelling hiermee, optimeer ander werk in die literatuur KI‐spelagente terwyl hierdie faktore konstant gehou word — sover ons kennis strek, het geen vorige werk multi‐ϐideliteit optimering op KI‐spelagente toegepas nie. Verder beperk die oorgrote meerderheid van bestaande multi‐ϐideliteit optimerings‐navorsing hul omvang tot die evaluasie van ’n beperkte aantal laer‐koste stel‐ sels, waar hierdie stelsels slegs ’n enkele faktor wat die koste beïnvloed, insluit. Ons het ons voorgestelde algoritme geëvalueer deur ses parallelle Monte‐ Carlo boomsoektogvariante op Go en Othello te optimeer. Elke agent het tot twee Monte‐Carlo boomsoektogverbeterings uit die literatuur geïntegreer, in‐ sluitend bykomende aanpassings wat gemaak is om belangrike tekortkominge wat ons tydens ons navorsing geïdentiϐiseer het, aan te spreek. Boonop het ons die potensiaal om ons algoritme na multi‐teiken optimering uit te brei onder‐ soek. Hiervoor het ons vier verskillende liggewig prosesse as teikens aan elke soektogagent toegeken. Nog ’n voordeel van ons benadering is dat ons “ver‐ niet” redelike hiperparameters kan verkry vir ’n agent met ’n ander teikenkon‐ ϐigurasie, byvoorbeeld vir ’n gegewe bordgrootte of aantal liggewig prosesse. Hierdie eienskap kan waardevol wees in toepassings wat soortgelyk is aan die KI‐agente wat hier gebruik is. Ons resultate toon die potensiaal van multi‐ϐideliteitsbenaderings soos hier‐ die, maar beklemtoon dat toekomstige navorsing nodig is om beduidende uit‐ dagings aan te spreek. Hierdie uitdagins behels gevalle waar daar groot koste‐ verskille is in die kontekste beskikbaar vir evaluasie, asook wanneer goedkoop kontekste nie verteenwoordigend is van die teiken KI‐stelsel nie. Die gevolg‐ trekkings wat ons na analise van ons resultate gemaak het, beklemtoon belo‐ wende areas vir verdere navorsing met aansienlike potensiaal om op ons be‐ nadering en uitkomste te verbeter. Masters 2025-06-11T09:54:49Z 2025-06-11T09:54:49Z 2025-03 Thesis https://scholar.sun.ac.za/handle/10019.1/132582 en Stellenbosch University xvi, 139 pages : illustrations application/pdf Stellenbosch : Stellenbosch University |
| spellingShingle | Monte Carlo method Machine learning Simulated annealing (Mathematics) Neural networks (Computer science) UCTD Mouton, Alexander Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants |
| title | Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants |
| title_full | Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants |
| title_fullStr | Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants |
| title_full_unstemmed | Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants |
| title_short | Multi-fidelity parameter tuning of parallel Monte-Carlo tree search variants |
| title_sort | multi fidelity parameter tuning of parallel monte carlo tree search variants |
| topic | Monte Carlo method Machine learning Simulated annealing (Mathematics) Neural networks (Computer science) UCTD |
| url | https://scholar.sun.ac.za/handle/10019.1/132582 |
| work_keys_str_mv | AT moutonalexander multifidelityparametertuningofparallelmontecarlotreesearchvariants |