Full Text Available

Exploration of reinforcement learning algorithms implemented in the Wrapslide environment

Thesis (MEng)--Stellenbosch University, 2023.

Bibliographic Details
Main Author:	Jordaan, Ruan
Other Authors:	Schmidt-Dumont, Thorsten
Format:	Thesis
Language:	en_ZA
Published:	Stellenbosch : Stellenbosch University 2023
Subjects:	Machine learning Reinforcement learning Puzzles Neural networks (Computer science) Computational grids (Computer systems)

_version_	1867614013082304512
access_status_str	Open Access
author	Jordaan, Ruan
author2	Schmidt-Dumont, Thorsten
author_browse	Jordaan, Ruan Schmidt-Dumont, Thorsten
author_facet	Schmidt-Dumont, Thorsten Jordaan, Ruan
author_sort	Jordaan, Ruan
collection	Thesis
dc_rights_str_mv	Stellenbosch University
description	Thesis (MEng)--Stellenbosch University, 2023.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/127173
institution	Stellenbosch University (South Africa)
language	en_ZA en_ZA
last_indexed	2026-06-10T12:45:16.097Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2023
publishDateRange	2023
publishDateSort	2023
publisher	Stellenbosch : Stellenbosch University
publisherStr	Stellenbosch : Stellenbosch University
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/127173 Exploration of reinforcement learning algorithms implemented in the Wrapslide environment Jordaan, Ruan Schmidt-Dumont, Thorsten Nel, Gerhardus Stephanus Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. Machine learning Reinforcement learning Puzzles Neural networks (Computer science) Computational grids (Computer systems) Thesis (MEng)--Stellenbosch University, 2023. ENGLISH ABSTRACT: Machine learning has seen unprecedented popularity in recent years with new techniques and algorithms being developed by institutions worldwide. As such, it can often be a notable challenge to implement a machine learning approach to a specific scenario. The aim of this project is to elucidate the process of solving problems using machine learning through the implementation of various reinforcement learning agents to solve the Wrapslide puzzle. In this project, a literature review is conducted pertaining to a variety of facets of machine learning. This review includes both broad topics such as general reinforcement and supervised machine learning algorithms, as well as more nuanced aspects such as activation functions in neural networks and optimisation methods. Following this review, a literature review pertaining to the optimisation of the hyper-parameters of machine learning algorithms was conducted. The scope is then narrowed to the implementation of reinforcement learning algorithms, specifically Q-learning. Various Q-learning implementations, from basic tabular to complex deep Q-learning with convolutional layers, recurrent layers and attention, are evaluated and compared in the WrapSlide environment. The results of these evaluations are then statistically analysed and discussed in accordance with an appropriate statistical testing procedure formulated from the literature. The WrapSlide environment simulates the toroidal puzzle game, WrapSlide, in Python. 
This game functions similarly to a Rubik’s cube in the sense that it comprises a grid of coloured squares which must be sorted to reach a solved state. WrapSlide differs from a Rubik’s cube in that it is represented as a two-dimensional grid where the coloured blocks of the grid must be sorted into quadrants. The puzzle grid can vary in both size and number of different colours. In this project, grid sizes of 4 × 4, 6 × 6, and 8 × 8 with two, three and four colours are used to test the different implementations’ ability to deal with problems of varying complexity. The project concluded from numerical results obtained through the statistical testing of the performance of the implemented reinforcement learning algorithms that there was not a single model capable of solving each incarnation of the Wrapslide puzzle optimally. It was concluded that the long short-term memory-based deep Q-learning algorithms generally performed better than the other implemented models, only being outperformed by the convolutional deep Q-learning algorithm for the relatively complex 8 × 8 two-colour puzzle. None of the puzzle-solving agents were capable of solving the 6 × 6 three- or four-colour puzzles or the 8 × 8 three- or four-colour puzzles. AFRIKAANS OPSOMMING: Masjienleer het die afgelope paar jaar ongekende gewildheid gesien met nuwe tegnieke en algoritmes wat wêreldwyd deur instellings ontwikkel is. As sodanig kan dit dikwels ’n noemenswaardige uitdaging wees om ’n masjienleerbenadering vir ’n spesifieke scenario te implementeer. Die doel van hierdie projek is om die proses om probleme op te los met behulp van masjienleer toe te lig deur die implementering van verskeie versterkingsleermiddels om die Wrapslide-raaisel op te los. In hierdie projek word ’n literatuuroorsig gedoen wat betrekking het op ’n verskeidenheid fasette van masjienleer. 
Hierdie oorsig sluit beide breë onderwerpe in soos algemene versterking en masjienleeralgoritmes onder toesig, sowel as meer genuanseerde aspekte soos aktiveringsfunksies in neurale netwerke en optimaliseringsmetodes. Na aanleiding van hierdie oorsig is ’n literatuuroorsig met betrekking tot die optimalisering van die hiperparameters van masjienleeralgoritmes uitgevoer. Die omvang word dan vernou tot die implementering van versterkingsleeralgoritmes, spesifiek Q-leer. Verskeie Q-leer-implementerings van basiese tabelleer tot komplekse diep Q-leer met konvolusionele, herhalende lae en aandag word in die WrapSlide-omgewing geëvalueer en vergelyk. Die resultate van hierdie evaluasies word dan statisties ontleed en bespreek in ooreenstemming met ’n toepaslike statistiese toetsprosedure wat uit die literatuur geformuleer is. Die WrapSlide-omgewing simuleer die toroïdale legkaartspeletjie, WrapSlide, in Python. Hierdie speletjie funksioneer soortgelyk aan ’n Rubik se kubus in die sin dat dit ’n rooster van gekleurde blokkies bevat wat gesorteer moet word om ’n opgeloste toestand te bereik. WrapSlide verskil van ’n Rubik se kubus deurdat dit voorgestel word as ’n tweedimensionele rooster waar die gekleurde blokke van die rooster in kwadrante gesorteer moet word. Die legkaartrooster kan in beide grootte en aantal verskillende kleure verskil. In hierdie projek word roostergroottes van 4 × 4, 6 × 6, en 8 × 8 met twee, drie en vier kleure gebruik om die verskillende implementerings se vermoë om probleme van wisselende kompleksiteit te hanteer, te toets. Die projek het uit numeriese resultate verkry deur die statistiese toetsing van die werkverrigting van die geïmplementeerde versterkingsleeralgoritmes tot die gevolgtrekking gekom dat daar nie ’n enkele model was wat in staat was om elke inkarnasie van die Wrapslide-raaisel optimaal op te los nie. 
Daar is tot die gevolgtrekking gekom dat die lang-korttermyngeheue-gebaseerde diep Q-leeralgoritmes oor die algemeen beter presteer het as die ander geïmplementeerde model, net beter gevaar het deur die konvolusionele diep Q-learning-algoritme vir die relatief komplekse 8 × 8 tweekleur-legkaart. Nie een van die raaisels oplossende agente was in staat om die 6 × 6 drie- of vierkleurraaisels of die 8 × 8 drie- of vierkleurraaisels op te los nie. Masters 2023-02-13T15:20:55Z 2023-05-18T07:08:05Z 2023-02-13T15:20:55Z 2023-05-18T07:08:05Z 2023-03-01 Thesis http://hdl.handle.net/10019.1/127173 en_ZA en_ZA Stellenbosch University xxvi, 131 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle	Machine learning Reinforcement learning Puzzles Neural networks (Computer science) Computational grids (Computer systems) Jordaan, Ruan Exploration of reinforcement learning algorithms implemented in the Wrapslide environment
title	Exploration of reinforcement learning algorithms implemented in the Wrapslide environment
title_full	Exploration of reinforcement learning algorithms implemented in the Wrapslide environment
title_fullStr	Exploration of reinforcement learning algorithms implemented in the Wrapslide environment
title_full_unstemmed	Exploration of reinforcement learning algorithms implemented in the Wrapslide environment
title_short	Exploration of reinforcement learning algorithms implemented in the Wrapslide environment
title_sort	exploration of reinforcement learning algorithms implemented in the wrapslide environment
topic	Machine learning Reinforcement learning Puzzles Neural networks (Computer science) Computational grids (Computer systems)
url	http://hdl.handle.net/10019.1/127173
work_keys_str_mv	AT jordaanruan explorationofreinforcementlearningalgorithmsimplementedinthewrapslideenvironment

Similar Items