Autonomous racing on unseen tracks using reinforcement learning

Jefferies, D. W. P. 2025. Autonomous racing on unseen tracks using reinforcement learning. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/43e6d233-2024-46d0-9ae8-ba3029360954

Bibliographic Details
Main Author: Jefferies, Devin Wayne Patrick
Other Authors: Schoeman, J. C.
Format: Thesis
Published: Stellenbosch : Stellenbosch University 2025
_version_ 1867613849248595968
access_status_str Open Access
author Jefferies, Devin Wayne Patrick
author2 Schoeman, J. C.
author_browse Jefferies, Devin Wayne Patrick
Schoeman, J. C.
author_facet Schoeman, J. C.
Jefferies, Devin Wayne Patrick
author_sort Jefferies, Devin Wayne Patrick
collection Thesis
dc_rights_str_mv Stellenbosch University
description Jefferies, D. W. P. 2025. Autonomous racing on unseen tracks using reinforcement learning. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/43e6d233-2024-46d0-9ae8-ba3029360954
format Thesis
id oai:scholar.sun.ac.za:10019.1/132537
institution Stellenbosch University (South Africa)
last_indexed 2026-06-10T12:42:40.195Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2025
publishDateRange 2025
publishDateSort 2025
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/132537 Autonomous racing on unseen tracks using reinforcement learning Jefferies, Devin Wayne Patrick Schoeman, J. C. Evans, B. D. Stellenbosch University. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. Reinforcement learning Automobiles, Racing -- Automation Real-time data processing Robotics -- Control systems UCTD Jefferies, D. W. P. 2025. Autonomous racing on unseen tracks using reinforcement learning. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/43e6d233-2024-46d0-9ae8-ba3029360954 Thesis (MEng)--Stellenbosch University, 2025. ENGLISH ABSTRACT: The rise of autonomous systems in the vehicle industry has highlighted their potential to increase safety and convenience, transforming the way vehicles interact with their environment. The racing domain is used to further explore the capabilities of autonomous systems, as it serves as an ideal test bed to push these algorithms to their performance and safety limits. In the racing domain, autonomous racing algorithms are tasked with generating control commands (such as speed and steering angle) to navigate a vehicle around a track as safely and as quickly as possible. In order to achieve this goal, autonomous racing algorithms that employ classical control methods can be used. These methods rely on accurate track and vehicle models to race according to preplanned trajectories around racetracks. This allows for consistent and repeatable racing behaviour; however, it limits their use to known, static environments. In comparison, deep reinforcement learning algorithms can learn to race without the need for these preplanned trajectories. These algorithms learn from a trial-and-error process, making them more applicable to generalisation. 
This generalisation ability makes them an alternative to classical algorithms in unseen and changing environments that are more reminiscent of real-world conditions. However, RL algorithms do have limitations, as they tend to underperform in comparison to classic methods. In this thesis, we introduce an end-to-end racing framework with improved performance that is comparable to classic algorithms while increasing its ability to generalise to unseen tracks. Our method uses a centre-orientated twin delayed deep deterministic policy gradient (CO-TD3) agent to race on the standard F1TENTH platform. We input sensor measurements into a deep reinforcement learning network and teach an agent how to race by controlling a vehicle’s speed and steering angle. We illustrate the effects that an optimal agent state vector and reward function have on racing performance and generalisation ability. In addition, we present a random track generator that can be used for research and testing various algorithms in the F1TENTH simulator. To illustrate the performance of our CO-TD3 agent, we conduct experiments in simulation and the results are compared to a benchmark of current racing algorithms. Our algorithm demonstrates robustness and generalisation ability by racing on a real vehicle after being trained in a simulated environment. These results demonstrate that our CO-TD3 agents are capable of achieving performance comparable to classic control algorithms in simulation, while also generalising effectively to unseen tracks both in simulation and in the real world. AFRIKAANS ABSTRACT (English translation): The rise of autonomous systems in the vehicle industry has highlighted their potential to increase safety and convenience, changing the way vehicles function and interact with their environment. One of the exciting ways to further investigate the capabilities of autonomous systems is autonomous racing. 
Racing offers a unique and challenging environment that pushes the limits and boundaries of autonomous systems further. These autonomous racing algorithms are tasked with generating control commands (such as speed and steering angle) to navigate a vehicle around a track as safely and as quickly as possible. To achieve this, autonomous racing algorithms that employ classical control methods use accurate track and vehicle models to race along optimal trajectories around racetracks. This allows for consistent and repeatable racing behaviour; however, it limits their use to known, static environments. In comparison, deep reinforcement learning algorithms can learn to race without the need for these models. These algorithms learn from a trial-and-error process, which makes them better suited to generalisation. This generalisation ability makes them an alternative to classical algorithms in unseen and changing environments that more closely resemble real-world conditions. However, reinforcement learning algorithms have limitations, as they tend to underperform in comparison to classical methods. In this thesis, we introduce an end-to-end racing framework with improved performance that is comparable to classical algorithms, while increasing its ability to generalise to unseen tracks. Our method uses a centre-orientated twin delayed deep deterministic policy gradient (CO-TD3) agent to race on the standard F1TENTH platform. We feed sensor measurements into a deep reinforcement learning network and teach an agent how to race by controlling a vehicle’s speed and steering angle. We illustrate the effects that an optimal agent state vector and reward function have on racing performance and generalisation ability. In addition, we present a random track generator that can be used for research and for testing various algorithms in the F1TENTH simulator. 
To illustrate the performance of our CO-TD3 agent, it is evaluated in simulation and the results are compared to a benchmark of current racing algorithms. Our algorithm demonstrates robustness and generalisation ability by racing on a real vehicle after being trained in a simulated environment. This shows that our CO-TD3 agents can achieve performance comparable to classical control algorithms, while being able to generalise to unseen tracks in simulation and in the real world. Masters 2025-06-10T14:02:17Z 2025-06-10T14:02:17Z 2025-03 Thesis https://scholar.sun.ac.za/handle/10019.1/132537 Stellenbosch University xii, 112 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle Reinforcement learning
Automobiles, Racing -- Automation
Real-time data processing
Robotics -- Control systems
UCTD
Jefferies, Devin Wayne Patrick
Autonomous racing on unseen tracks using reinforcement learning
title Autonomous racing on unseen tracks using reinforcement learning
title_full Autonomous racing on unseen tracks using reinforcement learning
title_fullStr Autonomous racing on unseen tracks using reinforcement learning
title_full_unstemmed Autonomous racing on unseen tracks using reinforcement learning
title_short Autonomous racing on unseen tracks using reinforcement learning
title_sort autonomous racing on unseen tracks using reinforcement learning
topic Reinforcement learning
Automobiles, Racing -- Automation
Real-time data processing
Robotics -- Control systems
UCTD
url https://scholar.sun.ac.za/handle/10019.1/132537
work_keys_str_mv AT jefferiesdevinwaynepatrick autonomousracingonunseentracksusingreinforcementlearning