Full Text Available

Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing

Thesis (MEng)--Stellenbosch University, 2023.

Bibliographic Details
Main Author:	Murdoch, Andrew
Other Authors:	Schoeman, J-C
Format:	Thesis
Language:	en_ZA
Published:	Stellenbosch : Stellenbosch University, 2023
Subjects:	Partial end-to-end; reinforcement learning; robustness towards model-mismatch; autonomous racing; Reinforcement learning; Automated vehicles; Automobiles, Racing; Simulated annealing (Mathematics)

_version_	1867614001112809472
access_status_str	Open Access
author	Murdoch, Andrew
author2	Schoeman, J-C
author_browse	Murdoch, Andrew Schoeman, J-C
author_facet	Schoeman, J-C Murdoch, Andrew
author_sort	Murdoch, Andrew
collection	Thesis
dc_rights_str_mv	Stellenbosch University
description	Thesis (MEng)--Stellenbosch University, 2023.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/128928
institution	Stellenbosch University (South Africa)
language	en_ZA
last_indexed	2026-06-10T12:45:04.096Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2023
publishDateRange	2023
publishDateSort	2023
publisher	Stellenbosch : Stellenbosch University
publisherStr	Stellenbosch : Stellenbosch University
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/128928 Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing’ Murdoch, Andrew Schoeman, J-C Jordaan, Willem Stellenbosch University. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. Partial end-to-end; reinforcement learning; robustness towards model-mismatch; autonomous racing Reinforcement learning Automated vehicles Automobiles, Racing Simulated annealing (Mathematics) Thesis (MEng)--Stellenbosch University, 2023. ENGLISH ABSTRACT: The increasing popularity of self-driving cars has given rise to the emerging field of autonomous racing. In this domain, algorithms are tasked with processing sensor data to generate control commands (e.g., steering and throttle) that move a vehicle around a track safely and in the shortest possible time. This study addresses the significant issue of practical model-mismatch in learning-based solutions, particularly in reinforcement learning (RL), for autonomous racing. Model mismatch occurs when the vehicle dynamics model used for simulation does not accurately represent the real dynamics of the vehicle, leading to a decrease in algorithm performance. This is a common issue encountered when considering real-world deployments. To address this challenge, we propose a partial end-to-end algorithm which decouples the planning and control tasks. Within this framework, a reinforcement learning (RL) agent generates a trajectory comprising a path and velocity, which is subsequently tracked using a pure pursuit steering controller and a proportional velocity controller, respectively. In contrast, many learning-based algorithms utilise an end-to-end approach, whereby a deep neural network directly maps from sensor data to control commands. 
We extensively evaluate the partial end-to-end algorithm in a custom F1tenth simulation, under conditions where model-mismatches in vehicle mass, cornering stiffness coefficient, and road surface friction coefficient are present. In each of these scenarios, the performance of the partial end-to-end agents remained similar under both nominal and model-mismatch conditions, demonstrating an ability to reliably navigate complex tracks without crashing. Thus, by leveraging the robustness of a classical controller, our partial end-to-end driving algorithm exhibits better robustness towards model-mismatches than an end-to-end baseline algorithm. AFRIKAANSE OPSOMMING: Die toenemende gewildheid van selfbesturende motors het aanleiding gegee tot die opkomende veld van outonome wedrenne. In hierdie domein, het algoritmes die taak om sensordata te verwerk om beheeropdragte (bv., stuur en versneller) te genereer wat ’n voertuig veilig en in die kortste moontlike tyd om ’n baan beweeg. Hierdie studie spreek die beduidende kwessie van praktiese model-wanverhouding in leergebaseerde oplossings aan, veral in versterkingsleer (RL), vir outonome wedrenne. Model-wanpassing vind plaas wanneer die voertuigdinamika-model wat vir simulasie gebruik word nie die werklike dinamika van die voertuig akkuraat voorstel nie, wat lei tot ’n afname in algoritme-werkverrigting. Dit is ’n algemene probleem wat teegekom word wanneer werklike implementerings oorweeg word. Om hierdie uitdaging aan te spreek, stel ons ’n gedeeltelike- ‘end-to-end’-algoritme voor wat die beplanning- en beheertake ontkoppel. Binne hierdie raamwerk genereer ’n versterkingsleer (RL) agent ’n trajek wat ’n pad en snelheid bevat, wat vervolgens nagespoor word deur gebruik te maak van ’n suiwer agtervolgstuurbeheerder en ’n proporsionele snelheidsbeheerder, onderskeidelik. 
Daarteenoor gebruik baie leergebaseerde algoritmes ’n ‘end-to-end’-benadering, waardeur ’n diep neurale netwerk direk (DNN) vanaf sensordata karteer om opdragte te beheer. Ons evalueer die gedeeltelike- ‘end-to-end’-algoritme breedvoerig in ’n pasgemaakte ‘F1tenth’-simulasie, onder toestande waar model-wanverhoudings in voertuigmassa, draai styfheidskoeffisient en padoppervlakwrywingskoeffisient teenwoordig is. In elk van hierdie scenario’s het die werkverrigting van die gedeeltelike- ‘end-to-end’-agente dieselfde gebly onder beide nominale en model-wanpastoestande, wat ’n vermoe demonstreer om komplekse spore betroubaar te navigeer sonder om te verongeluk. Deur dus die robuustheid van ’n klassieke kontroleerder te benut, toon ons gedeeltelike- ‘end-to-end’- bestuursalgoritme beter robuustheid teenoor model-wanpassings as ’n ‘end-to-end’- basislynalgoritme. Masters 2023-11-20T08:58:04Z 2024-01-08T16:08:41Z 2023-11-20T08:58:04Z 2024-01-08T16:08:41Z 2023-12 Thesis https://scholar.sun.ac.za/handle/10019.1/128928 en_ZA en_ZA Stellenbosch University xii, 95 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle	Partial end-to-end; reinforcement learning; robustness towards model-mismatch; autonomous racing Reinforcement learning Automated vehicles Automobiles, Racing Simulated annealing (Mathematics) Murdoch, Andrew Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing
title	Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing
title_full	Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing
title_fullStr	Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing
title_full_unstemmed	Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing
title_short	Partial end-to-end reinforcement learning for robustness towards model-mismatch in autonomous racing
title_sort	partial end to end reinforcement learning for robustness towards model mismatch in autonomous racing
topic	Partial end-to-end; reinforcement learning; robustness towards model-mismatch; autonomous racing; Reinforcement learning; Automated vehicles; Automobiles, Racing; Simulated annealing (Mathematics)
url	https://scholar.sun.ac.za/handle/10019.1/128928
work_keys_str_mv	AT murdochandrew partialendtoendreinforcementlearningforrobustnesstowardsmodelmismatchinautonomousracing

Similar Items