Evaluating transformers as memory systems in reinforcement learning

Memory is an important component of effective learning systems and is crucial in non-Markovian as well as partially observable environments. In recent years, Long Short-Term Memory (LSTM) networks have been the dominant mechanism for providing memory in reinforcement learning; however, the success o...

Bibliographic Details
Main Author: Makkink, Thomas
Other Authors: Shock, Jonathan
Format: Thesis
Language: English
Published: Department of Mathematics and Applied Mathematics 2022
Subjects:
_version_ 1867613421666566145
access_status_str Open Access
author Makkink, Thomas
author2 Shock, Jonathan
author_browse Makkink, Thomas
Shock, Jonathan
author_facet Shock, Jonathan
Makkink, Thomas
author_sort Makkink, Thomas
collection Thesis
description Memory is an important component of effective learning systems and is crucial in non-Markovian as well as partially observable environments. In recent years, Long Short-Term Memory (LSTM) networks have been the dominant mechanism for providing memory in reinforcement learning; however, the success of transformers in natural language processing tasks has highlighted a promising and viable alternative. Memory in reinforcement learning is particularly difficult as rewards are often sparse and distributed over many time steps. Early research into transformers as memory mechanisms for reinforcement learning indicated that the canonical model is not suitable, and that additional gated recurrent units and architectural modifications are necessary to stabilize these models. Several additional improvements to the canonical model have further extended its capabilities, such as increasing the attention span, dynamically selecting the number of per-symbol processing steps, and accelerating convergence. It remains unclear, however, whether combining these improvements could provide meaningful performance gains overall. This dissertation examines several extensions to the canonical Transformer as memory mechanisms in reinforcement learning and empirically studies their combination, which we term the Integrated Transformer. Our findings support prior work that suggests gating variants of the Transformer architecture may outperform LSTMs as memory networks in reinforcement learning. However, our results indicate that while gated variants of the Transformer architecture may be able to model dependencies over a longer temporal horizon, these models do not necessarily outperform LSTMs when tasked with retaining increasing quantities of information.
format Thesis
id oai:open.uct.ac.za:11427/35840
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:35:53.219Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2022
publishDateRange 2022
publishDateSort 2022
publisher Department of Mathematics and Applied Mathematics
publisherStr Department of Mathematics and Applied Mathematics
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/35840 Evaluating transformers as memory systems in reinforcement learning Makkink, Thomas Shock, Jonathan Pretorius, Arnu Mathematics and Applied Mathematics Memory is an important component of effective learning systems and is crucial in non-Markovian as well as partially observable environments. In recent years, Long Short-Term Memory (LSTM) networks have been the dominant mechanism for providing memory in reinforcement learning; however, the success of transformers in natural language processing tasks has highlighted a promising and viable alternative. Memory in reinforcement learning is particularly difficult as rewards are often sparse and distributed over many time steps. Early research into transformers as memory mechanisms for reinforcement learning indicated that the canonical model is not suitable, and that additional gated recurrent units and architectural modifications are necessary to stabilize these models. Several additional improvements to the canonical model have further extended its capabilities, such as increasing the attention span, dynamically selecting the number of per-symbol processing steps, and accelerating convergence. It remains unclear, however, whether combining these improvements could provide meaningful performance gains overall. This dissertation examines several extensions to the canonical Transformer as memory mechanisms in reinforcement learning and empirically studies their combination, which we term the Integrated Transformer. Our findings support prior work that suggests gating variants of the Transformer architecture may outperform LSTMs as memory networks in reinforcement learning. However, our results indicate that while gated variants of the Transformer architecture may be able to model dependencies over a longer temporal horizon, these models do not necessarily outperform LSTMs when tasked with retaining increasing quantities of information.
2022-02-23T15:40:59Z 2022-02-23T15:40:59Z 2021 2022-02-23T15:34:07Z Master Thesis Masters MSc http://hdl.handle.net/11427/35840 eng application/pdf Department of Mathematics and Applied Mathematics Faculty of Science
spellingShingle Mathematics and Applied Mathematics
Makkink, Thomas
Evaluating transformers as memory systems in reinforcement learning
thesis_degree_str Master's
title Evaluating transformers as memory systems in reinforcement learning
title_full Evaluating transformers as memory systems in reinforcement learning
title_fullStr Evaluating transformers as memory systems in reinforcement learning
title_full_unstemmed Evaluating transformers as memory systems in reinforcement learning
title_short Evaluating transformers as memory systems in reinforcement learning
title_sort evaluating transformers as memory systems in reinforcement learning
topic Mathematics and Applied Mathematics
url http://hdl.handle.net/11427/35840
work_keys_str_mv AT makkinkthomas evaluatingtransformersasmemorysystemsinreinforcementlearning