Full Text Available

From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews

Dissertation (MSc (Computer Science))--University of Pretoria, 2024.

Bibliographic Details
Other Authors:	Marivate, Vukosi
Format:	Thesis
Language:	English
Published:	University of Pretoria 2025
Subjects:	UCTD Sustainable Development Goals (SDGs) Large language models Sentiment analysis Retrieval-augmented generation Prompt engineering Conversational fine-tuning Retrieval augmented generation assessment Auto-regressive LLM

_version_	1867613608319385600
access_status_str	Open Access
author2	Marivate, Vukosi
author_browse	Marivate, Vukosi
author_facet	Marivate, Vukosi
collection	Thesis
dc_rights_str_mv	© 2023 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
description	Dissertation (MSc (Computer Science))--University of Pretoria, 2024.
format	Thesis
id	oai:repository.up.ac.za:2263/101248
institution	University of Pretoria (South Africa)
language	English
last_indexed	2026-06-10T12:38:50.869Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from UPSpace — University of Pretoria Institutional Repository
publishDate	2025
publishDateRange	2025
publishDateSort	2025
publisher	University of Pretoria
publisherStr	University of Pretoria
record_format	dspace
source_str	UPSpace — University of Pretoria Institutional Repository
spelling	oai:repository.up.ac.za:2263/101248 From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews Marivate, Vukosi miehleketo.mathebula@tuks.co.za Modupe, Abiodun Mathebula, Miehleketo UCTD Sustainable Development Goals (SDGs) Large language models Sentiment analysis Retrieval-augmented generation Prompt engineering Conversational fine-tuning Retrieval augmented generation assessment Auto-regressive LLM Dissertation (MSc (Computer Science))--University of Pretoria, 2024. In contemporary society, social media enables rapid expression of public sentiment toward governmental policies and financial products. This immediacy and depth of sharing can serve as a virtual focus group for major financial decisions, offering a gold mine for understanding customer satisfaction and identifying new product features and services. Customer reviews are crucial for the profits and reputations of financial institutions. Sentiment analysis (SA) assesses customer feedback and media headlines to gauge sentiment but faces challenges with the brevity, abbreviations, and financial terminology in social media content. Earlier studies used human-annotated text to create LBMs for training MLAs in SA. However, these models lacked robustness and failed to capture the full range of natural language semantics. Our research used advanced natural language processing to address this gap, gathering customer reviews from Hellopeter and financial data from the top five JSE-listed financial institutions in South Africa. We employed OpenAI's ChatGPT as a zero-shot learning model to produce human-like annotations for sentiment tasks. The feature vector from ChatGPT was input into BERT, BiLSTM, and a SoftMax function to measure and categorize sentiment. Oversampling methods addressed data imbalance, and visualization techniques were applied to review text and polarity.
Our method performed as well as or better than recent cutting-edge methods, achieving an average score of 98.9%, an F1-measure of 97.7%, and an AUC of 91.90% with oversampling. Traditional LBMs, SVMs, and logistic regression achieved 86.68% accuracy and an AUC of 91.90%. The study demonstrates ChatGPT’s competence in annotating customer reviews with emotional tone or polarity, highlighting the benefits of integrating customer SA with financial analysis to prioritize customer preferences. To overcome LBMs' limitations and pre-defined sentiment lexicons, we developed LFEAR, which combines the retrieval-augmented generation (RAG) model with a conversational format for an ARFT. Fine-tuned on Hellopeter reviews, LFEAR demonstrated resilience and flexibility in analyzing sentiments across various domains. It achieved an average answer precision score of 98.45%, correctness of 93.85%, and context precision of 97.69% according to RAGAS metrics. The LFEAR model effectively conducted SA across multiple domains, demonstrating adaptability, proper sentiment annotation, and bias-free analysis. This approach is particularly beneficial for social media posts by financial sector stakeholders, including investors and institutions whose posts impact JSE-listed entities. Computer Science MSc (Computer Science) Unrestricted Faculty of Engineering, Built Environment and Information Technology None 2025-02-27T10:48:01Z 2025-02-27T10:48:01Z 2025-05 2024-11 Dissertation * A2025 http://hdl.handle.net/2263/101248 https://doi.org/10.25403/UPresearchdata.28504796 en © 2023 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. application/pdf University of Pretoria
spellingShingle	UCTD Sustainable Development Goals (SDGs) Large language models Sentiment analysis Retrieval-augmented generation Prompt engineering Conversational fine-tuning Retrieval augmented generation assessment Auto-regressive LLM From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews
title	From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews
title_full	From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews
title_fullStr	From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews
title_full_unstemmed	From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews
title_short	From text annotation to an auto-regressive language model for sentiment analysis in South African financial reviews
title_sort	from text annotation to an auto regressive language model for sentiment analysis in south african financial reviews
topic	UCTD Sustainable Development Goals (SDGs) Large language models Sentiment analysis Retrieval-augmented generation Prompt engineering Conversational fine-tuning Retrieval augmented generation assessment Auto-regressive LLM
url	http://hdl.handle.net/2263/101248 https://doi.org/10.25403/UPresearchdata.28504796
