Enhancing Retrieval Augmented Generation Through Robust Information Retrieval

Thesis (MEng)--Stellenbosch University, 2026.

Bibliographic Details
Main Author: Singh, Nikhiel Rahul
Other Authors: Gwetu, Mandla
Format: Thesis
Language: English
Published: Stellenbosch : Stellenbosch University 2026
_version_ 1867613782209986560
access_status_str Open Access
author Singh, Nikhiel Rahul
author2 Gwetu, Mandla
author_browse Gwetu, Mandla
Singh, Nikhiel Rahul
author_facet Gwetu, Mandla
Singh, Nikhiel Rahul
author_sort Singh, Nikhiel Rahul
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MEng)--Stellenbosch University, 2026.
format Thesis
id oai:scholar.sun.ac.za:10019.1/135844
institution Stellenbosch University (South Africa)
language English
last_indexed 2026-06-10T12:41:36.774Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2026
publishDateRange 2026
publishDateSort 2026
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/135844 Enhancing Retrieval Augmented Generation Through Robust Information Retrieval Singh, Nikhiel Rahul Gwetu, Mandla Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. Thesis (MEng)--Stellenbosch University, 2026. Singh, N. R. 2026. Enhancing Retrieval Augmented Generation Through Robust Information Retrieval. Unpublished master's thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/2118f725-9746-4584-8c9c-4cffde5464c1 Retrieval-Augmented Generation (RAG) is a popular technique for grounding the responses of Large Language Models (LLMs). RAG works by extending Information Retrieval (IR) to incorporate LLMs that generate responses based on retrieved information and a query. RAG is commonly used within the field of Natural Language Processing (NLP), specifically Natural Language Generation (NLG). Existing literature tends to overemphasize elaborate retrieval strategies that involve multiple LLM calls, but fails to detail which algorithm and configurations constituted the naïve approach that these strategies were benchmarked against. We compare standard and elaborate retrieval methods and observe that both had similar performance, with standard methods being independent of Generative AI (GenAI). Despite LLMs displaying graduate-level problem-solving capabilities, there was much dissatisfaction regarding GenAI technologies, indicating a need for greater emphasis on the errors, limitations and applicability of proposed solutions. We attempt to address this dissatisfaction by offering design guidance and clarity to anyone attempting to incorporate GenAI capabilities into their workflows. To that end, we provide techniques that facilitate Exploratory Data Analysis (EDA) in a RAG context, which allowed us to notice that relevance does not necessarily correlate with higher similarity or lexical scores.
We detail the internal workings of exhaustive and partial semantic similarity retrieval algorithms and of lexical, rule-based retrieval algorithms, and provide formal representations for deterministic evaluation metrics and error analysis techniques. Our experiments show that lexical retrieval covers more of the limitations of semantic similarity retrieval in open-domain environments, indicating that lexical algorithms are still relevant and that the No Free Lunch (NFL) Theorem still applies in IR: semantic similarity and lexical approaches excel in closed-domain and open-domain environments, respectively. It stands to reason that existing RAG techniques and methodologies will improve with better retrieval insight, which is the focus of this thesis. Masters 2026-04-13T09:45:05Z 2026-04-13T09:45:05Z 2026-03 Thesis https://scholar.sun.ac.za/handle/10019.1/135844 en Stellenbosch University 122 pages application/pdf Stellenbosch : Stellenbosch University
spellingShingle Singh, Nikhiel Rahul
Enhancing Retrieval Augmented Generation Through Robust Information Retrieval
title Enhancing Retrieval Augmented Generation Through Robust Information Retrieval
title_full Enhancing Retrieval Augmented Generation Through Robust Information Retrieval
title_fullStr Enhancing Retrieval Augmented Generation Through Robust Information Retrieval
title_full_unstemmed Enhancing Retrieval Augmented Generation Through Robust Information Retrieval
title_short Enhancing Retrieval Augmented Generation Through Robust Information Retrieval
title_sort enhancing retrieval augmented generation through robust information retrieval
url https://scholar.sun.ac.za/handle/10019.1/135844
work_keys_str_mv AT singhnikhielrahul enhancingretrievalaugmentedgenerationthroughrobustinformationretrieval