Large language models and software testing

Thesis (MSc)--Stellenbosch University, 2024.

Bibliographic Details
Main Author: Dewey, Marco
Other Authors: Inggs, Cornelia P.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2024
Subjects:
_version_ 1867613867435098112
access_status_str Open Access
author Dewey, Marco
author2 Inggs, Cornelia P.
author_browse Dewey, Marco
Inggs, Cornelia P.
author_facet Inggs, Cornelia P.
Dewey, Marco
author_sort Dewey, Marco
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MSc)--Stellenbosch University, 2024.
format Thesis
id oai:scholar.sun.ac.za:10019.1/130167
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:42:57.574Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2024
publishDateRange 2024
publishDateSort 2024
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/130167 Large language models and software testing Dewey, Marco Inggs, Cornelia P. Visser, Willem Stellenbosch University. Faculty of Science. Dept. of Computer Science. Large language model -- Testing Computational linguistics -- Evaluation Language and languages -- Data processing Natural language processing (Computer science) -- Testing Computer software -- Testing UCTD Thesis (MSc)--Stellenbosch University, 2024. ENGLISH ABSTRACT: This thesis examines the viability of leveraging transformer-based large language models, exemplified by Codex, for the automated generation of test suites in production software. By leveraging the abilities large language models exhibit for understanding and generating natural and coding languages, these models can analyze code and comments to generate contextually relevant test cases. Using these models in the domain of automatic software testing presents a potential solution to the oracle problem. The research involves a comparative analysis between Codex and a prominent automatic testing tool, EvoSuite, using the Commons-Lang library from the Defects4J benchmark. This comparison draws insights regarding Codex’s efficacy in generating coverage tests and identifying faulty behavior within production code. The findings of this thesis argue that Codex, while demonstrating promise, exhibits limitations as an automatic testing tool in achieving high test coverage and uncovering software bugs. Moreover, the study highlights potential challenges associated with utilizing open-source repositories for training and testing code generation by large language models, including the risk of incorporating inconsistent coding conventions and suboptimal software testing practices into these models.
AFRIKAANSE OPSOMMING: Hierdie tesis ondersoek hoe prakties dit is om transformeerder-gebaseerde groot taalmodelle, soos byvoorbeeld Codex, vir die outomatiese generering van toetsgevalle vir produksiesagteware te gebruik. Deur gebruik te maak van die vermoëns van groot taalmodelle om natuurlike tale en programmeringstale te verstaan en te genereer, kan hierdie modelle kode en kommentaar analiseer om kontekstueel-relevante toetsgevalle te genereer. Die gebruik van hierdie modelle op die gebied van outomatiese sagtewaretoetsing bied ’n potensiële oplossing vir die orakelprobleem. Die navorsing behels ’n vergelykende analise tussen Codex en ’n prominente outomatiese toetsingshulpmiddel, EvoSuite, deur gebruik te maak van die Commons-Lang biblioteek wat deel is van die Defects4J maatstaf. Hierdie vergelyking bied insigte oor die doeltreffendheid van Codex om dekkingstoetse te genereer en foutiewe gedrag binne produksiekode te identifiseer. Die bevindinge van hierdie tesis beweer dat Codex, alhoewel dit belowend lyk, beperkings toon as ’n outomatiese toetsingshulpmiddel om hoë toetsdekking te bereik en sagtewarefoute bloot te lê. Verder beklemtoon die studie potensiële uitdagings wat gepaard gaan met die gebruik van oopbronbewaarplekke vir die opleiding en toetsing van groot taalmodelle om kode te genereer, insluitend die risiko om onkonsekwente koderingskonvensies en suboptimale sagtewaretoetspraktyke in hierdie modelle in te sluit. Masters 2024-03-04T17:21:04Z 2024-04-26T07:44:57Z 2024-03-04T17:21:04Z 2024-04-26T07:44:57Z 2024-03 Thesis https://scholar.sun.ac.za/handle/10019.1/130167 en_ZA en_ZA Stellenbosch University viii, 90 pages application/pdf Stellenbosch : Stellenbosch University
spellingShingle Large language model -- Testing
Computational linguistics -- Evaluation
Language and languages -- Data processing
Natural language processing (Computer science) -- Testing
Computer software -- Testing
UCTD
Dewey, Marco
Large language models and software testing
title Large language models and software testing
title_full Large language models and software testing
title_fullStr Large language models and software testing
title_full_unstemmed Large language models and software testing
title_short Large language models and software testing
title_sort large language models and software testing
topic Large language model -- Testing
Computational linguistics -- Evaluation
Language and languages -- Data processing
Natural language processing (Computer science) -- Testing
Computer software -- Testing
UCTD
url https://scholar.sun.ac.za/handle/10019.1/130167
work_keys_str_mv AT deweymarco largelanguagemodelsandsoftwaretesting