Thesis (MSc)--Stellenbosch University, 2026.
| Main Author: | Khanyi, Masana Hlengiwe Michelle |
|---|---|
| Other Authors: | Dunaiski, Marcel |
| Format: | Thesis |
| Language: | English |
| Published: | Stellenbosch : Stellenbosch University, 2026 |
| _version_ | 1867614037867495424 |
|---|---|
| access_status_str | Open Access |
| author | Khanyi, Masana Hlengiwe Michelle |
| author2 | Dunaiski, Marcel |
| author_browse | Dunaiski, Marcel Khanyi, Masana Hlengiwe Michelle |
| author_facet | Dunaiski, Marcel Khanyi, Masana Hlengiwe Michelle |
| author_sort | Khanyi, Masana Hlengiwe Michelle |
| collection | Thesis |
| dc_rights_str_mv | Stellenbosch University |
| description | Thesis (MSc)--Stellenbosch University, 2026. |
| format | Thesis |
| id | oai:scholar.sun.ac.za:10019.1/136187 |
| institution | Stellenbosch University (South Africa) |
| language | English |
| last_indexed | 2026-06-10T12:45:40.774Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository |
| publishDate | 2026 |
| publishDateRange | 2026 |
| publishDateSort | 2026 |
| publisher | Stellenbosch : Stellenbosch University |
| publisherStr | Stellenbosch : Stellenbosch University |
| record_format | dspace |
| source_str | SUNScholar — Stellenbosch University Repository |
| spelling | oai:scholar.sun.ac.za:10019.1/136187 Insights into the South African research landscape through mining theses and dissertations using transformer-based language models Khanyi, Masana Hlengiwe Michelle Dunaiski, Marcel Van Lill, Milandre Stellenbosch University. Faculty of Science. Dept. of Computer Science. Thesis (MSc)--Stellenbosch University, 2026. Khanyi, M. H. M. 2026. Insights into the South African research landscape through mining theses and dissertations using transformer-based language models. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/c1718131-6742-4e0f-9980-1b72fc7cd181 Postgraduate research plays a critical role in the development of national research capacity and advanced knowledge production. In South Africa, electronic theses and dissertations (ETDs) constitute a substantial yet underutilised body of scholarly output for analysing postgraduate training, knowledge production, and scholarly influence. Despite their significance, ETDs are rarely incorporated into large-scale scientometric analyses due to fragmented institutional repositories, limited standardisation, and weak integration with global bibliographic infrastructures. This study develops a methodology for harvesting, enriching, and analysing ETDs using a combination of metadata integration, full-text mining, and citation analysis. Institutional ETD metadata are integrated with OpenAlex, an open-access scholarly knowledge graph, using a stemming-assisted title matching approach and persistent identifier mapping. In addition, the study applies automated PDF mining techniques to extract reference lists and citation contexts directly from ETD full texts, enabling fine-grained analysis of citation behaviour beyond aggregate citation counts. The enriched dataset supports citation network construction, concept mapping, supervisor–student linkage, and longitudinal analysis of postgraduate research output. Empirical analyses focus on South Africa’s research-intensive universities and examine institutional productivity, temporal growth patterns, language use, retention dynamics, and citation characteristics of postgraduate research between 2000 and 2024. The results reveal a concentration of postgraduate research output among a small number of institutions, sustained growth prior to 2020, and a marked decline thereafter. This post-2020 downturn is likely influenced by economic pressures, funding constraints, and the disruptive effects of the COVID-19 pandemic, with implications for future doctoral production and national policy targets. By exploring ETD metadata integration, full-text citation mining, and open bibliographic enrichment, this study extends traditional publication-based scientometrics and demonstrates the value of ETDs as instruments for monitoring postgraduate research training capacity and informing evidence-based higher education policy in South Africa. Masters 2026-04-24T11:54:47Z 2026-04-24T11:54:47Z 2026-03 Thesis https://scholar.sun.ac.za/handle/10019.1/136187 en Stellenbosch University 110 pages : ill. application/pdf Stellenbosch : Stellenbosch University |
| spellingShingle | Khanyi, Masana Hlengiwe Michelle Insights into the South African research landscape through mining theses and dissertations using transformer-based language models |
| title | Insights into the South African research landscape through mining theses and dissertations using transformer-based language models |
| title_full | Insights into the South African research landscape through mining theses and dissertations using transformer-based language models |
| title_fullStr | Insights into the South African research landscape through mining theses and dissertations using transformer-based language models |
| title_full_unstemmed | Insights into the South African research landscape through mining theses and dissertations using transformer-based language models |
| title_short | Insights into the South African research landscape through mining theses and dissertations using transformer-based language models |
| title_sort | insights into the south african research landscape through mining theses and dissertations using transformer based language models |
| url | https://scholar.sun.ac.za/handle/10019.1/136187 |
| work_keys_str_mv | AT khanyimasanahlengiwemichelle insightsintothesouthafricanresearchlandscapethroughminingthesesanddissertationsusingtransformerbasedlanguagemodels |