Using transformers to assign ICD codes to medical notes

Thesis (MSc)--Stellenbosch University, 2023.

Bibliographic Details
Main Author: Dreyer, Andrei Michael
Other Authors: Van der Merwe, Brink
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2023
Subjects: Medical care; Medical records -- Data processing; Diseases -- Classification
_version_ 1867614115168518144
access_status_str Open Access
author Dreyer, Andrei Michael
author2 Van der Merwe, Brink
author_browse Dreyer, Andrei Michael
Van der Merwe, Brink
author_facet Van der Merwe, Brink
Dreyer, Andrei Michael
author_sort Dreyer, Andrei Michael
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MSc)--Stellenbosch University, 2023.
format Thesis
id oai:scholar.sun.ac.za:10019.1/127086
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:46:54.487Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2023
publishDateRange 2023
publishDateSort 2023
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/127086 Using transformers to assign ICD codes to medical notes Dreyer, Andrei Michael Van der Merwe, Brink Stellenbosch University. Faculty of Science. Dept. of Computer Science. Medical care Medical records -- Data processing Diseases -- Classification Thesis (MSc)--Stellenbosch University, 2023. ENGLISH ABSTRACT: International Classification of Disease (ICD) coding plays a significant role in classifying morbidity and mortality rates. Currently, ICD codes are assigned to a patient’s medical record by hand by medical practitioners or specialist clinical coders. This practice is prone to errors, and training skilled clinical coders requires time and human resources. Automatic prediction of ICD codes can help alleviate this burden. In this research, we look at transformer-based architectures for predicting ICD codes. Firstly, we expand the size of an XLNet model with label-wise attention to determine whether an increase in model size leads to a better performing model. We also look at using two transformer-based architectures that are specifically designed to handle long input sequences and compare the results from these architectures based on our best-performing XLNet model. Lastly, we look at the use of different attention mechanisms with our XLNet model to determine which attention mechanism works the best. We found the following three things: an increase in model size does lead to better results, XLNet performs better than the architectures designed for longer sequence lengths, and the label-wise attention used by our XLNet model performs better than the other attention mechanisms. AFRIKAANS OPSOMMING: Internasionale Klassifikasie van Siektes (ICD)-kodering speel ’n beduidende rol in die klassifikasie van morbiditeit en sterftesyfers. Tans word ICD-kodes tot ’n pasiënt se mediese rekord met die hand deur mediese praktisyns of spesialis kliniese kodeerders toegeken.
Hierdie praktyk is geneig tot foute, en opleiding van geskoolde kliniese kodeerders verg tyd en menslike hulpbronne. Outomatiese voorspelling van ICD-kodes kan help om hierdie las te verlig. In hierdie navorsing kyk ons na transformator-gebaseerde argitekture vir die voorspelling van ICD-kodes. Eerstens brei ons die grootte van ’n XLNet-model uit met etiketwyse aandag om te bepaal of ’n toename in modelgrootte lei tot ’n beter presterende model. Ons kyk ook na die gebruik van twee transformator-gebaseerde argitekture wat spesifiek ontwerp is om lang invoerreekse te hanteer en vergelyk die resultate van hierdie argitekture gebaseer op ons beste presterende XLNet-model. Laastens kyk ons na die gebruik van verskillende aandagmeganismes met ons XLNet-model om te bepaal watter aandagmeganisme die beste werk. Ons het die volgende drie dinge gevind: ’n toename in modelgrootte lei wel tot beter resultate, XLNet presteer beter as die argitekture wat ontwerp is vir langer reekslengtes, en die etiketgewyse aandag wat deur ons XLNet-model gebruik word, presteer beter as die ander aandagmeganismes. Masters 2023-03-04T05:58:51Z 2023-05-18T07:03:37Z 2023-03-04T05:58:51Z 2023-05-18T07:03:37Z 2023-03 Thesis http://hdl.handle.net/10019.1/127086 en_ZA en_ZA Stellenbosch University xi, 84 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle Medical care
Medical records -- Data processing
Diseases -- Classification
Dreyer, Andrei Michael
Using transformers to assign ICD codes to medical notes
title Using transformers to assign ICD codes to medical notes
title_full Using transformers to assign ICD codes to medical notes
title_fullStr Using transformers to assign ICD codes to medical notes
title_full_unstemmed Using transformers to assign ICD codes to medical notes
title_short Using transformers to assign ICD codes to medical notes
title_sort using transformers to assign icd codes to medical notes
topic Medical care
Medical records -- Data processing
Diseases -- Classification
url http://hdl.handle.net/10019.1/127086
work_keys_str_mv AT dreyerandreimichael usingtransformerstoassignicdcodestomedicalnotes