Thesis (MSc)--Stellenbosch University, 2021.
| Main Author: | Strydom, Stefan |
|---|---|
| Other Authors: | Van der Merwe, Brink |
| Format: | Thesis |
| Language: | en_ZA |
| Published: | Stellenbosch : Stellenbosch University, 2021 |
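| Subjects: | Clinical auto-coding systems; Machine learning; Diagnosis related groups -- Automation; Medical codes -- Automatic control; UCTD |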
| _version_ | 1867613907469729792 |
|---|---|
| access_status_str | Open Access |
| author | Strydom, Stefan |
| author2 | Van der Merwe, Brink |
| author_browse | Strydom, Stefan; Van der Merwe, Brink |
| author_facet | Van der Merwe, Brink; Strydom, Stefan |
| author_sort | Strydom, Stefan |
| collection | Thesis |
| dc_rights_str_mv | Stellenbosch University |
| description | Thesis (MSc)--Stellenbosch University, 2021. |
| format | Thesis |
| id | oai:scholar.sun.ac.za:10019.1/123654 |
| institution | Stellenbosch University (South Africa) |
| language | en_ZA |
| last_indexed | 2026-06-10T12:43:36.390Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository |
| publishDate | 2021 |
| publishDateRange | 2021 |
| publishDateSort | 2021 |
| publisher | Stellenbosch : Stellenbosch University |
| publisherStr | Stellenbosch : Stellenbosch University |
| record_format | dspace |
| source_str | SUNScholar — Stellenbosch University Repository |
| spelling | oai:scholar.sun.ac.za:10019.1/123654 Automatic assignment of diagnosis codes to free-form text medical notes Strydom, Stefan Van der Merwe, Brink Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science. Clinical auto-coding systems Machine learning Diagnosis related groups -- Automation Medical codes -- Automatic control UCTD Thesis (MSc)--Stellenbosch University, 2021. ENGLISH ABSTRACT: Clinical coding is the process of describing and categorising healthcare episodes according to standardised ontologies. The coded data have important downstream applications, including population morbidity studies, health systems planning and reimbursement. Clinical codes are generally assigned by specialist human coders, based on information contained in free-form text clinical notes. This process is expensive, time-consuming and subject to human error, and it burdens scarce clinical human resources with administrative roles. An accurate automatic coding system can alleviate these problems. Clinical coding is a challenging task for machine learning systems. The source texts are often long, have a highly specialised vocabulary and contain non-standard clinician shorthand, and the code sets can contain tens of thousands of codes. We review previous work on clinical auto-coding systems and perform an empirical analysis of widely used and current state-of-the-art machine learning approaches to the problem. We propose a novel attention mechanism that takes the text description of clinical codes into account. We also construct a small pre-trained transformer model that achieves state-of-the-art performance on the MIMIC II and III ICD-9 auto-coding tasks. To the best of our knowledge, this is the first successful application of a pre-trained transformer model to this task. AFRIKAANS ABSTRACT (translated): Clinical coding is the process of describing and categorising healthcare episodes according to standardised ontologies. The coded data have important practical applications, including studies of the disease burden in the population, health system planning and fair remuneration of medical practitioners. Clinical codes are usually assigned by clinically trained persons, based on information contained in free-text clinical notes. This process is expensive, time-consuming and subject to human error, and it burdens scarce clinical human resources with administrative roles. An accurate automatic coding system can help to alleviate these problems. Clinical coding is a challenging task for machine learning systems. The clinical text is often long, has a specialised vocabulary and contains non-standard clinical shorthand, and the code sets can contain tens of thousands of codes. We examine previous work on clinical auto-coding systems and carry out an empirical analysis of the most common and best-in-class machine learning approaches to the problem. We propose a new attention mechanism that takes the text description of clinical codes into account during classification. We also construct a small pre-trained transformer model that surpasses current benchmarks on the MIMIC II and III ICD-9 auto-coding tasks. To the best of our knowledge, this is the first successful application of a pre-trained transformer model to this task. Masters 2021-09-05T13:16:11Z 2021-12-22T14:14:14Z 2021-09-05T13:16:11Z 2021-12-22T14:14:14Z 2021-12 Thesis http://hdl.handle.net/10019.1/123654 en_ZA Stellenbosch University xii, 102 pages application/pdf Stellenbosch : Stellenbosch University |
| spellingShingle | Clinical auto-coding systems Machine learning Diagnosis related groups -- Automation Medical codes -- Automatic control UCTD Strydom, Stefan Automatic assignment of diagnosis codes to free-form text medical notes |
| title | Automatic assignment of diagnosis codes to free-form text medical notes |
| title_full | Automatic assignment of diagnosis codes to free-form text medical notes |
| title_fullStr | Automatic assignment of diagnosis codes to free-form text medical notes |
| title_full_unstemmed | Automatic assignment of diagnosis codes to free-form text medical notes |
| title_short | Automatic assignment of diagnosis codes to free-form text medical notes |
| title_sort | automatic assignment of diagnosis codes to free form text medical notes |
| topic | Clinical auto-coding systems Machine learning Diagnosis related groups -- Automation Medical codes -- Automatic control UCTD |
| url | http://hdl.handle.net/10019.1/123654 |
| work_keys_str_mv | AT strydomstefan automaticassignmentofdiagnosiscodestofreeformtextmedicalnotes |