Machine learning approach for site classification

Brits, L. A. 2025. Machine Learning Approach for Site Classification. Unpublished doctoral dissertation. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/36808931-50fc-4d14-aa7b-fec9c853fc75

Bibliographic Details
Main Author: Brits, Laurence Armand
Other Authors: MacRobert, C.
Format: Thesis
Published: Stellenbosch : Stellenbosch University 2025
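Subjects: Machine learning; Geotechnical engineering; Soils -- Classification -- Data processing; Support vector machines; Decision trees; Natural language processing (Computer science); UCTD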
_version_ 1867613816008736768
access_status_str Open Access
author Brits, Laurence Armand
author2 MacRobert, C.
author_browse Brits, Laurence Armand
MacRobert, C.
author_facet MacRobert, C.
Brits, Laurence Armand
author_sort Brits, Laurence Armand
collection Thesis
dc_rights_str_mv Stellenbosch University
description Brits, L. A. 2025. Machine Learning Approach for Site Classification. Unpublished doctoral dissertation. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/36808931-50fc-4d14-aa7b-fec9c853fc75
format Thesis
id oai:scholar.sun.ac.za:10019.1/132088
institution Stellenbosch University (South Africa)
last_indexed 2026-06-10T12:42:07.859Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2025
publishDateRange 2025
publishDateSort 2025
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/132088
Machine learning approach for site classification
Brits, Laurence Armand
MacRobert, C.
Stellenbosch University. Faculty of Engineering. Dept. of Civil Engineering.
Machine learning
Geotechnical engineering
Soils -- Classification -- Data processing
Support vector machines
Decision trees
Natural language processing (Computer science)
UCTD
Brits, L. A. 2025. Machine Learning Approach for Site Classification. Unpublished doctoral dissertation. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/36808931-50fc-4d14-aa7b-fec9c853fc75
Thesis (MEng)--Stellenbosch University, 2025.
ENGLISH ABSTRACT: This thesis presents a machine learning approach to site classification according to the Geotechnical Site Investigations for Housing Developments (GSFH-2), based on MCCSTO (moisture condition, colour, consistency, structure, texture and origin) descriptions. Based on input from practising engineers and geologists, a flowchart was developed to assign the text descriptions of 416 individual soil layers to GSFH-2 classes. These soil layers were classified according to the expected soil movement by considering the descriptions of the moisture condition, colour, consistency, structure, texture and origin of the soil. Three machine learning models, namely Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), were then developed using the database. Term Frequency–Inverse Document Frequency (TF-IDF) was used as an embedding technique in combination with other Natural Language Processing (NLP) methods, namely lemmatisation, Porter stemming, and N-grams, to obtain the model that best predicts the classification of a given soil layer. To evaluate and analyse the results, feature importance, confusion matrices, and statistical metrics (precision, recall, F1 score, and accuracy) were used. The results revealed that the three models achieved an average accuracy of 70.3%, with the RF model, using only lower-casing as a preprocessing step, achieving the highest accuracy of 71% on the testing data. The RF model was evaluated against a validation dataset, and the results indicated that adding verified labelled data can increase the accuracy of the proposed RF model and reduce its overfitting to the training data.
AFRIKAANS SUMMARY (translated from the Afrikaans): This thesis presents a machine learning approach to site classification according to the Geotechnical Site Investigations for Housing Developments (GSFH-2), based on the MCCSTO descriptions. Based on input from practising engineers and geologists, a flowchart was developed to categorise 416 individual soil layers into GSFH-2 classes. These soil layers were classified according to the expected soil movement by considering the descriptions of the moisture condition, colour, density or stiffness, structure, texture and origin of the soil. Three machine learning models, namely Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), were developed using the database. Term Frequency–Inverse Document Frequency (TF-IDF) was used as an embedding technique in combination with other NLP methods, such as lemmatisation, Porter stemming and N-grams, to obtain the model with the best predictive ability for classifying a given soil layer. To evaluate and analyse the results, feature importance, confusion matrices, and statistical metrics (precision, recall, F1 score and accuracy) were used. The results showed that the three models achieved an average accuracy of 70.3%, with the RF model, using only a change of letter case as a preprocessing step, reaching the highest accuracy of 71%. The RF model was evaluated against a validation dataset, and the results indicated that adding verified labelled data can increase the accuracy of the proposed RF model and reduce its overfitting to the training data.
Masters
2025-05-23T06:34:38Z
2025-05-23T06:34:38Z
2025-03
Thesis
https://scholar.sun.ac.za/handle/10019.1/132088
Stellenbosch University
xi, 107 pages : illustrations
application/pdf
Stellenbosch : Stellenbosch University
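Illustrative note: the abstract describes TF-IDF weighting of soil descriptions with lower-casing and N-grams as preprocessing. The following is a minimal sketch of that idea in plain Python, not the thesis's implementation. It uses the textbook weight tf × log(N/df), which may differ from the smoothed variant a library would apply. The function names `ngrams` and `tfidf` and the three example descriptions are hypothetical, invented here for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max, joined with spaces."""
    out = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def tfidf(docs, n_max=2):
    """Compute textbook TF-IDF weights per document.

    Lower-casing and whitespace splitting are the only normalisation steps.
    """
    tokenised = [ngrams(d.lower().split(), n_max) for d in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for terms in tokenised for term in set(terms))
    weights = []
    for terms in tokenised:
        counts = Counter(terms)
        total = len(terms)
        if total == 0:
            weights.append({})
            continue
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical MCCSTO-style descriptions, invented for illustration only.
docs = [
    "Moist dark brown firm intact sandy clay residual",
    "Dry light brown loose pinholed silty sand hillwash",
    "Moist brown stiff fissured clay residual",
]
weights = tfidf(docs)
```

In this sketch a term that occurs in every description (here "brown") receives zero weight, while terms confined to one description score highest. Vectors of this kind would then be passed to a classifier such as a Random Forest.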
spellingShingle Machine learning
Geotechnical engineering
Soils -- Classification -- Data processing
Support vector machines
Decision trees
Natural language processing (Computer science)
UCTD
Brits, Laurence Armand
Machine learning approach for site classification
title Machine learning approach for site classification
title_full Machine learning approach for site classification
title_fullStr Machine learning approach for site classification
title_full_unstemmed Machine learning approach for site classification
title_short Machine learning approach for site classification
title_sort machine learning approach for site classification
topic Machine learning
Geotechnical engineering
Soils -- Classification -- Data processing
Support vector machines
Decision trees
Natural language processing (Computer science)
UCTD
url https://scholar.sun.ac.za/handle/10019.1/132088
work_keys_str_mv AT britslaurencearmand machinelearningapproachforsiteclassification