de Greeff, S. 2025. A computer vision framework towards automated scene understanding & analysis. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/8523bc56-ade0-43cb-9a6c-6b55d015c216
| Main Author: | De Greeff, Sarah-lee |
|---|---|
| Other Authors: | Grobler, Jacomine |
| Format: | Thesis |
| Published: | Stellenbosch : Stellenbosch University, 2025 |
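| Subjects: | Computer vision -- Industrial applications; Pattern recognition systems; Tracking (Engineering); Artificial intelligence -- Data processing; UCTD |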
| _version_ | 1867613942981853184 |
|---|---|
| access_status_str | Open Access |
| author | De Greeff, Sarah-lee |
| author2 | Grobler, Jacomine |
| author_browse | De Greeff, Sarah-lee Grobler, Jacomine |
| author_facet | Grobler, Jacomine De Greeff, Sarah-lee |
| author_sort | De Greeff, Sarah-lee |
| collection | Thesis |
| dc_rights_str_mv | Stellenbosch University |
| description | de Greeff, S. 2025. A computer vision framework towards automated scene understanding & analysis. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/8523bc56-ade0-43cb-9a6c-6b55d015c216 |
| format | Thesis |
| id | oai:scholar.sun.ac.za:10019.1/132130 |
| institution | Stellenbosch University (South Africa) |
| last_indexed | 2026-06-10T12:44:09.875Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository |
| publishDate | 2025 |
| publishDateRange | 2025 |
| publishDateSort | 2025 |
| publisher | Stellenbosch : Stellenbosch University |
| publisherStr | Stellenbosch : Stellenbosch University |
| record_format | dspace |
| source_str | SUNScholar — Stellenbosch University Repository |
| spelling | oai:scholar.sun.ac.za:10019.1/132130 A computer vision framework towards automated scene understanding & analysis De Greeff, Sarah-lee Grobler, Jacomine Samuels, J. A. Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. Computer vision -- Industrial applications Pattern recognition systems Tracking (Engineering) Artificial intelligence -- Data processing UCTD de Greeff, S. 2025. A computer vision framework towards automated scene understanding & analysis. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/8523bc56-ade0-43cb-9a6c-6b55d015c216 Thesis (MEng)--Stellenbosch University, 2025. ENGLISH ABSTRACT: It is well known that recent advancements in artificial intelligence and the increased capability of computer hardware have significantly advanced the field of computer vision – a field of study which enables computers to “see” and extract meaningful information from visual inputs, similar to human perception. A prominent application area within computer vision is scene understanding. Various powerful approaches to scene understanding employ computer vision tasks to extract semantic information about scenes, allowing computers to understand relationships between objects and their environments. Such tasks include object detection, recognition, tracking, pose estimation, and contextual reasoning. Most computer vision algorithms are deep-learning-based approaches but differ significantly in architecture. The computer vision tasks investigated in this thesis utilise architectures consisting of backbone, neck, and head components, as well as alternative transformer architectures. Although computer vision applications are diverse, there remain fields that have not yet fully benefited from these developments. 
One such field is energy auditing – a process undertaken to evaluate and improve the energy management of buildings. In this thesis, a proof-of-concept framework is developed that extracts information about the appliances present in a given building scene or environment by employing object detection and object tracking. The objective of the proposed framework is to train various object detection models and recommend the best-performing model for further implementation, in conjunction with object tracking models, to analyse video footage of environments to be audited. The framework facilitates the processing of raw data, the training of object detection models on the supplied data, and the deployment of the trained model on unseen video footage. A structured literature review is conducted to investigate the pertinent literature on computer vision applications within the energy auditing domain. The fundamentals of deep learning, computer vision and energy auditing are also explored. The proposed framework is first applied to a subset of a widely accepted benchmark dataset to verify its correct functioning. Subsequently, to further assess the framework’s performance and applicability, it is applied to a novel case study dataset provided by an industry partner, containing images of appliances common in an educational institution. The framework facilitates hyperparameter tuning to determine the best parameters for each model being trained. The best-performing model, RTDeTR, is then utilised to detect and track appliances of interest, providing the number of appliances present. The information obtained by the models is essential for computing the environment’s energy consumption. Furthermore, the result is used to compute the total energy consumption of the space captured in six different videos. 
The results are validated by a subject matter expert, demonstrating the practical utility of the framework in real-world energy auditing applications. AFRIKAANS ABSTRACT: It is well known that recent progress in the field of artificial intelligence and the increased capacity of computer hardware have considerably advanced the field of computer vision – a field of study that enables computers to “see” and to extract meaningful information from visual inputs, similar to human perception. A prominent application area within the domain of computer vision is scene understanding. Various powerful approaches to scene understanding use computer vision tasks to extract semantic information about scenes, enabling computers to understand relationships between objects and their environments. Such computer vision tasks include object detection, recognition, tracking, pose estimation and contextual reasoning. Most computer vision algorithms are deep-learning-based approaches, but differ considerably in architecture. The computer vision tasks investigated in this thesis use architectures consisting of backbone, neck and head components, as well as alternative transformer architectures. Although computer vision applications are diverse, there are still fields that have not yet benefited fully from these developments. One such field is energy auditing – a process undertaken to evaluate and improve the energy management of buildings. In this thesis a proof-of-concept framework is developed that can extract information about appliances present in a given building scene or environment by using object detection and object tracking tasks. 
The aim of the proposed framework is to train various object detection models and to recommend the best-performing model for further implementation, together with object tracking models, to analyse video footage of environments that must be audited. The framework facilitates the processing of raw data, the training of object detection models on the supplied data, and the deployment of the trained model on unseen video footage. A structured literature review is carried out in this thesis to investigate the relevant literature on computer vision applications within the energy auditing domain. The fundamentals of deep learning, computer vision and energy auditing are also examined. The proposed framework is first applied to a subset of a generally accepted benchmark dataset to verify its correct functioning. Subsequently, to assess the framework’s performance and applicability further, it is applied to a new case study dataset provided by an industry partner, containing images of appliances commonly found in an educational institution. The framework facilitates hyperparameter tuning to determine the best parameters for each model being trained. The best-performing model is then used to detect and track appliances of interest, providing information on the number of appliances present. The information obtained by the models is essential for calculating the environment’s energy consumption. Furthermore, the result is used to calculate the total energy consumption of the space, with the results validated by a subject matter expert, demonstrating the practical utility of the framework in real-world energy auditing applications. 
Masters 2025-05-27T08:09:07Z 2025-05-27T08:09:07Z 2025-03 Thesis https://scholar.sun.ac.za/handle/10019.1/132130 Stellenbosch University xxii, 166 pages : illustrations application/pdf Stellenbosch : Stellenbosch University |
| spellingShingle | Computer vision -- Industrial applications Pattern recognition systems Tracking (Engineering) Artificial intelligence -- Data processing UCTD De Greeff, Sarah-lee A computer vision framework towards automated scene understanding & analysis |
| title | A computer vision framework towards automated scene understanding & analysis |
| title_full | A computer vision framework towards automated scene understanding & analysis |
| title_fullStr | A computer vision framework towards automated scene understanding & analysis |
| title_full_unstemmed | A computer vision framework towards automated scene understanding & analysis |
| title_short | A computer vision framework towards automated scene understanding & analysis |
| title_sort | computer vision framework towards automated scene understanding analysis |
| topic | Computer vision -- Industrial applications Pattern recognition systems Tracking (Engineering) Artificial intelligence -- Data processing UCTD |
| url | https://scholar.sun.ac.za/handle/10019.1/132130 |
| work_keys_str_mv | AT degreeffsarahlee acomputervisionframeworktowardsautomatedsceneunderstandinganalysis AT degreeffsarahlee computervisionframeworktowardsautomatedsceneunderstandinganalysis |