Thesis (PhD)--Stellenbosch University, 2025.
| Main Author: | Barrish, Daniel |
|---|---|
| Other Authors: | Van Vuuren, Jan |
| Format: | Thesis |
| Language: | English |
| Published: | Stellenbosch : Stellenbosch University, 2025 |
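| Subjects: | Anomaly detection (Computer security); Time-series analysis; Threshold (Perception); Algorithm; Data mining -- Statistical methods |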
| _version_ | 1867613913389989888 |
|---|---|
| access_status_str | Open Access |
| author | Barrish, Daniel |
| author2 | Van Vuuren, Jan |
| author_browse | Barrish, Daniel Van Vuuren, Jan |
| author_facet | Van Vuuren, Jan Barrish, Daniel |
| author_sort | Barrish, Daniel |
| collection | Thesis |
| dc_rights_str_mv | Stellenbosch University |
| description | Thesis (PhD)--Stellenbosch University, 2025. |
| format | Thesis |
| id | oai:scholar.sun.ac.za:10019.1/134506 |
| institution | Stellenbosch University (South Africa) |
| language | English |
| last_indexed | 2026-06-10T12:43:41.995Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository |
| publishDate | 2025 |
| publishDateRange | 2025 |
| publishDateSort | 2025 |
| publisher | Stellenbosch : Stellenbosch University |
| publisherStr | Stellenbosch : Stellenbosch University |
| record_format | dspace |
| source_str | SUNScholar — Stellenbosch University Repository |
| spelling | oai:scholar.sun.ac.za:10019.1/134506 On the theory and practice of anomaly detection in time series Barrish, Daniel Van Vuuren, Jan Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. Anomaly detection (Computer security) Time-series analysis Threshold (Perception) Algorithm Data mining -- Statistical methods Thesis (PhD)--Stellenbosch University, 2025. Barrish, D. 2025. On the Theory and Practice of Anomaly Detection in Time Series. Unpublished doctoral dissertation. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/6daf69f0-8733-47f4-a028-c5c6c0029bef ENGLISH ABSTRACT: The detection of anomalies in time series data is a critical task in numerous domains, including industrial predictive maintenance, healthcare, information technology, and finance. Progress in the field is, however, hindered by a persistent gap between theoretical research and practical application, often due to flawed benchmarking practices, misaligned evaluation metrics, and a lack of comprehensive, end-to-end frameworks. These challenges are addressed in this dissertation in which a systematic, multi-faceted investigation into the theory and practice of time series anomaly detection is documented. A rigorous critique of existing public benchmark datasets is first undertaken, which reveals significant deficiencies in respect of label accuracy, unrealistic anomaly densities, and anomaly detection triviality. In response, a new, principled archive, called the Univariate Dataset Archive for Time Series Anomaly Detection, is introduced which includes curated public datasets and two novel synthetic datasets generated from complex dynamical systems. This archive provides a more reliable foundation for evaluating and comparing anomaly detection algorithms. Conventional anomaly detection evaluation methodologies are also challenged and the so-called realistic F-score is proposed in response. 
This novel metric is designed to better reflect the practical requirements of anomaly detection systems by appropriately handling contiguous anomalous events and individual false alarms. Leveraging this improved evaluation framework, a comprehensive comparative study of anomaly scoring algorithms is further conducted during which the well-known local outlier factor algorithm is identified as a top performer. This algorithm is subsequently enhanced quite significantly by facilitating graphics processing unit acceleration and adopting an ensembling approach. The resulting improved algorithm is empirically shown to achieve state-of-the-art accuracy. The often-overlooked, yet crucial, stages of thresholding and postprocessing are also investigated systematically. Simple, layered techniques are shown to improve the utility of anomaly alerts significantly. An efficient hyperparameter optimisation strategy based on the well-known tree-structured Parzen estimator is proposed and validated in the contexts of both static and streaming data scenarios in order to automate the tuning of the numerous parameters across the anomaly detection pipeline. Finally, the aforementioned diverse research contributions are brought together in a novel anomaly detection framework called the Generic Anomaly Detection in Time Series framework. This framework is a modular, scalable, automated, online, and evolvable end-to-end pipeline designed to provide a robust and practical anomaly detection solution for real-world applications. By addressing foundational problems in benchmarking, evaluation, and optimisation, the framework establishes a rigorous and practical path forward for the field of time series anomaly detection. AFRIKAANSE OPSOMMING: Die opsporing van anomalieë in tydreeksdata is ’n kritieke taak in talle terreine, insluitend voorspellende instandhouding in die bedryf, gesondheidsorg, inligtingstegnologie en finansies.
Vordering in die veld word egter belemmer deur ’n volgehoue gaping tussen teoretiese navorsing en praktiese toepassing, dikwels as gevolg van gebrekkige maatstafpraktyke, foutief-belynde evalueringsmaatstawwe en ’n gebrek aan omvattende, end-tot-end raamwerke. Hierdie uitdagings word in dié proefskrif aangespreek, waarin ’n sistematiese, veelsydige ondersoek na die teorie en praktyk van tydreeksanomalie-opsporing gedokumenteer word. ’n Deeglike kritiek op bestaande openbare maatstafdatastelle word eers onderneem, waaruit beduidende tekortkominge ten opsigte van etiket-akkuraatheid, onrealistiese anomalie-digthede en anomalie-opsporingstrivialiteit aan die lig kom. In reaksie hierop word ’n nuwe, beginselvaste argief, genaamd die Een-veranderlike Datastel Argief vir Tydreeksanomalie-opsporing, bekendgestel, wat saamgestelde openbare datastelle insluit, sowel as twee nuwe sintetiese datastelle wat uit komplekse dinamiese stelsels gegenereer word. Hierdie argief bied ’n meer betroubare grondslag vir die evaluering en vergelyking van anomalie-opsporingsalgoritmes. Konvensionele evalueringsmetodologieë vir anomalie-opsporing word ook bevraagteken en die sogenaamde realistiese F-telling word in reaksie voorgestel. Hierdie nuwe maatstaf is ontwerp om die praktiese vereistes van anomalie-opsporingstelsels beter te weerspieël deur aaneenlopende anomalie-voorkomste en individuele vals alarms toepaslik te hanteer. Deur van hierdie verbeterde evalueringsraamwerk gebruik te maak, word ’n omvattende vergelykende studie van anomalie-opsporingsalgoritmes verder uitgevoer waartydens die bekende plaaslike uitskieter-faktoralgoritme as ’n toppresteerder geïdentifiseer word. Hierdie algoritme word vervolgens aansienlik verbeter deur grafiese verwerkingseenheid-versnelling te bewerkstellig en ’n ensemble-benadering te volg. Daar word empiries getoon dat die gevolglike verbeterde algoritme die mees gevorderde akkuraatheid behaal.
Die belangrike aspekte van drempelbepaling en naverwerking wat dikwels in die literatuur oor die hoof gesien word, word ook stelselmatig ondersoek. Daar word getoon dat eenvoudige, belaagde tegnieke daartoe in staat is om die nut van anomalie-waarskuwings aansienlik te verbeter. ’n Doeltreffende hiperparameter-optimeringstrategie gebaseer op die bekende Boom-gestruktureerde Parzen-beramer word voorgestel en in die konteks van beide statiese en stroomdata-scenario’s gevalideer om sodoende die afskatting van die talle parameters in die anomalie-opsporingspyplyn te outomatiseer. Laastens word die bogenoemde diverse navorsingsbydraes saamgesnaer deur ’n nuwe anomalie-opsporingsraamwerk daar te stel wat as die Generiese Anomalie-opsporing in Tydreekse raamwerk bekendstaan. Hierdie raamwerk is ’n modulêre, skaalbare, outomatiese, aanlyn en verderontwikkelbare end-tot-end pyplyn wat ontwerp is om ’n robuuste en praktiese anomalie-opsporingsoplossing vir werklike toepassings te bied. Deur fundamentele probleme in maatstafbepaling, evaluering en optimering aan te spreek, vestig die raamwerk ’n omvattende en praktiese pad vorentoe vir die veld van tydreeksanomalie-opsporing. Doctoral 2025-12-11T11:52:10Z 2025-12-11T11:52:10Z 2025-12 Thesis https://scholar.sun.ac.za/handle/10019.1/134506 en Stellenbosch University xxxii, 294 pages : illustrations application/pdf Stellenbosch : Stellenbosch University |
| spellingShingle | Anomaly detection (Computer security) Time-series analysis Threshold (Perception) Algorithm Data mining -- Statistical methods Barrish, Daniel On the theory and practice of anomaly detection in time series |
| title | On the theory and practice of anomaly detection in time series |
| title_full | On the theory and practice of anomaly detection in time series |
| title_fullStr | On the theory and practice of anomaly detection in time series |
| title_full_unstemmed | On the theory and practice of anomaly detection in time series |
| title_short | On the theory and practice of anomaly detection in time series |
| title_sort | on the theory and practice of anomaly detection in time series |
| topic | Anomaly detection (Computer security) Time-series analysis Threshold (Perception) Algorithm Data mining -- Statistical methods |
| url | https://scholar.sun.ac.za/handle/10019.1/134506 |
| work_keys_str_mv | AT barrishdaniel onthetheoryandpracticeofanomalydetectionintimeseries |