The primary objective of this thesis is to develop rigorous Bayesian tools for common statistical challenges arising in modern science where there is a heightened demand for precise inference in the presence of large, known uncertainties. This thesis explores in detail two arenas where this manifest...
| Main Author: | Roberts, Ethan |
|---|---|
| Other Authors: | Bassett, Bruce |
| Format: | Thesis |
| Language: | English |
| Published: | Department of Mathematics and Applied Mathematics, 2022 |
| Subjects: | Applied Mathematics |
| _version_ | 1867613324392267776 |
|---|---|
| access_status_str | Open Access |
| author | Roberts, Ethan |
| author2 | Bassett, Bruce |
| author_browse | Bassett, Bruce Roberts, Ethan |
| author_facet | Bassett, Bruce Roberts, Ethan |
| author_sort | Roberts, Ethan |
| collection | Thesis |
| description | The primary objective of this thesis is to develop rigorous Bayesian tools for common statistical challenges arising in modern science where there is a heightened demand for precise inference in the presence of large, known uncertainties. This thesis explores in detail two arenas where this manifests. The first is the development and testing of a unified Bayesian anomaly detection and classification framework (BADAC) which allows principled anomaly detection in the presence of measurement uncertainties, which are rarely incorporated into machine learning algorithms. BADAC deals with uncertainties by marginalising over the unknown, true value of the data. Using simulated data with Gaussian noise as an example, BADAC is shown to be superior to standard algorithms in both classification and anomaly detection performance in the presence of uncertainties. Additionally, BADAC provides well-calibrated classification probabilities, valuable for use in scientific pipelines. BADAC is therefore ideal where computational cost is not a limiting factor and statistical rigour is important. We discuss approximations to speed up BADAC, such as the use of Gaussian processes, and finally introduce a new metric, the Rank-Weighted Score (RWS), that is particularly suited to evaluating an algorithm's ability to detect anomalies. The second major exploration in this thesis presents methods for rigorous statistical inference in the presence of classification uncertainties and errors. Although this is explored specifically through supernova cosmology, the context is general. Supernova cosmology without spectra will be an important component of future surveys due to massive increases in data volumes in next-generation surveys such as those from the Vera C. Rubin Observatory. This lack of supernova spectra results in uncertainty in both the redshift and the type of each supernova, which, if ignored, leads to significantly biased estimates of cosmological parameters. We present a hierarchical Bayesian formalism, zBEAMS, which addresses this problem by marginalising over the unknown or uncertain supernova redshifts and types to produce unbiased cosmological estimates that are competitive with those from supernova data with fully spectroscopically confirmed redshifts. zBEAMS thus provides a unified treatment of photometric redshifts, classification uncertainty and host galaxy misidentification, effectively correcting the inevitable contamination in the Hubble diagram with little or no loss of statistical power. |
| format | Thesis |
| id | oai:open.uct.ac.za:11427/36053 |
| institution | University of Cape Town (South Africa) |
| language | eng |
| last_indexed | 2026-06-10T12:34:20.437Z |
| license_str | Not specified — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository |
| publishDate | 2022 |
| publishDateRange | 2022 |
| publishDateSort | 2022 |
| publisher | Department of Mathematics and Applied Mathematics |
| publisherStr | Department of Mathematics and Applied Mathematics |
| record_format | dspace |
| source_str | UCTD — University of Cape Town Open Access Repository |
| spelling | oai:open.uct.ac.za:11427/36053 Aspects of Bayesian inference, classification and anomaly detection Roberts, Ethan Bassett, Bruce Applied Mathematics 2022-03-11T10:43:27Z 2022-03-11T10:43:27Z 2021 2022-03-11T10:42:52Z Doctoral Thesis Doctoral PhD http://hdl.handle.net/11427/36053 eng application/pdf Department of Mathematics and Applied Mathematics Faculty of Science |
| spellingShingle | Applied Mathematics Roberts, Ethan Aspects of Bayesian inference, classification and anomaly detection |
| thesis_degree_str | Doctoral |
| title | Aspects of Bayesian inference, classification and anomaly detection |
| title_sort | aspects of bayesian inference classification and anomaly detection |
| topic | Applied Mathematics |
| url | http://hdl.handle.net/11427/36053 |
| work_keys_str_mv | AT robertsethan aspectsofbayesianinferenceclassificationandanomalydetection |
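
The description in this record states that BADAC handles measurement uncertainty by marginalising over the unknown, true value of the data. The Python sketch below illustrates only that general idea and is not the thesis's implementation. It assumes a one-dimensional toy problem in which each class has a Gaussian distribution of true values and each measurement has known Gaussian noise. All function names, class labels and parameter values are hypothetical. Under these assumptions the marginal likelihood has a closed form (a Gaussian whose variance is the sum of the class variance and the noise variance), which the sketch checks against a direct numerical integration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def marginal_likelihood(d, sigma, mu_c, tau_c):
    """p(d | class) after integrating out the true value x analytically.

    Assumed toy model: x ~ N(mu_c, tau_c^2) and d ~ N(x, sigma^2),
    so that d ~ N(mu_c, tau_c^2 + sigma^2).
    """
    return gaussian_pdf(d, mu_c, tau_c ** 2 + sigma ** 2)


def marginal_likelihood_numeric(d, sigma, mu_c, tau_c, n=20001, width=10.0):
    """Same quantity, computed with the trapezoidal rule over the true value x."""
    lo = mu_c - width * tau_c
    hi = mu_c + width * tau_c
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * gaussian_pdf(d, x, sigma ** 2) * gaussian_pdf(x, mu_c, tau_c ** 2)
    return total * step


def class_posteriors(d, sigma, classes, priors):
    """Posterior class probabilities for one noisy measurement d.

    classes maps a label to (mu_c, tau_c); priors maps a label to its prior probability.
    """
    unnormalised = {
        label: priors[label] * marginal_likelihood(d, sigma, mu_c, tau_c)
        for label, (mu_c, tau_c) in classes.items()
    }
    evidence = sum(unnormalised.values())
    return {label: value / evidence for label, value in unnormalised.items()}


# Hypothetical two-class example with equal priors.
toy_classes = {"A": (0.0, 1.0), "B": (3.0, 1.0)}
toy_priors = {"A": 0.5, "B": 0.5}

if __name__ == "__main__":
    for noise in (0.1, 1.0, 3.0):
        print(noise, class_posteriors(1.0, noise, toy_classes, toy_priors))
```

In this toy model, a larger measurement uncertainty broadens every class's marginal likelihood, so the posterior for a given measurement moves towards the prior rather than committing strongly to the nearest class.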