Reliable likelihoods for out-of-distribution data from continuous-time normalising flows

Thesis (MSc)--Stellenbosch University, 2024.

Bibliographic Details
Main Author: Josias, Shane
Other Authors: Brink, Willie
Format: Thesis
Language: English
Published: Stellenbosch : Stellenbosch University 2025
Subjects: Anomaly detection (Computer science); Machine learning; Flowgraphs; UCTD
_version_ 1867614023489421312
access_status_str Open Access
author Josias, Shane
author2 Brink, Willie
author_browse Brink, Willie
Josias, Shane
author_facet Brink, Willie
Josias, Shane
author_sort Josias, Shane
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MSc)--Stellenbosch University, 2024.
format Thesis
id oai:scholar.sun.ac.za:10019.1/131688
institution Stellenbosch University (South Africa)
language English
last_indexed 2026-06-10T12:45:26.037Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2025
publishDateRange 2025
publishDateSort 2025
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/131688 Reliable likelihoods for out-of-distribution data from continuous-time normalising flows Josias, Shane Brink, Willie Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Applied Mathematics Division. Anomaly detection (Computer science) Machine learning Flowgraphs UCTD Thesis (MSc)--Stellenbosch University, 2024. ENGLISH ABSTRACT: A continuous-time normalising flow is a deep generative model that allows for exact likelihood evaluation by defining a transformation between data and samples from a base distribution. The transformation is implicitly defined as a solution to a neural ordinary differential equation (neural ODE), and requires solution trajectories to be simulated by an ODE solver. This formulation eases invertibility, avoids the expensive determinant calculation in discrete-step normalising flows, and removes constraints on the neural network architecture underlying the transformation process. We examine two problems related to continuous-time normalising flows, focusing on their application as a generative model for image data. The first is the computational bottleneck in the simulation of solution trajectories, which can lead to long training times. The second relates to the reported phenomenon that normalising flow models assign higher likelihoods to out-of-distribution samples than they do to in-distribution samples. For the first problem, we explore whether regularising the Jacobian of the neural ODE during training can improve computational efficiency. Our results indicate that Jacobian regularisation can reduce the number of function evaluations required by an ODE solver when computing solution trajectories, and can offer additional benefits such as robustness and distance to decision boundaries for a classification problem.
However, we argue that these benefits do not outweigh the time-cost of simulating solution trajectories and turn to the use of the conditional flow matching objective in continuous-time normalising flow training, as it circumvents the need to simulate solution trajectories. Models trained with this objective are called CFM models. For the second problem, we show that CFM models also assign higher likelihoods to out-of-distribution data. We then explore whether multimodality in the base distribution can improve matters. The multimodal base distribution allows for class conditional sampling, but can lead to mode collapse in terms of its sampling ability and does not lead to reliable likelihoods on out-of-distribution data. We also show that these CFM models tend to fit to pixel content rather than semantic content, corroborating observations from the literature for discrete-step flows. Motivated by this realisation, we instead train CFM models on image feature representations obtained from a pretrained classifier, a pretrained autoencoder, and another autoencoder trained from scratch. We find that feature representations which do not contain image-specific structure can lead to reliable likelihoods from CFM models on out-of-distribution data. We do find that CFM models trained on our proposed feature representations generate samples of a lower quality, and suggest avenues for future work. AFRIKAANS ABSTRACT: No abstract available. Doctoral 2025-02-05T14:01:56Z 2025-02-05T14:01:56Z 2024-12 Thesis https://scholar.sun.ac.za/handle/10019.1/131688 en Stellenbosch University 83 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle Anomaly detection (Computer science)
Machine learning
Flowgraphs
UCTD
Josias, Shane
Reliable likelihoods for out-of-distribution data from continuous-time normalising flows
title Reliable likelihoods for out-of-distribution data from continuous-time normalising flows
title_full Reliable likelihoods for out-of-distribution data from continuous-time normalising flows
title_fullStr Reliable likelihoods for out-of-distribution data from continuous-time normalising flows
title_full_unstemmed Reliable likelihoods for out-of-distribution data from continuous-time normalising flows
title_short Reliable likelihoods for out-of-distribution data from continuous-time normalising flows
title_sort reliable likelihoods for out of distribution data from continuous time normalising flows
topic Anomaly detection (Computer science)
Machine learning
Flowgraphs
UCTD
url https://scholar.sun.ac.za/handle/10019.1/131688
work_keys_str_mv AT josiasshane reliablelikelihoodsforoutofdistributiondatafromcontinuoustimenormalisingflows