Full Text Available
Note: Clicking the button above will open the full text document at the original institutional repository in a new window.
Thesis (MSc)--Stellenbosch University, 2025.
| Main Author: | |
|---|---|
| Other Authors: | |
| Format: | Thesis |
| Published: |
Stellenbosch : Stellenbosch University
2026
|
| Subjects: | |
| Tags: |
No Tags, Be the first to tag this record!
|
| _version_ | 1867614037858058240 |
|---|---|
| access_status_str | Open Access |
| author | Perry, Keegan Kyle |
| author2 | Bester, Willem |
| author_browse | Bester, Willem Perry, Keegan Kyle |
| author_facet | Bester, Willem Perry, Keegan Kyle |
| author_sort | Perry, Keegan Kyle |
| collection | Thesis |
| dc_rights_str_mv | Stellenbosch University |
| description | Thesis (MSc)--Stellenbosch University, 2025. |
| format | Thesis |
| id | oai:scholar.sun.ac.za:10019.1/134764 |
| institution | Stellenbosch University (South Africa) |
| last_indexed | 2026-06-10T12:45:40.774Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository |
| publishDate | 2026 |
| publishDateRange | 2026 |
| publishDateSort | 2026 |
| publisher | Stellenbosch : Stellenbosch University |
| publisherStr | Stellenbosch : Stellenbosch University |
| record_format | dspace |
| source_str | SUNScholar — Stellenbosch University Repository |
| spelling | oai:scholar.sun.ac.za:10019.1/134764 Verified derivative-based regular expression matching Perry, Keegan Kyle Bester, Willem Van der Merwe, Brink Schulze, Walter Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Computer Science Division. Regular expressions (Computer science) Formal methods (Computer science) Sequential machine theory Thesis (MSc)--Stellenbosch University, 2025. Perry, K. K. 2025. Verified Derivative-Based Regular Expression Matching. Unpublished masters thesis. Stellenbosch: Stellenbosch University [online]. Available: https://scholar.sun.ac.za/items/c1a8427d-bcc5-43f3-b15e-f9c4a9b6d91d ENGLISH ABSTRACT: Regular expressions are often used in security-related contexts where the correctness of the underlying matcher is vitally important. Whereas the traditional problem of regular expression membership is well understood, and correct implementations are easy to verify, modern regular expression matchers have fundamentally changed the matching problem: Instead of just determining whether a string is in the language of a regular expression, modern matchers can also report the location of such a string in the input, as well as how subexpressions match substrings of the input. Historically, in the absence of employing formal approaches, support for new features has resulted in ad hoc solutions leading to informal standards, and conversely, interpretations of informal standards leading to ad hoc implementations. In this thesis, we take a formal approach to practical regular expression matching by verifying the correctness of several implementations with respect to the leftmost-longest and leftmost-greedy disambiguation strategies. When a regular expression is ambiguous, the reported match depends on which matching path is taken. To make the result unique, practical matchers adopt a disambiguation policy, typically either leftmost-longest or leftmost-greedy, to choose a single path. We focus on matching solutions that use Brzozowski derivatives, which have been employed in a number of recent formalisations of leftmost-longest matching. Building on this work, we provide our own verified implementations of leftmost-longest matching, and also introduce the formalisation of leftmost-greedy matching with derivatives, which, thus far, has not been considered in the literature. All formalisations we present are executable and verified in the Lean proof assistant. AFRIKAANSE OPSOMMING: Regulêre uitdrukkings word dikwels gebruik in sekuriteitsverwante kontekste waarin die korrektheid van die onderliggende soekenjin van kardinale belang is. Hoewel die tradisionele probleem van regulêre-uitdrukking lidmaatskap goed verstaan word en korrekte implementerings maklik is om te verifieer, het moderne regulêre-uitdrukking enjins die soekprobleem fundamenteel verander: In plaas daarvan om bloot vas te stel of ’n string in die taal van ’n regulêre uitdrukking is, rapporteer moderne enjins ook die posisie van so ’n string in die toevoer, asook hoe deeluitdrukkings teenoor substringe van die toevoer pas. By afwesigheid van die gebruik van formele metodes, het die ondersteuning van nuwe funksionaliteit histories gelei van ad hoc-oplossings tot informele standaarde en, omgekeerd, vertolkings van informele standaarde het gelei tot ad hoc-implementerings. In dié tesis neem ons ’n formele benadering tot praktiese regulêre-uitdrukking soekenjins aan deur die korrektheid van verskeie implementerings met betrekking tot die mees-linkse langste en mees-linkse gulsige ondubbelsinnigmakingstrategieë te verifieer. Wanneer ’n regulêre uitdrukking dubbelsinnig is, hang die gerapporteerde trefslag af van watter passingspad geneem word. Om die resultaat uniek te maak, neem praktiese enjins ’n ondubbelsinnnigmakingsbeleid aan, tipies óf ’n mees-linkse langste óf ’n mees-linkse gulsige, om ’n enkele pad te kies. Ons fokus op Brzozowski-afgeleides, wat in ’n aantal onlangse formaliserings van mees-linkse langste soektogte gebruik is. Gebou op dié werk voorsien ons ook ons eie geverifieerde implementerings van mees-linkse langste soektogte, en ons stel ook die formalisering van mees-linkse gulsige soektoegte bekend, wat tot dusver nog nie in die literatuur oorweeg is nie. Al die formaliserings wat ons aanbied, is uitvoerbaar en geverifieer met die Lean-bewysassistent. Masters 2026-01-07T08:42:14Z 2026-01-07T08:42:14Z 2025-12 Thesis https://scholar.sun.ac.za/handle/10019.1/134764 Stellenbosch University viii, 117 pages application/pdf Stellenbosch : Stellenbosch University |
| spellingShingle | Regular expressions (Computer science) Formal methods (Computer science) Sequential machine theory Perry, Keegan Kyle Verified derivative-based regular expression matching |
| title | Verified derivative-based regular expression matching |
| title_full | Verified derivative-based regular expression matching |
| title_fullStr | Verified derivative-based regular expression matching |
| title_full_unstemmed | Verified derivative-based regular expression matching |
| title_short | Verified derivative-based regular expression matching |
| title_sort | verified derivative based regular expression matching |
| topic | Regular expressions (Computer science) Formal methods (Computer science) Sequential machine theory |
| url | https://scholar.sun.ac.za/handle/10019.1/134764 |
| work_keys_str_mv | AT perrykeegankyle verifiedderivativebasedregularexpressionmatching |