Full Text Available
Note: Clicking the button above will open the full text document at the original institutional repository in a new window.
Dissertation (MEng)--University of Pretoria, 2009.
| Other Authors: | |
|---|---|
| Format: | Thesis |
| Published: |
University of Pretoria
2013
|
| Subjects: | |
| Tags: |
No Tags, Be the first to tag this record!
|
| _version_ | 1867613628064071680 |
|---|---|
| access_status_str | Open Access |
| author2 | Kourie, Derrick G. |
| author_browse | Kourie, Derrick G. |
| author_facet | Kourie, Derrick G. |
| collection | Thesis |
| dc_rights_str_mv | © 2009, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. |
| description | Dissertation (MEng)--University of Pretoria, 2009. |
| format | Thesis |
| id | oai:repository.up.ac.za:2263/27536 |
| institution | University of Pretoria (South Africa) |
| last_indexed | 2026-06-10T12:39:09.918Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from UPSpace — University of Pretoria Institutional Repository |
| publishDate | 2013 |
| publishDateRange | 2013 |
| publishDateSort | 2013 |
| publisher | University of Pretoria |
| publisherStr | University of Pretoria |
| record_format | dspace |
| source_str | UPSpace — University of Pretoria Institutional Repository |
| spelling | oai:repository.up.ac.za:2263/27536 Finite state automaton construction through regular expression hashing Kourie, Derrick G. Watson, Bruce William spring.haas_meester@gmail.com Coetser, Rayner Johannes Lodewikus Regular expression hashing Finite state Super-automaton Approximate automaton UCTD Dissertation (MEng)--University of Pretoria, 2009. In this study, the regular expressions forming abstract states in Brzozowski’s algorithm are not remapped to sequential state transition table addresses as would be the case in the classical approach, but are hashed to integers. Two regular expressions that are hashed to the same hash code are assigned the same integer address in the state transition table, reducing the number of states in the automaton. This reduction does not necessarily lead to the construction of a minimal automaton: no restrictions are placed on the hash function hashing two regular expressions to the same code. Depending on the quality of the hash function, a super-automaton, previously referred to as an approximate automaton, or an exact automaton can be constructed. When two regular expressions are hashed to the same state, and they do not represent the same regular language, a super-automaton is constructed. A super-automaton accepts the regular language of the input regular expression, in addition to some extra strings. If the hash function is bad, many regular expressions that do not represent the same regular language will be hashed together, resulting in a smaller automaton that accepts extra strings. In the ideal case, two regular expressions will only be hashed together when they represent the same regular language. In this case, an exact minimal automaton will be constructed. It is shown that, using the hashing approach, an exact or super-automaton is always constructed. Another outcome of the hashing approach is that a non-deterministic automaton may be constructed. A new version of the hashing version of Brzozowski’s algorithm is put forward which constructs a deterministic automaton. A method is also put forward for measuring the difference between an exact and a super-automaton: this takes the form of the k-equivalence measure: the k-equivalence measure measures the number of characters up to which the strings of two regular expressions are equal. The better the hash function, the higher the value of k, up to the point where the hash function results in regular expressions being hashed together if and only if they have the same regular language. Using the k-equivalence measure, eight generated hash functions and one hand coded hash function are evaluated for a large number of short regular expressions, which are generated using G¨odel numbers. The k-equivalence concept is extended to the average k-equivalence value in order to evaluate the hash functions for longer regular expressions. The hand coded hash function is found to produce good results. Copyright Computer Science unrestricted 2013-09-07T11:44:37Z 2010-08-25 2013-09-07T11:44:37Z 2010-04-12 2009 2010-08-25 Dissertation Coetser, RJL 2009, Finite state automaton construction through regular expression hashing, MSc dissertation, University of Pretoria, Pretoria, viewed yymmdd < http://hdl.handle.net/2263/27536 > E10/460/gm http://hdl.handle.net/2263/27536 http://upetd.up.ac.za/thesis/available/etd-08252010-133710/ © 2009, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. application/pdf University of Pretoria |
| spellingShingle | Regular expression hashing Finite state Super-automaton Approximate automaton UCTD Finite state automaton construction through regular expression hashing |
| title | Finite state automaton construction through regular expression hashing |
| title_full | Finite state automaton construction through regular expression hashing |
| title_fullStr | Finite state automaton construction through regular expression hashing |
| title_full_unstemmed | Finite state automaton construction through regular expression hashing |
| title_short | Finite state automaton construction through regular expression hashing |
| title_sort | finite state automaton construction through regular expression hashing |
| topic | Regular expression hashing Finite state Super-automaton Approximate automaton UCTD |
| url | http://hdl.handle.net/2263/27536 http://upetd.up.ac.za/thesis/available/etd-08252010-133710/ |