Flexible finite automata-based algorithms for detecting microsatellites in DNA

Dissertation (MSc (Computer Science))--University of Pretoria, 2010.

Bibliographic Details
Other Authors: Kourie, Derrick G.
Format: Thesis
Published: University of Pretoria 2013
_version_ 1867613553144365056
access_status_str Open Access
author2 Kourie, Derrick G.
author_browse Kourie, Derrick G.
author_facet Kourie, Derrick G.
collection Thesis
dc_rights_str_mv © 2010, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
description Dissertation (MSc (Computer Science))--University of Pretoria, 2010.
format Thesis
id oai:repository.up.ac.za:2263/27335
institution University of Pretoria (South Africa)
last_indexed 2026-06-10T12:37:58.345Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from UPSpace — University of Pretoria Institutional Repository
publishDate 2013
publishDateRange 2013
publishDateSort 2013
publisher University of Pretoria
publisherStr University of Pretoria
record_format dspace
source_str UPSpace — University of Pretoria Institutional Repository
spelling oai:repository.up.ac.za:2263/27335 Flexible finite automata-based algorithms for detecting microsatellites in DNA Kourie, Derrick G. driddc@unisa.ac.za De Ridder, Corne Approximate tandem repeats Regular expression Finite automata Microsatellites UCTD Dissertation (MSc (Computer Science))--University of Pretoria, 2010. Apart from contributing to Computer Science, this research also contributes to Bioinformatics, a subset of the subject discipline Computational Biology. The main focus of this dissertation is the development of a data-analytical and theoretical algorithm to contribute to the analysis of DNA, and in particular, to detect microsatellites. Microsatellites, considered in the context of this dissertation, refer to consecutive patterns contained by genomic sequences. A perfect tandem repeat is defined as a string of nucleotides which is repeated at least twice in a sequence. An approximate tandem repeat is a string of nucleotides repeated consecutively at least twice, with small differences between the instances. The research presented in this dissertation was inspired by molecular biologists who were found to be visually scanning genetic sequences in search of short approximate tandem repeats, or so-called microsatellites. The aim of this dissertation is to present three algorithms that search for short approximate tandem repeats. The algorithms comprise the implementation of finite automata. Thus the hypothesis posed is as follows: Finite automata can detect microsatellites effectively in DNA. "Effectively" includes the ability to fine-tune the detection process so that redundant data is avoided, and relevant data is not missed during search. In order to verify whether the hypothesis holds, three theoretically related algorithms have been proposed based on theorems from finite automaton theory. They are generically referred to as the FireμSat algorithms.
These algorithms have been implemented, and the performance of FireμSat2 has been investigated and compared to other software packages. From the results obtained, it is clear that the performance of these algorithms differs in terms of attributes such as speed, memory consumption and extensibility. In terms of speed, FireμSat outperformed rival software packages. It will be seen that the FireμSat algorithms have several parameters that can be used to tune their search. It should be emphasized that these parameters have been devised in consultation with the intended user community, in order to enhance the usability of the software. It was found that the parameters of FireμSat can be set to detect more tandem repeats than rival software packages, but also tuned to limit the number of detected tandem repeats. Copyright Computer Science unrestricted 2013-09-07T11:11:38Z 2010-09-13 2013-09-07T11:11:38Z 2010-05-09 2010-09-13 2010-08-17 Dissertation De Ridder, C 2010, Flexible finite automata-based algorithms for detecting microsatellites in DNA, MSc dissertation, University of Pretoria, Pretoria, viewed yymmdd < http://hdl.handle.net/2263/27335 > C10/548/gm http://hdl.handle.net/2263/27335 http://upetd.up.ac.za/thesis/available/etd-08172010-202532/ © 2010, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. application/pdf University of Pretoria
spellingShingle Approximate tandem repeats
Regular expression
Finite automata
Microsatellites
UCTD
Flexible finite automata-based algorithms for detecting microsatellites in DNA
title Flexible finite automata-based algorithms for detecting microsatellites in DNA
title_full Flexible finite automata-based algorithms for detecting microsatellites in DNA
title_fullStr Flexible finite automata-based algorithms for detecting microsatellites in DNA
title_full_unstemmed Flexible finite automata-based algorithms for detecting microsatellites in DNA
title_short Flexible finite automata-based algorithms for detecting microsatellites in DNA
title_sort flexible finite automata based algorithms for detecting microsatellites in dna
topic Approximate tandem repeats
Regular expression
Finite automata
Microsatellites
UCTD
url http://hdl.handle.net/2263/27335
http://upetd.up.ac.za/thesis/available/etd-08172010-202532/