Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models

Bibliographic Details
Main Author: Sianga, Rita
Other Authors: Martin, Darrin
Format: Thesis
Language: English
Published: Department of Integrative Biomedical Sciences (IBMS) 2023
_version_ 1867613174696509440
access_status_str Open Access
author Sianga, Rita
author2 Martin, Darrin
author_browse Martin, Darrin
Sianga, Rita
author_facet Martin, Darrin
Sianga, Rita
author_sort Sianga, Rita
collection Thesis
description Most phylogenetic trees are inferred using time-reversible evolutionary models that assume that the relative rates of substitution for any given pair of nucleotides are the same regardless of the direction of the substitutions. However, there is no reason to assume that the underlying biochemical mutational processes that cause substitutions are similarly symmetrical. Here, I evaluate the effect on phylogenetic inference, in empirical viral and simulated data, of incorporating non-reversibility into models of nucleotide substitution processes. I consider two non-reversible nucleotide substitution models: (1) a 6-rate non-reversible model (NREV6) that is applicable to analyzing mutational processes in double-stranded (ds) genomes, in that complementary substitutions occur at identical rates; and (2) a 12-rate non-reversible model (NREV12) that is applicable to analyzing mutational processes in single-stranded (ss) genomes, in that all substitution types are free to occur at different rates. Using likelihood ratio and Akaike Information Criterion-based model tests, I show that, surprisingly, NREV12 provided a significantly better fit than the General Time Reversible (GTR) and NREV6 models to 21/31 dsRNA and 20/30 dsDNA datasets. As expected, however, NREV12 provided a significantly better fit to 24/33 ssDNA and 40/47 ssRNA datasets. I tested how non-reversibility impacts the accuracy with which phylogenetic trees are inferred. As simulated degrees of non-reversibility (DNR) increased, the tree topology inferences using both NREV12 and GTR became more accurate, whereas inferred tree branch lengths became less accurate. I conclude that while non-reversible models should be helpful in the analysis of mutational processes in most virus species, there is no pressing need to use these models for routine phylogenetic inference. Finally, I introduce a web application, RpNRM, that roots phylogenetic trees using a non-reversible nucleotide substitution model.
The phylogenetic tree is rooted on every branch in turn, the likelihood of each rooting is determined, and the rooting with the highest likelihood is identified as the most plausible. The rooting accuracy of RpNRM was compared to that of the outgroup rooting method, the midpoint rooting method and another non-reversible model-based rooting method implemented in the program IQTREE. I find that although the RpNRM and IQTREE non-reversible model-based methods are not as accurate on their own as outgroup or midpoint rooting methods, they nevertheless provide an independent means of verifying the root locations that are inferred by these other methods.
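The distinction the abstract draws between reversible and non-reversible models can be illustrated with a small sketch. It is not code from the thesis or from RpNRM: the function names, the parameterisation and all rate values below are hypothetical, chosen only to show that a GTR-style matrix satisfies detailed balance (pi_i * q_ij = pi_j * q_ji) while a strand-symmetric 6-rate matrix, in which complementary substitutions share a rate, generally does not.

```python
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def nrev6_to_12(six_rates):
    """Expand 6 strand-symmetric rates into 12 directed rates.

    Each key (i, j) is a substitution i -> j; its complementary
    substitution comp(i) -> comp(j) receives the same rate.
    """
    rates = {}
    for (i, j), r in six_rates.items():
        rates[(i, j)] = r
        rates[(COMPLEMENT[i], COMPLEMENT[j])] = r
    if len(rates) != 12:
        raise ValueError("keys must cover all 12 directed substitutions")
    return rates


def gtr_to_12(exchange, freqs):
    """Build reversible rates q_ij = s_ij * pi_j from symmetric s_ij."""
    rates = {}
    for (i, j), s in exchange.items():
        rates[(i, j)] = s * freqs[j]
        rates[(j, i)] = s * freqs[i]
    return rates


def stationary(rates, iterations=5000):
    """Approximate the stationary distribution by power iteration.

    Iterates the uniformised chain P = I + Q / m, which has the same
    stationary distribution as the rate matrix Q.
    """
    exit_rate = {i: sum(rates[(i, j)] for j in BASES if j != i) for i in BASES}
    m = 2.0 * max(exit_rate.values())
    pi = {b: 0.25 for b in BASES}
    for _ in range(iterations):
        new = {}
        for j in BASES:
            inflow = sum(pi[i] * rates[(i, j)] for i in BASES if i != j)
            new[j] = pi[j] * (1.0 - exit_rate[j] / m) + inflow / m
        pi = new
    return pi


def max_detailed_balance_gap(rates):
    """Largest |pi_i q_ij - pi_j q_ji|; zero means time-reversible."""
    pi = stationary(rates)
    return max(
        abs(pi[i] * rates[(i, j)] - pi[j] * rates[(j, i)])
        for i in BASES for j in BASES if i != j
    )


# Hypothetical illustrative values, not estimates from any dataset.
gtr = gtr_to_12(
    {("A", "C"): 1.0, ("A", "G"): 4.0, ("A", "T"): 1.0,
     ("C", "G"): 1.0, ("C", "T"): 4.0, ("G", "T"): 1.0},
    {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4},
)
nrev6 = nrev6_to_12(
    {("A", "C"): 1.0, ("A", "G"): 1.0, ("A", "T"): 1.0,
     ("C", "A"): 1.0, ("C", "G"): 1.0, ("C", "T"): 3.0}
)

print("GTR gap:  ", max_detailed_balance_gap(gtr))
print("NREV6 gap:", max_detailed_balance_gap(nrev6))
```

With these illustrative values the GTR gap is zero up to floating-point error, while the NREV6 gap is about 0.167 (1/6), because elevating C to T and its complement G to A breaks the symmetry between forward and backward substitution flows. An NREV12-style model would simply supply all 12 directed rates independently.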
format Thesis
id oai:open.uct.ac.za:11427/38541
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:31:56.645Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2023
publishDateRange 2023
publishDateSort 2023
publisher Department of Integrative Biomedical Sciences (IBMS)
publisherStr Department of Integrative Biomedical Sciences (IBMS)
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/38541 Sianga, Rita Martin, Darrin 2023-09-12T08:26:45Z 2023-09-12T08:26:45Z 2023 2023-09-12T08:17:11Z Doctoral Thesis Doctoral PhD http://hdl.handle.net/11427/38541 eng application/pdf Department of Integrative Biomedical Sciences (IBMS) Faculty of Health Sciences
spellingShingle Phylogenetic Inference of Non Reversible Nucleotide Substitution Models
Sianga, Rita
Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models
thesis_degree_str Doctoral
title Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models
title_full Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models
title_fullStr Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models
title_full_unstemmed Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models
title_short Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models
title_sort analysis of the impact on phylogenetic inference of non reversible nucleotide substitution models
topic Phylogenetic Inference of Non Reversible Nucleotide Substitution Models
url http://hdl.handle.net/11427/38541
work_keys_str_mv AT siangarita analysisoftheimpactonphylogeneticinferenceofnonreversiblenucleotidesubstitutionmodels