The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset

Human Immunodeficiency Virus (HIV) rapidly escapes cytotoxic T lymphocyte (CTL) immune responses exerted by the host. Mutation patterns and HLA-associated footprints linked to viral escape have been identified, making it possible to use viral sequence data, combined with the host HLA allele inf...

Bibliographic Details
Main Author: Mphahlele, Ruth
Other Authors: Williamson, Carolyn
Format: Thesis
Language: English
Published: Computational Biology Division 2024
Subjects: Bioinformatics
_version_ 1867613305694060544
access_status_str Open Access
author Mphahlele, Ruth
author2 Williamson, Carolyn
author_browse Mphahlele, Ruth
Williamson, Carolyn
author_facet Williamson, Carolyn
Mphahlele, Ruth
author_sort Mphahlele, Ruth
collection Thesis
description Human Immunodeficiency Virus (HIV) rapidly escapes cytotoxic T lymphocyte (CTL) immune responses exerted by the host. Mutation patterns and HLA-associated footprints linked to viral escape have been identified, making it possible to use viral sequence data, combined with the host HLA allele information, to predict escape. Next-Generation Sequencing (NGS) approaches enable the generation of large sequence datasets and the detection of viral populations present at very low frequencies in an infected individual at any given time. These datasets allow the study of changes in viral populations within a host over time and provide a means to understand the kinetics and pathway(s) of escape. While tools exist that predict escape in sequence data with few sequences per sampling timepoint, these tools often have limitations when analysing large NGS datasets. In this project, we developed a workflow for identifying the kinetics of CTL escape in longitudinal HIV-1 next-generation datasets of gag sequences generated on an Illumina MiSeq platform over the duration of drug-naïve infection. The dataset was generated from 15 women over a period of one to seven years and comprised 4583 short-read gag sequences (544 bp). We surveyed tools for identifying CTL escape in deep sequencing datasets and screened them against pre-defined criteria. The outputs were validated using a test dataset from a previous study that identified escape. We selected the Epitope Matcher tool as having the most potential to identify CTL epitopes and escape mutations. To further support evidence of escape and identify additional putative escape mutations, we identified sites with high Shannon entropy (>= 0.25) and sites evolving under positive selection using the FUBAR method in HyPhy. The sites were verified against lists of HLA associations, CTL epitope variants and escape mutations, or against data generated by Epitope Matcher.
Using the Epitope Matcher tool, we identified seven HLA-B-restricted gag epitopes in six individuals, with putative escape identified in seven epitopes, commonly occurring in the late chronic phase of infection. The most common epitope in the population was YL9 (Gag HXB2 coordinates 296 to 304), found in 60% of the participants and restricted by HLA-B*15:03, B*15:10 and B*42:01. Toggling of amino acids within epitopes, a result of the potential fitness cost associated with a specific change, was observed in five of seven epitopes. We further identified 35 high Shannon entropy sites, nine of which fell within epitopes identified by Epitope Matcher. Additionally, nine of the high Shannon entropy sites were evolving under positive selection. With supporting evidence, we predict that the mutation T310S (found in the AW11 epitope, restricted by allele B*58:01) is likely to be associated with escape. This study is important in that it provides a pipeline that will enable semi-automated analysis of NGS data. Using this approach, we have provided a better understanding of the kinetics and frequency of CTL escape over the course of HIV infection. Additionally, we have identified frequently targeted sites across the Gag p24 region and across individuals. This study is relevant to informing CTL-based vaccine, prevention and treatment strategies.
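The description above reports screening for sites with Shannon entropy of at least 0.25 but does not give the computation. The following is a minimal illustrative sketch of per-site entropy on an amino acid alignment, not the thesis workflow itself. The function names, the natural-log base, the decision to ignore gap characters, and the toy sequences are all assumptions made for illustration; only the 0.25 threshold comes from the record.

```python
import math
from collections import Counter


def site_entropy(column, base=math.e):
    """Shannon entropy of one alignment column.

    Assumption: gap characters ('-') are ignored, and the log base
    defaults to the natural log (the record does not state either).
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log(n / total, base) for n in counts.values())


def high_entropy_sites(alignment, threshold=0.25, start=1, base=math.e):
    """Return (position, entropy) pairs for columns with entropy >= threshold.

    `alignment` is a list of equal-length aligned sequences; `start` is the
    coordinate assigned to the first column (hypothetical numbering).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for i in range(length):
        h = site_entropy([seq[i] for seq in alignment], base)
        if h >= threshold:
            sites.append((start + i, h))
    return sites


# Made-up toy alignment (not data from the thesis): columns 1 and 6 vary.
toy = ["TLNAWVKV", "TLNAWVKV", "SLNAWVKV", "TLNAWIKV"]
for position, entropy in high_entropy_sites(toy):
    print(position, round(entropy, 3))
```

In a real analysis the columns would come from a per-timepoint alignment of the translated gag reads, and `start` would be chosen so that positions map onto HXB2 coordinates; both steps are outside this sketch.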
format Thesis
id oai:open.uct.ac.za:11427/39727
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:34:00.978Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2024
publishDateRange 2024
publishDateSort 2024
publisher Computational Biology Division
publisherStr Computational Biology Division
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/39727 The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset Mphahlele, Ruth Williamson, Carolyn Martin Darren Bioinformatics 2024-05-27T08:47:40Z 2024-05-27T08:47:40Z 2023 2024-05-22T08:34:57Z Thesis / Dissertation Masters MSc http://hdl.handle.net/11427/39727 eng application/pdf Computational Biology Division Faculty of Health Sciences
spellingShingle Bioinformatics
Mphahlele, Ruth
The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
thesis_degree_str Master's
title The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
title_full The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
title_fullStr The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
title_full_unstemmed The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
title_short The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
title_sort identification of cytotoxic t lymphocyte ctl escape in a large longitudinal subtype c hiv 1 sequence dataset
topic Bioinformatics
url http://hdl.handle.net/11427/39727
work_keys_str_mv AT mphahleleruth theidentificationofcytotoxictlymphocytectlescapeinalargelongitudinalsubtypechiv1sequencedataset
AT mphahleleruth identificationofcytotoxictlymphocytectlescapeinalargelongitudinalsubtypechiv1sequencedataset