Thesis (MSc)--Stellenbosch University, 2024.
| Main Author: | Mohlaba, Tiego |
|---|---|
| Other Authors: | Patterton, Hugh |
| Format: | Thesis |
| Published: | Stellenbosch : Stellenbosch University, 2025 |
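| Subjects: | Single-cell RNA sequencing -- Technique; RNA -- Computer simulation; High performance computing; Pipelines -- Data processing; Pipelines -- Computer simulation; Workflow management systems; Bioinformatics; UCTD |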
| _version_ | 1867613785855885312 |
|---|---|
| access_status_str | Open Access |
| author | Mohlaba, Tiego |
| author2 | Patterton, Hugh |
| author_browse | Mohlaba, Tiego Patterton, Hugh |
| author_facet | Patterton, Hugh Mohlaba, Tiego |
| author_sort | Mohlaba, Tiego |
| collection | Thesis |
| dc_rights_str_mv | Stellenbosch University |
| description | Thesis (MSc)--Stellenbosch University, 2024. |
| format | Thesis |
| id | oai:scholar.sun.ac.za:10019.1/131841 |
| institution | Stellenbosch University (South Africa) |
| last_indexed | 2026-06-10T12:41:40.401Z |
| license_str | Other — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository |
| publishDate | 2025 |
| publishDateRange | 2025 |
| publishDateSort | 2025 |
| publisher | Stellenbosch : Stellenbosch University |
| publisherStr | Stellenbosch : Stellenbosch University |
| record_format | dspace |
| source_str | SUNScholar — Stellenbosch University Repository |
| spelling | oai:scholar.sun.ac.za:10019.1/131841 Development of a comprehensive single cell RNA sequencing pipeline: pre-processing, quality control and identification of outliers Mohlaba, Tiego Patterton, Hugh Tromp, Gerard Maasdorp, Elizna Van der Spuy, Gian Stellenbosch University. Faculty of Science. Centre for Bioinformatics & Computational Biology. Single-cell RNA sequencing -- Technique RNA -- Computer simulation High performance computing Pipelines -- Data processing Pipelines -- Computer simulation Workflow management systems Bioinformatics UCTD Thesis (MSc)--Stellenbosch University, 2024. ENGLISH ABSTRACT: Single-cell RNA sequencing (scRNA-seq) derives data from individual cells in tissues and provides substantial insight into the transcriptomic variability of cell types and the complexity of tissues. This technique has helped to reveal a relationship between cell heterogeneity and infectious disease progression. As with any other method, technical noise, technically challenging library construction, variable cDNA capture efficiency, sequencing depth, batch effects and bias affect the utility and the interpretation of results. This is particularly relevant due to the need for amplification of the limited quantity of RNA material found in each cell. Pre-processing, quality control and normalisation are critical elements of scRNA-seq analysis, but are often overlooked. Robust and reproducible quality control and standardisation increase the utility and veracity of interpretation of the data. Numerous existing pipelines assume that basic quality control has been applied. The absence of stringent and robust pre-processing increases the probability of incorrect assignment of cell types or of incorrect gene expression profiles in cell sub-types. I created a robust scRNA-seq computational pipeline that focuses on the initial part of the scRNA-seq analysis workflow: improving the preliminary processing steps (pre-processing and quality control). Existing pipeline workflow management tools were assessed, and Nextflow was selected as it aligns with the project goals. I subsequently developed a pipeline that consists of containerised (using Singularity) bioinformatics tools. The Centre for High-Performance Computing (CHPC) Cluster was used for the handling and execution of the single-cell data as well as the development of the pipeline. The result of this study is the construction of a pipeline that is robust and that generates reproducible results. AFRIKAANS SUMMARY (translated): Single-cell RNA sequencing (scRNA-seq) obtains data from individual cells occurring in tissues, offering considerable insight into the variability within cell types and the complexity of tissues. It has helped to trace a relationship between cell heterogeneity and the development of infectious diseases. As with any other method, the utility and interpretation of the results are affected by technical noise arising from complications with library construction, variable cDNA capture, sequencing depth, batch effects and bias. This is particularly relevant because of the need to amplify the limited quantity of RNA material found in each cell. Pre-processing, quality control and normalisation are critical elements of scRNA analysis that are often overlooked. Robust and reproducible quality control and standardisation increase the utility and interpretability of data. Numerous existing processing pipelines assume that basic quality control has been applied. The absence of stringent and robust pre-processing increases the probability of false discovery of new cell types or of incorrect gene expression profiles in cell sub-types. I created a robust scRNA-seq pipeline that is more focused on improving the initial processing (pre-processing, quality control). Existing pipeline workflow management tools were investigated, and Nextflow was chosen as it agrees with our assumptions. I developed a pipeline consisting of containerised (with Singularity) bioinformatics tools. The Centre for High-Performance Computing (CHPC) computer cluster was used for the handling and execution of the single-cell data as well as for pipeline development. The intended result is a pipeline that is robust and reproducible. Masters 2025-04-02T12:26:40Z 2025-04-02T12:26:40Z 2024-12 Thesis https://scholar.sun.ac.za/handle/10019.1/131841 Stellenbosch University xvi, 108 pages : illustrations application/pdf Stellenbosch : Stellenbosch University |
| spellingShingle | Single-cell RNA sequencing -- Technique RNA -- Computer simulation High performance computing Pipelines -- Data processing Pipelines -- Computer simulation Workflow management systems Bioinformatics UCTD Mohlaba, Tiego Development of a comprehensive single cell RNA sequencing pipeline: pre-processing, quality control and identification of outliers |
| title | Development of a comprehensive single cell RNA sequencing pipeline: pre-processing, quality control and identification of outliers |
| title_full | Development of a comprehensive single cell RNA sequencing pipeline: pre-processing, quality control and identification of outliers |
| title_fullStr | Development of a comprehensive single cell RNA sequencing pipeline: pre-processing, quality control and identification of outliers |
| title_full_unstemmed | Development of a comprehensive single cell RNA sequencing pipeline: pre-processing, quality control and identification of outliers |
| title_short | Development of a comprehensive single cell RNA sequencing pipeline: pre-processing, quality control and identification of outliers |
| title_sort | development of a comprehensive single cell rna sequencing pipeline pre processing quality control and identification of outliers |
| topic | Single-cell RNA sequencing -- Technique RNA -- Computer simulation High performance computing Pipelines -- Data processing Pipelines -- Computer simulation Workflow management systems Bioinformatics UCTD |
| url | https://scholar.sun.ac.za/handle/10019.1/131841 |
| work_keys_str_mv | AT mohlabatiego developmentofacomprehensivesinglecellrnasequencingpipelinepreprocessingqualitycontrolandidentificationofoutliers |