Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers

Recently, many "molecular profiling" projects have yielded vast amounts of genetic, epigenetic, transcription, protein expression, metabolic and drug response data for cancerous tumours, healthy tissues, and cell lines. We aim to facilitate a multi-scale understanding of these high-dimensional biolo...

Bibliographic Details
Main Author: Sinkala, Musalula
Other Authors: Martin, Darren
Format: Thesis
Language: English
Published: Department of Clinical Laboratory Sciences 2021
Subjects: big data; data integration; biology; machine learning
_version_ 1867614091275665408
access_status_str Open Access
author Sinkala, Musalula
author2 Martin, Darren
author_browse Martin, Darren
Sinkala, Musalula
author_facet Martin, Darren
Sinkala, Musalula
author_sort Sinkala, Musalula
collection Thesis
description Recently, many "molecular profiling" projects have yielded vast amounts of genetic, epigenetic, transcription, protein expression, metabolic and drug response data for cancerous tumours, healthy tissues, and cell lines. We aim to facilitate a multi-scale understanding of these high-dimensional biological data and the complexity of the relationships between the different data types taken from human tumours. Further, we intend to identify molecular disease subtypes of various cancers, uncover the subtype-specific drug targets and identify sets of therapeutic molecules that could potentially be used to inhibit these targets. We collected data from over 20 publicly available resources. We then leverage integrative computational systems analyses, network analyses and machine learning, to gain insights into the pathophysiology of pancreatic cancer and 32 other human cancer types. Here, we uncover aberrations in multiple cell signalling and metabolic pathways that implicate regulatory kinases and the Warburg effect as the likely drivers of the distinct molecular signatures of three established pancreatic cancer subtypes. Then, we apply an integrative clustering method to four different types of molecular data to reveal that pancreatic tumours can be segregated into two distinct subtypes. We define sets of proteins, mRNAs, miRNAs and DNA methylation patterns that could serve as biomarkers to accurately differentiate between the two pancreatic cancer subtypes. Then we confirm the biological relevance of the identified biomarkers by showing that these can be used together with pattern-recognition algorithms to infer the drug sensitivity of pancreatic cancer cell lines accurately. Further, we evaluate the alterations of metabolic pathway genes across 32 human cancers. We find that while alterations of metabolic genes are pervasive across all human cancers, the extent of these gene alterations varies between them. 
Based on these gene alterations, we define two distinct cancer supertypes that tend to be associated with different clinical outcomes and show that these supertypes are likely to respond differently to anticancer drugs. Overall, we show that the time has already arrived where we can leverage available data resources to potentially elicit more precise and personalised cancer therapies that would yield better clinical outcomes at a much lower cost than is currently being achieved.
format Thesis
id oai:open.uct.ac.za:11427/32983
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:46:31.808Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2021
publishDateRange 2021
publishDateSort 2021
publisher Department of Clinical Laboratory Sciences
publisherStr Department of Clinical Laboratory Sciences
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/32983 Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers Sinkala, Musalula Martin, Darren Mulder, Nicola Barth, Stefan big data data integration biology machine learning Recently, many "molecular profiling" projects have yielded vast amounts of genetic, epigenetic, transcription, protein expression, metabolic and drug response data for cancerous tumours, healthy tissues, and cell lines. We aim to facilitate a multi-scale understanding of these high-dimensional biological data and the complexity of the relationships between the different data types taken from human tumours. Further, we intend to identify molecular disease subtypes of various cancers, uncover the subtype-specific drug targets and identify sets of therapeutic molecules that could potentially be used to inhibit these targets. We collected data from over 20 publicly available resources. We then leverage integrative computational systems analyses, network analyses and machine learning, to gain insights into the pathophysiology of pancreatic cancer and 32 other human cancer types. Here, we uncover aberrations in multiple cell signalling and metabolic pathways that implicate regulatory kinases and the Warburg effect as the likely drivers of the distinct molecular signatures of three established pancreatic cancer subtypes. Then, we apply an integrative clustering method to four different types of molecular data to reveal that pancreatic tumours can be segregated into two distinct subtypes. We define sets of proteins, mRNAs, miRNAs and DNA methylation patterns that could serve as biomarkers to accurately differentiate between the two pancreatic cancer subtypes. 
Then we confirm the biological relevance of the identified biomarkers by showing that these can be used together with pattern-recognition algorithms to infer the drug sensitivity of pancreatic cancer cell lines accurately. Further, we evaluate the alterations of metabolic pathway genes across 32 human cancers. We find that while alterations of metabolic genes are pervasive across all human cancers, the extent of these gene alterations varies between them. Based on these gene alterations, we define two distinct cancer supertypes that tend to be associated with different clinical outcomes and show that these supertypes are likely to respond differently to anticancer drugs. Overall, we show that the time has already arrived where we can leverage available data resources to potentially elicit more precise and personalised cancer therapies that would yield better clinical outcomes at a much lower cost than is currently being achieved. 2021-02-24T18:07:38Z 2021-02-24T18:07:38Z 2020 2021-02-24T18:06:52Z Doctoral Thesis Doctoral PhD http://hdl.handle.net/11427/32983 eng application/pdf Department of Clinical Laboratory Sciences Faculty of Health Sciences
spellingShingle big data
data integration
biology
machine learning
Sinkala, Musalula
Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers
thesis_degree_str Doctoral
title Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers
title_full Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers
title_fullStr Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers
title_full_unstemmed Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers
title_short Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers
title_sort leveraging big data resources and data integration in biology applying computational systems analyses and machine learning to gain insights into the biology of cancers
topic big data
data integration
biology
machine learning
url http://hdl.handle.net/11427/32983
work_keys_str_mv AT sinkalamusalula leveragingbigdataresourcesanddataintegrationinbiologyapplyingcomputationalsystemsanalysesandmachinelearningtogaininsightsintothebiologyofcancers