| Main Author: | |
|---|---|
| Other Authors: | |
| Format: | Thesis |
| Language: | en_ZA |
| Published: | Stellenbosch : Stellenbosch University, 2018 |
| Subjects: | |
| Summary: | ENGLISH ABSTRACT: Cancer is an abnormal growth of cells that may be caused by gene mutations which alter the way cells function, mainly in how they grow and divide. Cancer cells are regulated by complex interactions mediated by a group of proteins and miRNAs which are expressed and repressed. With the help of transcriptomic technologies such as RNA-sequencing (RNA-seq), it is now possible to profile thousands of genes at once to create a global picture of the functions of cells. Here, the study employs a statistical approach called Significance Analysis of Microarrays (SAM) to identify genes that are differentially expressed in breast cancer patients. Genes with scores greater than a threshold are deemed potentially significant. The genes identified as significantly different are used for two purposes. First, the study uses them to predict breast cancer with three machine learning algorithms: random forests, artificial neural networks and support vector machines. Second, clinical details of patients are combined with the significant genes to build a survival model that predicts the probability of survival and the risk of the event in breast cancer patients. Using The Cancer Genome Atlas (TCGA) as the primary data source, SAM reported 23 genes as significantly different. Further investigation revealed that these 23 genes are involved in tumour suppression, angiogenesis, cell growth factor, tumourigenesis, cell proliferation, tumour progression and tumour necrosis activities. In predicting breast cancer, 10 of the 23 genes contribute significantly to the model. Finally, the log-logistic distribution was found to best describe the survival time of breast cancer patients. Moreover, the survival model revealed that the expression levels of six genes influence the survival probability of a breast cancer patient. |
|---|
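
The abstract's thresholding step ("genes with scores greater than a threshold are deemed potentially significant") can be illustrated with a minimal sketch. This is not the thesis code: it only computes a SAM-style relative difference per gene and applies a fixed cut-off, whereas full SAM also uses permutations to estimate a false discovery rate. The function names, the gene names, the toy expression values, the constant `s0` and the threshold are all hypothetical and chosen for illustration.

```python
import math
from statistics import mean


def sam_scores(tumour, normal, s0=0.1):
    """Return a SAM-style relative difference d for each gene.

    tumour, normal: dicts mapping gene name -> list of expression values.
    s0: small positive constant added to the denominator so that genes
        with very low variance do not get inflated scores (value assumed).
    """
    scores = {}
    for gene in tumour:
        x, y = tumour[gene], normal[gene]
        n1, n2 = len(x), len(y)
        m1, m2 = mean(x), mean(y)
        # Pooled within-group sum of squared deviations.
        ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
        # Gene-specific standard error of the difference in group means.
        s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
        scores[gene] = (m1 - m2) / (s + s0)
    return scores


def select_genes(scores, threshold):
    """Keep genes whose absolute score exceeds the threshold."""
    return sorted(g for g, d in scores.items() if abs(d) > threshold)


# Toy data (hypothetical): one clearly shifted gene, one unchanged gene.
tumour = {"GENE_A": [8.0, 9.0, 10.0], "GENE_B": [5.0, 5.2, 4.8]}
normal = {"GENE_A": [2.0, 3.0, 4.0], "GENE_B": [5.1, 4.9, 5.0]}

scores = sam_scores(tumour, normal)
selected = select_genes(scores, threshold=2.0)
print(selected)
```

With these toy values only the shifted gene passes the cut-off. In practice the threshold would be tuned against an estimated false discovery rate rather than fixed by hand.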