Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells

Background: Accurate prediction of protein subcellular localization is critical for understanding protein function and guiding experimental research. Recent advances in deep learning have enabled high-throughput image-based methods to tackle this problem by leveraging large-scale immunofluorescence...

Bibliographic Details
Main Author: Msipa, Sibongiseni Letticia
Other Authors: Sinkala, Musalula
Format: Thesis
Language: English
Published: Department of Integrative Biomedical Sciences (IBMS) 2026
Subjects: Protein Subcellular Localization; Convolutional Neural Networks; Vision Transformers; Deep Learning; Multi-Label Classification
_version_ 1867613250013626368
access_status_str Open Access
author Msipa, Sibongiseni Letticia
author2 Sinkala, Musalula
author_browse Msipa, Sibongiseni Letticia
Sinkala, Musalula
author_facet Sinkala, Musalula
Msipa, Sibongiseni Letticia
author_sort Msipa, Sibongiseni Letticia
collection Thesis
description Background: Accurate prediction of protein subcellular localization is critical for understanding protein function and guiding experimental research. Recent advances in deep learning have enabled high-throughput image-based methods to tackle this problem by leveraging large-scale immunofluorescence microscopy datasets. The aim of this study is to comparatively evaluate convolutional neural network (CNN) architectures and transformer-based models for the multi-label classification of protein subcellular localization in eukaryotic cells, using large-scale immunofluorescence image datasets. Methods: In this study, we comparatively evaluated convolutional neural network (CNN) architectures (DenseNet121, Xception, and InceptionV3) and transformer-based models (Vision Transformer and Swin Transformer) for multi-label classification of protein localization in eukaryotic cells. Using 12,565 immunofluorescence images from the Human Protein Atlas, representing 15 subcellular compartments, we performed transfer learning by replacing the final layers of pretrained ImageNet models to accommodate multi-label output. All models were trained with iterative stratification to handle class imbalance and evaluated on held-out test images. Results and discussion: Our findings indicate that CNN-based models, particularly DenseNet121 and Xception, achieve the highest overall accuracy and F1-scores, successfully recognizing both abundant and underrepresented classes. In contrast, transformers demonstrated variable performance. While the Swin Transformer surpassed the Vision Transformer, neither consistently matched CNN performance, likely reflecting the data requirements and hyperparameter sensitivity of transformer architectures.
Visualization techniques (Grad-CAM in CNNs and attention maps in transformers) confirmed that well-performing models localize salient features to biologically relevant regions, suggesting they learn meaningful morphological cues. Conclusion: These results underscore CNNs' suitability for subcellular localization analysis with moderate-scale datasets, while transformers may require more extensive tuning or larger training sets to reach comparable accuracy. Our findings suggest that CNNs, especially DenseNet121 and Xception, exhibit superior performance over transformer models in predicting protein localization. CNN-based models demonstrate higher accuracy and interpretability, positioning them as preferred choices for advancing functional proteomics and computational drug discovery.
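The Methods in the description mention replacing the final layers of pretrained models "to accommodate multi-label output" and reporting F1-scores. The sketch below illustrates, in plain Python, what a multi-label output stage and its evaluation might look like. It is not the thesis code: the per-class sigmoid, the 0.5 threshold, the binary cross-entropy loss, and the macro averaging of F1 are all assumptions that this record does not confirm, and the function names are hypothetical.

```python
import math

NUM_CLASSES = 15  # from the description: 15 subcellular compartments


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Assumed decision rule: one independent sigmoid per class.

    Unlike softmax, several classes can be switched on for the same image,
    which is what a multi-label task requires.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over the classes of one image (assumed loss)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1.0 - p + eps)
    return total / len(logits)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme).

    A class with no true positives, false positives, or false negatives
    contributes 0.0 to the mean.
    """
    n_classes = len(y_true[0])
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes
```

Macro averaging gives every compartment equal weight regardless of how many images carry that label, which is one common way to keep underrepresented classes visible in the score. Whether the thesis used macro, micro, or weighted averaging is not stated in this record.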
format Thesis
id oai:open.uct.ac.za:11427/42545
institution University of Cape Town (South Africa)
language English
eng
last_indexed 2026-06-10T12:33:08.525Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2026
publishDateRange 2026
publishDateSort 2026
publisher Department of Integrative Biomedical Sciences (IBMS)
publisherStr Department of Integrative Biomedical Sciences (IBMS)
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/42545 Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells Msipa, Sibongiseni Letticia Sinkala, Musalula Protein Subcellular Localization Convolutional Neural Networks Vision Transformers Deep Learning Multi-Label Classification Background: Accurate prediction of protein subcellular localization is critical for understanding protein function and guiding experimental research. Recent advances in deep learning have enabled high-throughput image-based methods to tackle this problem by leveraging large-scale immunofluorescence microscopy datasets. The aim of this study is to comparatively evaluate convolutional neural network (CNN) architectures and transformer-based models for the multi-label classification of protein subcellular localization in eukaryotic cells, using large-scale immunofluorescence image datasets. Methods: In this study, we comparatively evaluated convolutional neural network (CNN) architectures (DenseNet121, Xception, and InceptionV3) and transformer-based models (Vision Transformer and Swin Transformer) for multi-label classification of protein localization in eukaryotic cells. Using 12,565 immunofluorescence images from the Human Protein Atlas, representing 15 subcellular compartments, we performed transfer learning by replacing the final layers of pretrained ImageNet models to accommodate multi-label output. All models were trained with iterative stratification to handle class imbalance and evaluated on held-out test images. Results and discussion: Our findings indicate that CNN-based models, particularly DenseNet121 and Xception, achieve the highest overall accuracy and F1-scores, successfully recognizing both abundant and underrepresented classes. In contrast, transformers demonstrated variable performance.
While the Swin Transformer surpassed the Vision Transformer, neither consistently matched CNN performance, likely reflecting the data requirements and hyperparameter sensitivity of transformer architectures. Visualization techniques (Grad-CAM in CNNs and attention maps in transformers) confirmed that well-performing models localize salient features to biologically relevant regions, suggesting they learn meaningful morphological cues. Conclusion: These results underscore CNNs' suitability for subcellular localization analysis with moderate-scale datasets, while transformers may require more extensive tuning or larger training sets to reach comparable accuracy. Our findings suggest that CNNs, especially DenseNet121 and Xception, exhibit superior performance over transformer models in predicting protein localization. CNN-based models demonstrate higher accuracy and interpretability, positioning them as preferred choices for advancing functional proteomics and computational drug discovery. 2026-01-13T07:17:55Z 2026-01-13T07:17:55Z 2025 2026-01-12T07:35:06Z Thesis / Dissertation Masters MSc http://hdl.handle.net/11427/42545 en eng application/pdf Department of Integrative Biomedical Sciences (IBMS) Faculty of Health Sciences University of Cape Town
spellingShingle Protein Subcellular Localization
Convolutional Neural Networks
Vision Transformers
Deep Learning
Multi-Label Classification
Msipa, Sibongiseni Letticia
Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
thesis_degree_str Master's
title Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_full Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_fullStr Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_full_unstemmed Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_short Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_sort evaluating convolutional neural networks and transformer architectures for image based prediction of protein localization in eukaryotic cells
topic Protein Subcellular Localization
Convolutional Neural Networks
Vision Transformers
Deep Learning
Multi-Label Classification
url http://hdl.handle.net/11427/42545
work_keys_str_mv AT msipasibongiseniletticia evaluatingconvolutionalneuralnetworksandtransformerarchitecturesforimagebasedpredictionofproteinlocalizationineukaryoticcells