Full Text Available

Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells

Bibliographic Details
Main Author:	Msipa, Sibongiseni Letticia
Other Authors:	Sinkala, Musalula
Format:	Thesis
Language:	English
Published:	Department of Integrative Biomedical Sciences (IBMS) 2026
Subjects:	Protein Subcellular Localization Convolutional Neural Networks Vision Transformers Deep Learning Multi-Label Classification

_version_	1867613250013626368
access_status_str	Open Access
author	Msipa, Sibongiseni Letticia
author2	Sinkala, Musalula
author_browse	Msipa, Sibongiseni Letticia Sinkala, Musalula
author_facet	Sinkala, Musalula Msipa, Sibongiseni Letticia
author_sort	Msipa, Sibongiseni Letticia
collection	Thesis
description	Background: Accurate prediction of protein subcellular localization is critical for understanding protein function and guiding experimental research. Recent advances in deep learning have enabled high-throughput image-based methods to tackle this problem by leveraging large-scale immunofluorescence microscopy datasets. The aim of this study is to comparatively evaluate convolutional neural network (CNN) architectures and transformer-based models for the multi-label classification of protein subcellular localization in eukaryotic cells, using large-scale immunofluorescence image datasets. Methods: We compared three CNN architectures (DenseNet121, Xception, and InceptionV3) with two transformer-based models (Vision Transformer and Swin Transformer). Using 12,565 immunofluorescence images from the Human Protein Atlas, representing 15 subcellular compartments, we performed transfer learning by replacing the final layers of pretrained ImageNet models to accommodate multi-label output. All models were trained with iterative stratification to handle class imbalance and evaluated on held-out test images. Results and discussion: Our findings indicate that CNN-based models, particularly DenseNet121 and Xception, achieve the highest overall accuracy and F1-scores, successfully recognizing both abundant and underrepresented classes. In contrast, transformers demonstrated variable performance. While the Swin Transformer surpassed the Vision Transformer, neither consistently matched CNN performance, likely reflecting the data requirements and hyperparameter sensitivity of transformer architectures.
Visualization techniques (Grad-CAM in CNNs and attention maps in transformers) confirmed that well-performing models localize salient features to biologically relevant regions, suggesting they learn meaningful morphological cues. Conclusion: These results underscore CNNs' suitability for subcellular localization analysis with moderate-scale datasets, while transformers may require more extensive tuning or larger training sets to reach comparable accuracy. Our findings suggest that CNNs, especially DenseNet121 and Xception, exhibit superior performance over transformer models in predicting protein localization. CNN-based models demonstrate higher accuracy and interpretability, positioning them as preferred choices for advancing functional proteomics and computational drug discovery.
format	Thesis
id	oai:open.uct.ac.za:11427/42545
institution	University of Cape Town (South Africa)
language	English eng
last_indexed	2026-06-10T12:33:08.525Z
license_str	Not specified — see source repository
provenance_str_mv	Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate	2026
publishDateRange	2026
publishDateSort	2026
publisher	Department of Integrative Biomedical Sciences (IBMS)
publisherStr	Department of Integrative Biomedical Sciences (IBMS)
record_format	dspace
source_str	UCTD — University of Cape Town Open Access Repository
spelling	oai:open.uct.ac.za:11427/42545 Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells Msipa, Sibongiseni Letticia Sinkala, Musalula Protein Subcellular Localization Convolutional Neural Networks Vision Transformers Deep Learning Multi-Label Classification Background: Accurate prediction of protein subcellular localization is critical for understanding protein function and guiding experimental research. Recent advances in deep learning have enabled high-throughput image-based methods to tackle this problem by leveraging large-scale immunofluorescence microscopy datasets. The aim of this study is to comparatively evaluate convolutional neural network (CNN) architectures and transformer-based models for the multi-label classification of protein subcellular localization in eukaryotic cells, using large-scale immunofluorescence image datasets. Methods: We compared three CNN architectures (DenseNet121, Xception, and InceptionV3) with two transformer-based models (Vision Transformer and Swin Transformer). Using 12,565 immunofluorescence images from the Human Protein Atlas, representing 15 subcellular compartments, we performed transfer learning by replacing the final layers of pretrained ImageNet models to accommodate multi-label output. All models were trained with iterative stratification to handle class imbalance and evaluated on held-out test images. Results and discussion: Our findings indicate that CNN-based models, particularly DenseNet121 and Xception, achieve the highest overall accuracy and F1-scores, successfully recognizing both abundant and underrepresented classes. In contrast, transformers demonstrated variable performance.
While the Swin Transformer surpassed the Vision Transformer, neither consistently matched CNN performance, likely reflecting the data requirements and hyperparameter sensitivity of transformer architectures. Visualization techniques (Grad-CAM in CNNs and attention maps in transformers) confirmed that well-performing models localize salient features to biologically relevant regions, suggesting they learn meaningful morphological cues. Conclusion: These results underscore CNNs' suitability for subcellular localization analysis with moderate-scale datasets, while transformers may require more extensive tuning or larger training sets to reach comparable accuracy. Our findings suggest that CNNs, especially DenseNet121 and Xception, exhibit superior performance over transformer models in predicting protein localization. CNN-based models demonstrate higher accuracy and interpretability, positioning them as preferred choices for advancing functional proteomics and computational drug discovery. 2026-01-13T07:17:55Z 2026-01-13T07:17:55Z 2025 2026-01-12T07:35:06Z Thesis / Dissertation Masters MSc http://hdl.handle.net/11427/42545 en eng application/pdf Department of Integrative Biomedical Sciences (IBMS) Faculty of Health Sciences University of Cape Town
spellingShingle	Protein Subcellular Localization Convolutional Neural Networks Vision Transformers Deep Learning Multi-Label Classification Msipa, Sibongiseni Letticia Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
thesis_degree_str	Master's
title	Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_full	Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_fullStr	Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_full_unstemmed	Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_short	Evaluating convolutional neural networks and transformer architectures for image-based prediction of protein localization in eukaryotic cells
title_sort	evaluating convolutional neural networks and transformer architectures for image based prediction of protein localization in eukaryotic cells
topic	Protein Subcellular Localization Convolutional Neural Networks Vision Transformers Deep Learning Multi-Label Classification
url	http://hdl.handle.net/11427/42545
work_keys_str_mv	AT msipasibongiseniletticia evaluatingconvolutionalneuralnetworksandtransformerarchitecturesforimagebasedpredictionofproteinlocalizationineukaryoticcells
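
Illustrative note on the multi-label output and F1 evaluation mentioned in the description field above. This record does not include the thesis code, so the following is only a minimal sketch under stated assumptions: the function names, the 0.5 decision threshold, and the choice of macro averaging are all hypothetical, not taken from the thesis. It shows why a multi-label head uses one independent sigmoid per compartment (15 in the study) instead of a softmax, and how a per-label F1 can be averaged so that underrepresented classes weigh as much as abundant ones.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function: logit -> probability in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Turn one image's logits into a 0/1 vector, one decision per label.

    Each label gets its own sigmoid, so the labels do not compete the way
    softmax classes do: an image may receive zero, one, or several labels.
    The 0.5 threshold is an assumption made for this sketch.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 over a batch of 0/1 label vectors."""
    num_labels = len(y_true[0])
    scores = []
    for j in range(num_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no positives and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_labels
```

For example, `predict_labels([2.0, -1.0, 0.0])` assigns the first and third labels, because a logit of exactly 0 maps to a probability of 0.5 and the comparison is inclusive. In a real pipeline the logits would come from the replaced final layer of the pretrained network; the thesis may have used a different threshold or averaging scheme.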

Similar Items