A Robust Framework for Graph Construction in Vision Graph Neural Networks

In Computer Vision, the method of representing an image has a profound effect on the performance of a model. Traditionally, an image is treated as a grid of pixels and processed via Convolutional Neural Networks (CNNs). An image can also be treated as a sequence of patches. Vision Tra...

Bibliographic Details
Main Author:	Elsharkawi, Ismael
Format:	Thesis
Published:	AUC Knowledge Fountain 2025
Subjects:	GNNs; Graph Neural Networks; ViG; Vision Graph Neural Networks; Graph Representation Learning; Computer Vision; Artificial Intelligence and Robotics

_version_	1867613431570366464
access_status_str	Open Access
author	Elsharkawi, Ismael
author_browse	Elsharkawi, Ismael
author_facet	Elsharkawi, Ismael
author_sort	Elsharkawi, Ismael
collection	Thesis
description	In Computer Vision, the method of representing an image has a profound effect on the performance of a model. Traditionally, an image is treated as a grid of pixels and processed via Convolutional Neural Networks (CNNs). An image can also be treated as a sequence of patches. Vision Transformers and MLP-Mixers (Multi-Layer Perceptron Mixers) are two types of models that process an image as a sequence. Graphs are a more generic representation than grids and sequences. That is why Vision Graph Neural Networks (ViG) construct a graph for an image and process the image as a graph of patches. However, graph construction in ViG is based on k-Nearest Neighbors (k-NN). Using k-NN to construct a graph can miss important edges while enforcing other, less important edges in order to satisfy the "k" constraint on each node's neighborhood. To overcome this challenge, we present two graph construction methodologies. The first is called Similarity Thresholded Graph Construction (STGC), while the other is called Learnable Reparameterized Graph Construction (LRGC). In STGC, an edge is picked if it has a normalized similarity score higher than a pre-defined threshold. In addition, to fight oversmoothing, we present a decreasing threshold framework. Using STGC, we show experimentally that our model outperforms the state-of-the-art graph-based models on ImageNet image classification without introducing computational overhead. In LRGC, which does not need any hyper-parameter tuning, similarity scores are replaced by learnable attention scores and the threshold for each layer becomes learnable. We show that LRGC achieves performance similar to that of the best hyper-parameter combination of STGC on Imagenette without the need for tuning hyper-parameters.
format	Thesis
id	oai:fount.aucegypt.edu:etds-3477
institution	American University in Cairo (Egypt)
last_indexed	2026-06-10T12:35:59.828Z
license_str	Not specified — see source repository
provenance_str_mv	Harvested via OAI-PMH from AUC Knowledge Fountain — bepress
publishDate	2025
publishDateRange	2025
publishDateSort	2025
publisher	AUC Knowledge Fountain
publisherStr	AUC Knowledge Fountain
record_format	dspace
source_str	AUC Knowledge Fountain — bepress
title	A Robust Framework for Graph Construction in Vision Graph Neural Networks
title_full	A Robust Framework for Graph Construction in Vision Graph Neural Networks
title_fullStr	A Robust Framework for Graph Construction in Vision Graph Neural Networks
title_full_unstemmed	A Robust Framework for Graph Construction in Vision Graph Neural Networks
title_short	A Robust Framework for Graph Construction in Vision Graph Neural Networks
title_sort	robust framework for graph construction in vision graph neural networks
topic	GNNs; Graph Neural Networks; ViG; Vision Graph Neural Networks; Graph Representation Learning; Computer Vision; Artificial Intelligence and Robotics
url	https://fount.aucegypt.edu/etds/2431 https://fount.aucegypt.edu/context/etds/article/3477/viewcontent/Ismael_Elsharkawi___Masters_Thesis___V4.pdf
work_keys_str_mv	AT elsharkawiismael arobustframeworkforgraphconstructioninvisiongraphneuralnetworks AT elsharkawiismael robustframeworkforgraphconstructioninvisiongraphneuralnetworks
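
Illustration (not part of the record): the description contrasts k-NN graph construction with thresholded construction (STGC). The sketch below is only a toy comparison of the two ideas, not the thesis implementation. It assumes cosine similarity as the "normalized similarity score" and a linear schedule for the "decreasing threshold framework"; the record specifies neither. All function names and the sample feature vectors are hypothetical.

```python
import math


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between patch feature vectors (assumed normalization)."""
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    unit = [[x / n for x in f] for f, n in zip(features, norms)]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def knn_edges(sim, k):
    """Baseline: every node keeps exactly its k most similar other nodes."""
    edges = set()
    for i, row in enumerate(sim):
        others = [j for j in range(len(row)) if j != i]
        others.sort(key=lambda j: row[j], reverse=True)
        edges.update((i, j) for j in others[:k])
    return edges


def threshold_edges(sim, threshold):
    """Thresholded construction: keep an edge only if its score exceeds the threshold."""
    return {
        (i, j)
        for i, row in enumerate(sim)
        for j, score in enumerate(row)
        if i != j and score > threshold
    }


def decreasing_thresholds(start, end, num_layers):
    """One threshold per layer, decreasing linearly (the schedule shape is an assumption)."""
    if num_layers == 1:
        return [start]
    step = (start - end) / (num_layers - 1)
    return [start - step * layer for layer in range(num_layers)]


# Toy data: patches 0-2 point in nearly the same direction, patch 3 is an outlier.
features = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
sim = cosine_similarity_matrix(features)
knn = knn_edges(sim, k=1)
stgc = threshold_edges(sim, threshold=0.9)
schedule = decreasing_thresholds(0.9, 0.5, 3)
```

With k = 1, the k-NN graph drops the strong edge between patches 0 and 2 and still forces the outlier patch 3 to connect to patch 2. The thresholded graph keeps every edge within the similar group and leaves patch 3 without neighbors, so node degree varies with the data instead of being fixed at k.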

Similar Items