Similar Items: MAGNUS: Multi-Attention Guided Network for Unified Segmentation via CNN-ViT Fusion
- AT‐ViT: Area‐Targeted Multi‐View Vision Transformer With Cross‐Attention and Multi‐Scale Patching for Plant Trait Recognition in Herbarium Images
- ViT-ResNet Fusion: An Explainable Hybrid Framework for High-Accuracy Multiclass Lung Disease Classification in Chest X-Rays
- Mobile3ViT: An Improved Hybrid CNN‐Visual Transformer Model for Automatic Gastrointestinal Image Recognition
- H-ViT: hardware-friendly post-training quantization for efficient vision transformer inference
- Let ViT Speak: Generative Language-Image Pre-training
- Dense-Attention CNN with Spatial-Attention Fusion for Robust Facial Expression Recognition