Similar Items: A comparative analysis of video vision transformers on word-level sign language datasets
- Hybrid lightweight vision transformers with attention mechanism for feature extraction and classification of product designs
- SkelFormer: An adaptive hierarchical transformer-based approach on skeleton graphs for human action recognition in video sequences
- Using transformer-based models for Vietnamese language detection
- Exploring phonological complexity in statistical learning of artificial words
- Distribution of visuo-attentional resources while reading multiple words
- Deep learning–driven image captioning: Progress through transformers and large language models