1. A Benchmark Study of Hybrid CNN-Transformer Architectures in Vision-Language Tasks. Emerging Science Research. 2025;3(01):36-49. Accessed July 1, 2025. http://emergingpub.com/index.php/sr/article/view/78