RO-YOLO for Optimized Small Object Detection in Remote Sensing
14 Pages Posted: 23 May 2025
Abstract
Remote sensing target detection has significant applications in fields such as traffic monitoring and urban planning. However, small targets in UAV aerial imagery are difficult to detect because they account for a high proportion of objects, are densely distributed, and vary significantly in scale. To address these issues, this paper proposes RO-YOLO, an efficient detector network for small objects in remote sensing imagery. First, the RFAConv module is introduced to replace traditional convolution, and the Fast Cross Stage Feature (FCSF) module is designed to improve detection accuracy while keeping the model's parameter count under control. Second, to enhance the network's capability for fine-grained feature extraction of small objects, a plug-and-play Backbone-to-Neck Feature Bridge (BNFB) module is developed. This module uses a multi-branch structure to capture different receptive fields, achieving multi-scale feature enhancement. Third, to fuse the detailed information in shallow feature maps with the semantic information in deep feature maps, the Refinement and Alternation Feature Network (RAFN) structure is proposed. This design significantly reduces the model's parameter count while improving detection performance. Finally, the ACmix attention mechanism is integrated to further strengthen the model's focus on targets. Experimental results on the challenging VisDrone, TinyPerson, and DOTA benchmark datasets demonstrate that RO-YOLO outperforms the original YOLOv8s, achieving mAP@50 improvements of 9.4%, 5.4%, and 2.3%, respectively, with a 71.1% reduction in parameter count. These results validate the effectiveness and generalization ability of RO-YOLO in small object detection tasks.
Keywords: Small object detection, Feature enhancement, Multiscale feature extraction, Unmanned aerial vehicle (UAV) image, Attention mechanism
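
The mAP@50 figures reported in the abstract rest on an intersection-over-union (IoU) threshold of 0.5 for matching a predicted box to a ground-truth box. The following is a minimal Python sketch of that matching criterion only, not the paper's evaluation code. The function names and the (x1, y1, x2, y2) box format are assumptions chosen for illustration. It suggests why small objects are hard to score well on: a localization error of a few pixels can push a small box below the threshold while leaving a large box almost unaffected.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) tuples; this format is an assumption
    made for the illustration.
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """Hypothetical helper: a prediction matches if IoU >= threshold (0.5 for mAP@50)."""
    return iou(pred, gt) >= threshold


if __name__ == "__main__":
    small_gt = (0, 0, 10, 10)      # 10 x 10 pixel object
    large_gt = (0, 0, 100, 100)    # 100 x 100 pixel object

    # The same 4-pixel horizontal shift applied to both predictions.
    small_pred = (4, 0, 14, 10)
    large_pred = (4, 0, 104, 100)

    print(is_true_positive(small_pred, small_gt))  # small box falls below 0.5
    print(is_true_positive(large_pred, large_gt))  # large box still matches
```

A full mAP@50 computation additionally ranks predictions by confidence, builds a precision-recall curve per class, and averages the resulting AP values across classes; this sketch covers only the per-box matching step.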