RE-FusionNet: A Resource-Efficient Multi-Task Network for Joint Traffic Element Detection and State Recognition
DOI:
https://doi.org/10.14741/ijcet/v.16.3.3
Keywords:
Attention-Based Perception, Autonomous Driving, Traffic Light Detection, Traffic Sign Detection, Multi-Task Learning
Abstract
Autonomous driving depends on reliable perception of traffic elements, including traffic lights and traffic signs, to ensure safe and efficient decision-making. However, recent computer vision-based approaches either treat object detection and semantic state recognition as separate tasks or rely on expensive, hardware-dependent sensors and continuous video input, which raises processing cost and limits deployment in resource-constrained environments. To address this problem, we propose RE-FusionNet, a lightweight, attention-enhanced convolutional architecture that jointly performs traffic light detection, traffic sign detection, and semantic state recognition using only RGB image data. RE-FusionNet combines a dual-path fusion encoder (DPFE), a differential attention module (DAM), and a compact convolutional backbone to enhance contextual feature representation and to focus on relevant traffic-related objects. Rather than ingesting a continuous video stream, the network processes a small number of sampled frames sequentially, which allows it to capture contextual information efficiently. All three tasks are handled simultaneously by a single multi-task prediction head that predicts both object location and state, allowing for joint training. RE-FusionNet achieves a mean average precision (mAP) of 97.3% at 0.5 IoU and a state recognition accuracy of 90.0% on the BDD100K dataset, and an mAP of 94.2% at 0.5 IoU on the LISA dataset. Owing to its low computational requirements, it runs at frame rates above 150 fps. Overall, the results show that RE-FusionNet is an efficient and deployable method for real-time traffic perception in autonomous vehicles and intelligent transportation systems.
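The mAP figures above are reported at an IoU threshold of 0.5. As a minimal illustration of what that criterion means, the sketch below computes intersection over union for two axis-aligned boxes and applies the 0.5 cutoff. The function names, the `(x1, y1, x2, y2)` box format, and the label-matching rule are assumptions made for this example; they are not taken from the paper's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes contribute no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, threshold=0.5):
    """Hypothetical matching rule: same class label and IoU at or above the threshold."""
    return pred_label == gt_label and iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    predicted = (0, 0, 10, 10)
    ground_truth = (0, 0, 10, 8)
    print(iou(predicted, ground_truth))
    print(is_true_positive(predicted, "red_light", ground_truth, "red_light"))
```

Under this criterion, a predicted box counts as a correct detection only when it overlaps a ground-truth box of the same class by at least half of their combined area; mAP then averages the resulting per-class average precision values.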
