TY - GEN
T1 - Exploring Cross-fusion and Curriculum Learning for Multi-modal Human Detection on Drones
AU - Safa, Ali
AU - Ocket, Ilja
AU - Catthoor, Francky
AU - Gielen, Georges
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/1/17
Y1 - 2022/1/17
N2 - In a number of applications ranging from warehouse management to people search and rescue, drones will need to operate in the vicinity of human agents. In those situations, robust and fail-safe human detection by drones must be provided. However, human detection systems used on drones are currently based on single imaging cameras, although a growing number of works investigate more robust detection schemes via sensor fusion. In the drone context, the fusion of standard RGB and event-based cameras has emerged, while in the automotive context, the fusion of RGB with radar has been proposed for utmost robustness to environmental conditions. In this paper, we initiate the investigation of RGB, event-based camera and radar fusion. First, we acquire a novel dataset for the task of people detection in an indoor, industrial setting by mounting the sensor fusion suite on a drone. Then, we propose a baseline convolutional neural network (CNN) architecture augmented with cross-fusion highways for sensor fusion and people detection. To train the network, we propose a novel multimodal curriculum learning procedure and demonstrate that our method (termed SAUL) greatly enhances the robustness of the system towards hard RGB failures and provides a significant gain in detection performance (both measured on the peak F1 score) compared to the BlackIn procedure previously proposed for cross-fusion network training. Finally, we report the performance of our system through precision-recall curve analysis and perform additional ablation studies to shed light on the key aspects of our system.
AB - In a number of applications ranging from warehouse management to people search and rescue, drones will need to operate in the vicinity of human agents. In those situations, robust and fail-safe human detection by drones must be provided. However, human detection systems used on drones are currently based on single imaging cameras, although a growing number of works investigate more robust detection schemes via sensor fusion. In the drone context, the fusion of standard RGB and event-based cameras has emerged, while in the automotive context, the fusion of RGB with radar has been proposed for utmost robustness to environmental conditions. In this paper, we initiate the investigation of RGB, event-based camera and radar fusion. First, we acquire a novel dataset for the task of people detection in an indoor, industrial setting by mounting the sensor fusion suite on a drone. Then, we propose a baseline convolutional neural network (CNN) architecture augmented with cross-fusion highways for sensor fusion and people detection. To train the network, we propose a novel multimodal curriculum learning procedure and demonstrate that our method (termed SAUL) greatly enhances the robustness of the system towards hard RGB failures and provides a significant gain in detection performance (both measured on the peak F1 score) compared to the BlackIn procedure previously proposed for cross-fusion network training. Finally, we report the performance of our system through precision-recall curve analysis and perform additional ablation studies to shed light on the key aspects of our system.
KW - curriculum learning
KW - deep learning
KW - drones
KW - people detection
KW - sensor fusion
UR - https://www.scopus.com/pages/publications/85133389310
U2 - 10.1145/3522784.3522785
DO - 10.1145/3522784.3522785
M3 - Conference contribution
AN - SCOPUS:85133389310
T3 - ACM International Conference Proceeding Series
SP - 1
EP - 7
BT - Proceedings of System Engineering for Constrained Embedded Systems (DroneSE)
PB - Association for Computing Machinery
T2 - 2022 Workshop on System Engineering for Constrained Embedded Systems - Drone Systems Engineering and Rapid Simulation and Performance Evaluation: Methods and Tools, DroneSE and RAPIDO 2022 - Presented at HiPEAC 2022 Conference
Y2 - 20 June 2022 through 22 June 2022
ER -