TY - JOUR
T1 - Crowd behavior detection: leveraging video swin transformer for crowd size and violence level analysis
AU - Qaraqe, Marwa
AU - Yang, Yin David
AU - Varghese, Elizabeth B.
AU - Basaran, Emrah
AU - Elzein, Almiqdad
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/8/26
Y1 - 2024/8/26
AB - In recent years, crowd behavior detection has posed significant challenges in the realm of public safety and security, even with the advancements in surveillance technologies. The ability to perform real-time surveillance and accurately identify crowd behavior by considering factors such as crowd size and violence levels can avert potential crowd-related disasters and hazards to a considerable extent. However, most existing approaches are not viable to deal with the complexities of crowd dynamics and fail to distinguish different violence levels within crowds. Moreover, the prevailing approach to crowd behavior recognition, which solely relies on the analysis of closed-circuit television (CCTV) footage and overlooks the integration of online social media video content, leads to a primarily reactive methodology. This paper proposes a crowd behavior detection framework based on the swin transformer architecture, which leverages crowd counting maps and optical flow maps to detect crowd behavior across various sizes and violence levels. To support this framework, we created a dataset comprising videos capable of recognizing crowd behaviors based on size and violence levels sourced from CCTV camera footage and online videos. Experimental analysis conducted on benchmark datasets and our proposed dataset substantiates the superiority of our proposed approach over existing state-of-the-art methods, showcasing its ability to effectively distinguish crowd behaviors concerning size and violence level. Our method’s validation through Nvidia’s DeepStream Software Development Kit (SDK) highlights its competitive performance and potential for real-time intelligent surveillance applications.
KW - Crowd behavior detection
KW - Crowd size
KW - DeepStream
KW - Swin transformer
KW - Violence level
UR - https://www.scopus.com/pages/publications/85202074781
U2 - 10.1007/s10489-024-05775-6
DO - 10.1007/s10489-024-05775-6
M3 - Article
AN - SCOPUS:85202074781
SN - 0924-669X
VL - 54
SP - 10709
EP - 10730
JO - Applied Intelligence
JF - Applied Intelligence
IS - 21
ER -