Video-based physical violence detection model for efficient public space surveillance
International Journal of Informatics and Communication Technology
Abstract
This study develops a real-time model for detecting physical violence in public spaces, with the goal of balancing accuracy against computational cost. We evaluate several model architectures; the main comparison is between ConvLSTM2D and Conv3D, two models commonly used in video analysis to capture spatial and temporal features. The ConvLSTM2D model, combined with preprocessing layers such as change detection and motion blur, performed best, reaching 86% accuracy after Bayesian optimization. With only 25,137 parameters, it runs inference in 0.010 seconds, which makes it suitable for real-time applications that require efficient computation. In contrast, the Conv3D model, combined with the same kind of preprocessing layers (change detection and motion blur) and containing more than nine million parameters, reached a lower accuracy of 77.5% and a slower inference time of 0.025 seconds, making it less suitable for real-time use. These results indicate that the ConvLSTM2D model is a promising basis for real-time violence detection systems in public spaces, where a fast and accurate response is essential to prevent further acts of violence.
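The abstract names change detection and motion blur as preprocessing layers but does not specify how they are implemented. The following is a minimal sketch under stated assumptions, not the authors' code: change detection is taken to be the absolute difference between consecutive frames, motion blur is taken to be a temporal average over a short window, and the two are applied in that order. The function names, the `window` parameter, and the use of plain nested lists for grayscale frames are all hypothetical choices made for illustration.

```python
def change_detection(frames):
    """Assumed form: absolute per-pixel difference between consecutive frames.

    `frames` is a list of 2D grids (lists of rows of floats).
    Returns len(frames) - 1 difference frames.
    """
    return [
        [[abs(c - p) for c, p in zip(cur_row, prev_row)]
         for cur_row, prev_row in zip(cur, prev)]
        for prev, cur in zip(frames, frames[1:])
    ]


def motion_blur(frames, window=2):
    """Assumed form: average each frame with up to `window - 1` preceding frames."""
    blurred = []
    for t in range(len(frames)):
        group = frames[max(0, t - window + 1): t + 1]
        height, width = len(group[0]), len(group[0][0])
        blurred.append([
            [sum(f[y][x] for f in group) / len(group) for x in range(width)]
            for y in range(height)
        ])
    return blurred


def preprocess(frames, window=2):
    """Hypothetical pipeline: change detection first, then temporal blur."""
    return motion_blur(change_detection(frames), window)


# Tiny clip: three 1x2 grayscale frames in which pixels switch on one at a time.
clip = [
    [[0.0, 0.0]],
    [[1.0, 0.0]],
    [[1.0, 1.0]],
]
features = preprocess(clip)
```

In a full pipeline, the output of a step like this would be stacked into a clip tensor and passed to the recurrent convolutional model; static background pixels contribute zeros after differencing, which is one plausible reason such layers help a small model focus on motion.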