Denoising Diffusion-Augmented Hybrid Video Anomaly Detection via Reconstructing Noised Frames

Kai Cheng; Yaning Pan; Yang Liu; Xinhua Zeng; Rui Feng

doi:10.24963/ijcai.2024/77

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence

Main Track. Pages 695-703. https://doi.org/10.24963/ijcai.2024/77

Video Anomaly Detection (VAD) is crucial for enhancing security and surveillance systems through automatic identification of irregular events, thereby enabling timely responses and augmenting overall situational awareness. Although existing methods have achieved decent detection performance on benchmarks, their predicted objects remain semantically ambiguous. To overcome this limitation, we propose the Denoising diffusion-augmented Hybrid Video Anomaly Detection (DHVAD) framework. As a crucial component of DHVAD, the proposed Denoising diffusion-based Reconstruction Unit (DRU) enhances the understanding of semantically accurate normality. Meanwhile, we propose a detection strategy that integrates the advantages of a prediction-based Frame Prediction Unit (FPU) with those of the DRU by exploring spatial-temporal consistency. The competitive performance of DHVAD against state-of-the-art methods on three benchmark datasets demonstrates the effectiveness of our framework. Extended experimental analysis shows that our framework gains a better understanding of normality in terms of semantic accuracy for VAD and efficiently leverages the strengths of both components.
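To make the idea of reconstructing noised frames and fusing it with frame prediction more concrete, here is a minimal sketch. It is not the DHVAD implementation. The forward-noising formula is the standard DDPM one. Everything else is an assumption made for illustration: the linear beta schedule, the flattened-list frame representation, the min-max normalization, the equal-weight fusion, and all function names (`linear_alpha_bar`, `noise_frame`, `hybrid_scores`). The learned denoiser and predictor are omitted, so the per-frame errors are passed in directly.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def noise_frame(frame, t, alpha_bar, rng):
    """Standard DDPM forward step: x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps."""
    a = alpha_bar[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in frame]


def mse(a, b):
    """Mean squared error between two flattened frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def min_max(values):
    """Rescale per-frame errors to [0, 1] over a video (hypothetical choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(recon_errors, pred_errors, weight=0.5):
    """Weighted sum of normalized reconstruction and prediction errors.

    The fusion rule and the weight are assumptions, not the paper's strategy.
    """
    r, p = min_max(recon_errors), min_max(pred_errors)
    return [weight * a + (1.0 - weight) * b for a, b in zip(r, p)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bar = linear_alpha_bar(1000)
    frame = [0.5] * 16                       # toy flattened frame
    noised = noise_frame(frame, 200, alpha_bar, rng)
    print("noising MSE:", mse(frame, noised))
    # Toy per-frame errors; a real system would obtain these from a trained
    # denoiser (reconstruction) and a trained predictor (prediction).
    print(hybrid_scores([0.1, 0.2, 0.9], [0.2, 0.1, 1.0]))
```

In this sketch, a frame whose reconstruction error and prediction error are both high relative to the rest of the video receives the highest fused score; the actual scoring used by DHVAD may differ.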

Keywords:

Computer Vision: CV: Video analysis and understanding

Computer Vision: CV: Representation learning

Computer Vision: CV: Scene analysis and understanding