Video Frame Interpolation with Densely Queried Bilateral Correlation

Chang Zhou, Jie Liu, Jie Tang, Gangshan Wu

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 1786-1794. https://doi.org/10.24963/ijcai.2023/198

Video Frame Interpolation (VFI) aims to synthesize non-existent intermediate frames between existing frames. Flow-based VFI algorithms estimate intermediate motion fields and use them to warp the existing frames. The complexity of real-world motion and the absence of a reference frame make motion estimation challenging. Many state-of-the-art approaches explicitly model the correlations between two neighboring frames for more accurate motion estimation. In common approaches, the receptive field of correlation modeling at higher resolution depends on the motion fields estimated beforehand. This receptive field dependency makes common motion estimation approaches poor at handling small and fast-moving objects. To better model correlations and produce more accurate motion fields, we propose the Densely Queried Bilateral Correlation (DQBC), which removes the receptive field dependency and is therefore better suited to small and fast-moving objects. The motion fields generated with the help of DQBC are further refined and up-sampled with context features. Once the motion fields are fixed, a CNN-based SynthNet synthesizes the final interpolated frame. Experiments show that our approach achieves higher accuracy and shorter inference time than state-of-the-art methods. Source code is available at https://github.com/kinoud/DQBC.
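
The flow-based warping step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the DQBC source code: the function names `backward_warp` and `naive_interpolate`, the border clamping, and the linear blend are all assumptions made for illustration. In particular, the blend is only a stand-in for the paper's CNN-based SynthNet, and the motion fields are taken as given rather than estimated.

```python
import numpy as np


def backward_warp(frame, flow):
    """Sample `frame` at (x + flow_x, y + flow_y) using bilinear interpolation.

    frame: (H, W) or (H, W, C) float array.
    flow:  (H, W, 2) array of pixel displacements; flow[..., 0] is horizontal,
           flow[..., 1] is vertical.
    Sample positions outside the image are clamped to the border (assumption).
    """
    h, w = frame.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Absolute sampling coordinates, clamped to the valid pixel range.
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets act as the bilinear weights.
    wx = x - x0
    wy = y - y0
    if frame.ndim == 3:
        wx = wx[..., None]
        wy = wy[..., None]

    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom


def naive_interpolate(frame0, frame1, flow_t0, flow_t1, t=0.5):
    """Blend the two warped frames linearly (hypothetical stand-in for SynthNet).

    flow_t0 / flow_t1 are the intermediate motion fields pointing from the
    target time t to frame0 and frame1 respectively.
    """
    warped0 = backward_warp(frame0, flow_t0)
    warped1 = backward_warp(frame1, flow_t1)
    return (1 - t) * warped0 + t * warped1
```

With zero motion fields, `naive_interpolate` reduces to a plain cross-fade of the two input frames; the quality of a real flow-based method depends on how accurately the intermediate motion fields are estimated, which is the part DQBC addresses.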
Keywords:
Computer Vision: CV: Computational photography
Computer Vision: CV: Motion and tracking
Computer Vision: CV: Video analysis and understanding