[1]王雅妮,翟正军,代巍,等.改进FlowNetS的光流估计算法研究[J].计算机技术与发展,2025,(07):1-7.[doi:10.20165/j.cnki.ISSN1673-629X.2025.0041]
 WANG Ya-ni, ZHAI Zheng-jun, DAI Wei, et al. Advanced Research on Improving Optical Flow Estimation Based on FlowNetS[J]. Computer Technology and Development, 2025, (07): 1-7. [doi:10.20165/j.cnki.ISSN1673-629X.2025.0041]

改进FlowNetS的光流估计算法研究

《计算机技术与发展》[ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, No. 07
Pages:
1-7
Section:
Media Computing
Publication date:
2025-07-10

Article Info

Title:
Advanced Research on Improving Optical Flow Estimation Based on FlowNetS
Article ID:
1673-629X(2025)07-0001-07
作者:
王雅妮 1,翟正军 1,2,代巍 1,申思远 1
1. 西北工业大学 计算机学院,陕西 西安 710129;
2. 西北工业大学 计算机测控与仿真技术研究所,陕西 西安 710072
Author(s):
WANG Ya-ni 1, ZHAI Zheng-jun 1,2, DAI Wei 1, SHEN Si-yuan 1
1. School of Computer Science,Northwestern Polytechnical University,Xi’an 710129,China;
2. Institute of Computer Measurement,Control and Simulation Technology,Northwestern Polytechnical University,Xi’an 710072,China
关键词:
光流估计;分层策略;多尺度估计;叠加融合;卷积神经网络
Keywords:
optical flow estimation; hierarchical strategy; multi-scale estimation; additive fusion; convolutional neural network
CLC number:
TP391.41
DOI:
10.20165/j.cnki.ISSN1673-629X.2025.0041
摘要:
近年来,基于深度学习的光流估计方法在准确性和效率方面取得了显著进展。然而,其仍面临着对训练数据依赖性强、对特定场景敏感、计算资源消耗大、物理约束利用不足以及可解释性差等问题。该文提出了一种基于卷积神经网络的叠加融合方法,旨在提升光流特征的表达能力,提高预测精度并降低内存消耗。具体而言,设计了一种叠加融合模块即连续加法块(Continue-ADD-Block)。连续加法块通过添加一个 Conv7 层并采用连续下采样进行融合,有效地整合了不同尺度下的信息,增强了对复杂场景和多尺度运动的处理能力。在 Flying Chairs、KITTI Flow 2015 和 MPI-Sintel(clean 和 final)数据集上的实验结果表明,连续加法块在降低资源消耗和内存占用的同时,在复杂场景下取得了更高的精度,表现为更低的平均端点误差(End Point Error,EPE),并且在 Flying Chairs 数据集中精度提升 2.86%,在 KITTI Flow 2015 数据集中精度提升 0.70%,在 MPI-Sintel(clean 和 final)数据集中精度分别提升 7.40%、2.43%。这表明该方法在复杂场景下具有更强的鲁棒性,为光流估计领域提供了一种新的解决方案。
Abstract:
Deep learning-based methods for optical flow estimation have made remarkable progress in recent years, achieving notable gains in both accuracy and efficiency. However, these methods still suffer from several shortcomings, including a strong dependency on extensive training data, sensitivity to specific environmental conditions, high computational costs, a lack of effective incorporation of physical constraints, and limited interpretability. To address these issues, we propose an additive fusion methodology based on convolutional neural networks (CNNs) that enhances the representational capacity of optical flow features, improves prediction accuracy, and reduces memory requirements. Specifically, we introduce a novel stacked fusion module, termed the Continue-ADD-Block. This module effectively consolidates multi-scale information by adding a Conv7 layer and applying consecutive downsampling for fusion, strengthening the model's capacity to cope with complex scenes characterized by multi-scale motion. Empirical evaluations on the Flying Chairs, KITTI Flow 2015, and MPI-Sintel (clean and final) datasets demonstrate that the Continue-ADD-Block achieves superior accuracy in complex scenarios, expressed as a lower average End-Point Error (EPE), while simultaneously reducing resource consumption and memory footprint. Specifically, the approach improves accuracy by 2.86% on Flying Chairs, 0.70% on KITTI Flow 2015, and 7.40% and 2.43% on the MPI-Sintel clean and final datasets, respectively. These findings highlight the proposed method's enhanced robustness under challenging conditions, offering a novel and effective approach to optical flow estimation.
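The abstract describes the Continue-ADD-Block as fusing multi-scale features through consecutive downsampling followed by addition. The paper's implementation is not reproduced here, so the following is only a minimal NumPy sketch of that fine-to-coarse additive-fusion idea: each finer feature map is downsampled and accumulated into the next-coarser scale. The function names, the use of average pooling as the downsampling operator, and the equal channel counts across scales are all assumptions for illustration; the actual module also involves a Conv7 convolutional layer, which is omitted here.

```python
import numpy as np

def downsample2x(x):
    """2x spatial downsampling by average pooling on a CHW array
    (a stand-in for the paper's downsampling step; assumed here)."""
    c, h, w = x.shape
    return x[:, :h - h % 2, :w - w % 2].reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def continue_add_fusion(features):
    """Additive fusion across scales, fine to coarse: the running result
    is consecutively downsampled and added into the next-coarser map,
    so fine-scale information accumulates at every coarser resolution.
    This is a rough reading of the Continue-ADD-Block, not its actual code."""
    fused = [features[0]]
    carry = features[0]
    for f in features[1:]:
        carry = downsample2x(carry) + f  # consecutive downsampling + addition
        fused.append(carry)
    return fused

# Toy 3-level feature pyramid (8 channels; 64x64, 32x32, 16x16).
# Real FlowNetS feature maps vary in channel count across scales.
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((8, 64 >> i, 64 >> i)) for i in range(3)]
out = continue_add_fusion(pyramid)
```

In a real network the plain addition would typically be preceded by learned convolutions (such as the Conv7 layer the abstract mentions) so that the fused maps remain trainable features rather than raw averages.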
更新日期/Last Update: 2025-07-10