FlowGS: End-to-End Correspondence-Guided 3D Gaussian Splatting from Sparse Unposed Images

FLOWGS NVS Output

Abstract

Although radiance fields represented by 3D Gaussians have achieved remarkable success in novel view synthesis from sparse input views, they still require initialization through Structure from Motion (SfM) and are essentially a sequential two-stage task. Recent methods address this problem in an SfM-independent manner by concentrating on optimizing camera poses, yet they still require known intrinsics as input.

In this study, we introduce FlowGS, an end-to-end correspondence-guided sparse 3D Gaussian Splatting (3DGS) method that enables realistic scene reconstruction from uncalibrated and unposed images. FlowGS jointly optimizes intrinsics, poses, depths, and Gaussian parameters by integrating optical flow with 3DGS. Furthermore, we design a Multi-Gaussian Guided Depth strategy to address the excessively smoothed depth at object edges produced by convolutional depth estimation networks. Additionally, we observe that sparse views with limited overlap cause optical flow mismatches, which in turn produce erroneously rendered 3D points at infinite distances. To tackle this issue, we propose a Sphere Projection method that maps these irrelevant points onto nearby spherical surfaces. Experimental results on the Tanks and Temples and Static Hikes datasets demonstrate that our method achieves competitive performance against state-of-the-art approaches.
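The Sphere Projection step can be pictured as re-mapping outlier points onto a bounded sphere around the camera so they no longer render at infinity. The sketch below is a minimal illustration of that idea under stated assumptions, not the paper's implementation: points are assumed to be expressed in the camera frame, and `max_radius` is a hypothetical hyperparameter bounding the usable scene extent.

```python
# Minimal sketch of the Sphere Projection idea (illustrative only, not the
# authors' code). Points lifted from mismatched optical flow can land at
# near-infinite depth; here, any point farther than `max_radius` from the
# camera is projected back onto a sphere of that radius, keeping its
# viewing direction.

import torch


def sphere_project(points_cam: torch.Tensor, max_radius: float = 50.0) -> torch.Tensor:
    """Map far-away 3D points onto a sphere of radius `max_radius`
    centred at the camera origin.

    points_cam: (N, 3) positions in the camera coordinate frame (assumption).
    Returns an (N, 3) tensor with outliers projected onto the sphere.
    """
    dist = points_cam.norm(dim=-1, keepdim=True)        # (N, 1) radial distance
    unit_dir = points_cam / dist.clamp(min=1e-8)        # per-point viewing direction
    outside = dist > max_radius                          # mask of points beyond the bound
    projected = unit_dir * max_radius                    # same direction, bounded radius
    return torch.where(outside, projected, points_cam)


if __name__ == "__main__":
    # Second point mimics a flow-mismatch outlier at extreme depth.
    pts = torch.tensor([[1.0, 2.0, 3.0], [0.0, 0.0, 1e6]])
    print(sphere_project(pts, max_radius=50.0))
```

In such a scheme, the projection would typically be applied to candidate Gaussian centres before they are added to the scene, so that mismatched correspondences contribute bounded, near-background points rather than degenerate ones.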

FlowGS: novel view synthesis outputs under 3, 6, and 12 input views on the Family scene of Tanks and Temples.

FLOWGS NVS Output

Qualitative Comparisons on Tanks and Temples.

FLOWGS NVS Output

Qualitative Comparisons on Static Hikes.