Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

Julian Chibane1,2,   Aayush Bansal3,   Verica Lazova1,2,   Gerard Pons-Moll1,2

1University of Tübingen, Germany
2Max Planck Institute for Informatics, Saarland Informatics Campus, Germany
3Carnegie Mellon University, USA

CVPR 2021 Virtual

Citation (Bibtex)

Overview Video

Abstract & Method

In this work, we introduce Stereo Radiance Fields (SRF), a neural view synthesis approach that is trained end-to-end, generalizes to new scenes in a single forward pass, and requires only sparse views at test time (b).

In contrast, pure data-driven synthesis (a) requires dense input images and time-intensive scene memorization for each new scene.

SRF intuition: Building on intuition from stereo reconstruction systems, SRF achieves this by composing information from image pairs. 3D points on an opaque, non-occluded surface project to similar-looking regions when viewed from different perspectives (blue). A point in free space does not (red).
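This photo-consistency cue can be sketched numerically: descriptors extracted at a surface point's projections agree across views, while projections of a free-space point land on unrelated image content. A minimal illustration with synthetic descriptors (the function name, descriptor dimension, and noise level are hypothetical, not taken from the paper's implementation):

```python
import numpy as np

def pairwise_similarity(features):
    """Mean cosine similarity over all unique view pairs.

    features: (n_views, d) array of per-view descriptors
    extracted at a 3D point's image projections.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T                       # (n_views, n_views) cosine matrix
    iu = np.triu_indices(len(f), k=1)   # indices of unique view pairs
    return sim[iu].mean()

rng = np.random.default_rng(0)
base = rng.normal(size=8)
# Surface point: all views see the same local appearance (small noise).
surface = np.stack([base + 0.05 * rng.normal(size=8) for _ in range(3)])
# Free-space point: each view sees unrelated background.
free = rng.normal(size=(3, 8))

print(pairwise_similarity(surface))  # high (near 1)
print(pairwise_similarity(free))     # noticeably lower
```

SRF replaces this fixed cosine score with learned similarity functions, so the notion of "matching" is itself trained end-to-end.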

Method: To predict the pixel colors of a novel view (grey camera), we shoot a camera ray into the scene, sample points along it, and predict a color and density per sample, which are fused into a single pixel color via volume rendering (cf. NeRF). For the color and density prediction, we (a) project each sample into all reference views, where we extract point-specific CNN features. (b) Next, we compare pairs of these features with learned similarity functions, emulating correspondence finding. (c) We compute aggregated stereo features with CNNs and pool them into a single encoding of correspondence, which is (d) decoded into color and density.
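The final fusion step is standard NeRF-style volume rendering: per-sample opacities and accumulated transmittance weight each sample's color along the ray. A minimal NumPy sketch of this compositing step (the toy densities, colors, and sample spacing are made up for illustration; the paper's network predicts them):

```python
import numpy as np

def volume_render(colors, densities, deltas):
    """NeRF-style alpha compositing along one camera ray.

    colors:    (n_samples, 3) RGB predicted per sample point
    densities: (n_samples,)   volume density per sample point
    deltas:    (n_samples,)   distance between adjacent samples
    """
    alpha = 1.0 - np.exp(-densities * deltas)   # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha]))[:-1]
    weights = trans * alpha                     # contribution per sample
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights

# Toy ray: an opaque red surface at the third of five samples.
colors = np.array([[0, 0, 0], [0, 0, 0], [1, 0, 0],
                   [0, 1, 0], [0, 0, 1]], dtype=float)
densities = np.array([0.0, 0.0, 50.0, 0.0, 0.0])
deltas = np.full(5, 0.1)

rgb, weights = volume_render(colors, densities, deltas)
print(rgb)  # dominated by the red surface sample
```

Samples behind the dense surface receive near-zero weight, so occluded colors do not leak into the rendered pixel.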


@inproceedings{chibane2021srf,
    title = {Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes},
    author = {Chibane, Julian and Bansal, Aayush and Lazova, Verica and Pons-Moll, Gerard},
    booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {jun},
    organization = {{IEEE}},
    year = {2021},
}