Novel View Synthesis on iPhone Images

The novel view synthesis task is to render images from novel viewpoints given an RGB capture of a scene. The images are captured with an iPhone camera (60 FPS, 1920x1440), and COLMAP-derived camera poses are provided for a subset of the training frames. Additional data such as depth maps, ARKit poses, and IMU measurements are also included. See the iphone/ section of the ScanNet++ Documentation for details.

This novel view synthesis track is more challenging than the DSLR track since it involves real-world consumer-device captures, which often exhibit motion blur, exposure variations, and a limited field of view. For evaluation, we use high-quality DSLR images as test frames, undistorted with the iPhone intrinsic parameters.

The evaluation and ground-truth image preparation scripts are provided in the ScanNet++ Toolbox.


Evaluation and Metrics

We evaluate the similarity between the ground-truth (GT) and generated RGB images using three metrics: PSNR, SSIM, and LPIPS.

For each pair of generated and ground-truth images, we compute these three metrics; the numbers reported in the table are averages over all images across all scenes.
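As a minimal sketch of how the averaging works, the PSNR part can be written as below. This is not the ScanNet++ Toolbox implementation; the function names and the [0, 1] value range are assumptions for illustration (SSIM and LPIPS would typically come from a library such as scikit-image or the lpips package).

```python
import numpy as np

def psnr(pred: np.ndarray, gt: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio between two images with values in [0, max_val].

    Hypothetical helper for illustration; not the toolbox's implementation.
    """
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def average_psnr(pairs) -> float:
    """Average PSNR over (rendered, ground-truth) image pairs from all scenes."""
    return float(np.mean([psnr(p, g) for p, g in pairs]))
```

The same per-pair-then-average pattern applies to SSIM and LPIPS.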

Evaluation is performed on the GT images (fish-eye DSLR test images undistorted with the iPhone intrinsic parameters) at a resolution of 1920 x 1440. Submitted images whose resolution differs will be automatically resized. Due to inconsistent lighting conditions, a color-correction step based on optimal transport is applied to the rendered images; please refer to the ScanNet++ Toolbox for the implementation. Metrics computed on color-corrected images are denoted PSNR (CC), SSIM (CC), and LPIPS (CC).
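For intuition, per-channel color correction by 1-D optimal transport reduces to matching sorted pixel values: the i-th smallest rendered value is mapped to the i-th smallest ground-truth value. The sketch below illustrates this idea under the assumption that both images have identical shape; the actual ScanNet++ Toolbox implementation may differ.

```python
import numpy as np

def color_correct_ot(rendered: np.ndarray, gt: np.ndarray) -> np.ndarray:
    """Per-channel 1-D optimal transport color correction (illustrative sketch).

    Maps the sorted pixel values of each rendered channel onto the sorted
    pixel values of the corresponding ground-truth channel. Assumes both
    inputs have the same shape (H, W, C); not the toolbox's implementation.
    """
    h, w, c = rendered.shape
    out = np.empty((h, w, c), dtype=np.float64)
    for ch in range(c):
        r = rendered[..., ch].ravel()
        g_sorted = np.sort(gt[..., ch].ravel())
        order = np.argsort(r)
        mapped = np.empty_like(r, dtype=np.float64)
        # i-th smallest rendered pixel -> i-th smallest ground-truth pixel
        mapped[order] = g_sorted
        out[..., ch] = mapped.reshape(h, w)
    return out
```

The mapping is monotone in each channel, so the corrected image inherits the ground truth's per-channel value distribution while preserving the ordering of the rendered pixels.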

Results

The iPhone NVS test set consists of 12 scenes.

Methods | iPhone Depth | PSNR (RAW) | SSIM (RAW) | LPIPS (RAW) | PSNR (CC) | SSIM (CC) | LPIPS (CC)
Gaussian Splatting w/ exposure compensation | | 15.138 | 0.821 | 0.427 | 19.936 | 0.848 | 0.352
  Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. SIGGRAPH 2023
Gaussian Splatting w/ depth | | 14.138 | 0.805 | 0.456 | 19.761 | 0.849 | 0.363
  Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. SIGGRAPH 2023
Splatfacto | | 14.086 | 0.807 | 0.437 | 19.272 | 0.848 | 0.363
  Matthew Tancik, Ethan Weber, Evonne Ng, Ruilong Li, Brent Yi, Justin Kerr, Terrance Wang, Alexander Kristoffersen, Jake Austin, Kamyar Salahi, Abhik Ahuja, David McAllister, Angjoo Kanazawa. Nerfstudio: A Modular Framework for Neural Radiance Field Development. SIGGRAPH 2023
Gaussian Splatting | | 14.290 | 0.808 | 0.456 | 19.105 | 0.844 | 0.374
  Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. SIGGRAPH 2023

Please refer to the submission instructions before making a submission.
