Abstract

Electronic Imaging

ISSN 2470-1173

Society for Imaging Science and Technology

IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA

10.2352/EI.2023.35.17.3DIA-102

3DIA-102

Article

Learned visual localization with camera pose refinement and verification based on differentiable renderer

Chanchang Tsai, Tokyo Institute of Technology, Japan

Hajime Taira, Tokyo Institute of Technology, Japan

Masatoshi Okutomi, Tokyo Institute of Technology, Japan

Abstract

This manuscript presents a new CNN-based visual localization method that estimates the camera location of an input RGB image with respect to a pre-collected database of RGB-D images. To determine an accurate camera pose, we employ a coarse-to-fine localization scheme that first finds coarse location candidates via image retrieval and then refines them using the local 3D structure represented by each retrieved RGB-D image. For coarse prediction, we use a CNN feature extractor and a relative pose estimator that do not necessarily require scene-specific training. Furthermore, we propose a new pose refinement-verification module that simultaneously evaluates and refines camera poses using a differentiable renderer. Experimental results on public datasets show that our proposed pipeline achieves accurate localization both on scenes seen during training and on previously unseen scenes.
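As a rough illustration of the coarse stage only, the sketch below ranks database images by descriptor similarity to a query and returns the top candidates. It is a minimal sketch under stated assumptions, not the authors' implementation: the descriptors are hand-made toy vectors standing in for CNN features, cosine similarity is an assumed ranking metric, and the names `cosine_similarity` and `retrieve_candidates` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero descriptor as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_candidates(query_desc, database_descs, k=2):
    """Return the indices of the k database images most similar to the query.

    Hypothetical helper: the similarity metric and the value of k are
    assumptions made for illustration.
    """
    scored = [
        (cosine_similarity(query_desc, desc), index)
        for index, desc in enumerate(database_descs)
    ]
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [index for _, index in scored[:k]]


# Toy global descriptors, one per database RGB-D image (assumed values).
database = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

candidates = retrieve_candidates(query, database, k=2)
print(candidates)
```

In a coarse-to-fine pipeline of the kind described above, each returned index would identify a retrieved RGB-D image whose local 3D structure is then used for pose refinement; that later stage is not sketched here.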

16 January 2023

3DIA

3D Imaging and Applications 2023

Issue 17, pp. 102-1 to 102-6

2023

Visual localization, Place recognition, Deep learning, Differentiable renderer, Camera pose estimation, Pose verification