This manuscript presents a new CNN-based visual localization method that estimates the camera pose of an input RGB image with respect to a pre-collected database of RGB-D images. To determine an accurate camera pose, we employ a coarse-to-fine localization approach that first finds coarse location candidates via image retrieval and then refines them using the local 3D structure represented by each retrieved RGB-D image. For the coarse prediction, we use a CNN feature extractor and a relative pose estimator that do not require scene-specific training. Furthermore, we propose a new pose refinement-verification module that simultaneously evaluates and refines camera poses using a differentiable renderer. Experimental results on public datasets show that the proposed pipeline achieves accurate localization on both trained and unseen scenes.
Chanchang Tsai, Hajime Taira, Masatoshi Okutomi, "Learned visual localization with camera pose refinement and verification based on differentiable renderer," in Electronic Imaging, 2023, pp. 102-1 - 102-6, https://doi.org/10.2352/EI.2023.35.17.3DIA-102
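The render-and-compare idea behind the refinement-verification module can be illustrated with a minimal sketch: given a coarse pose candidate from retrieval, render the retrieved RGB-D data into the query view with a differentiable renderer, optimize the pose against a photometric loss, and keep the final residual as a verification score. The sketch below is purely illustrative and assumes hypothetical placeholders (`render_rgbd`, the 6-DoF pose parameterization, the toy data); it is not the authors' implementation.

```python
# Illustrative sketch of coarse-to-fine pose refinement and verification
# via render-and-compare. All names and the toy renderer are assumptions.
import torch

def render_rgbd(pose6d, db_entry):
    """Hypothetical stand-in for a differentiable renderer: returns an image
    that depends smoothly on the 6-DoF pose so gradients flow to the pose.
    A real system would rasterize the retrieved RGB-D geometry into the
    query view instead."""
    base = db_entry["rgb"]                           # (3, H, W) retrieved image
    shift = pose6d[:3].tanh().view(3, 1, 1)          # toy dependence on translation
    return (base + 0.1 * shift).clamp(0.0, 1.0)

def refine_and_verify(query_rgb, candidate, steps=50, lr=1e-2):
    """Refine one retrieved candidate pose by minimizing a photometric
    residual between the rendering and the query; the final residual serves
    as a verification score (lower = more trustworthy candidate)."""
    pose = candidate["coarse_pose"].clone().requires_grad_(True)  # 6-DoF vector
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        rendered = render_rgbd(pose, candidate["db_entry"])
        loss = torch.nn.functional.l1_loss(rendered, query_rgb)
        loss.backward()
        opt.step()
    with torch.no_grad():                            # score the refined pose
        score = torch.nn.functional.l1_loss(
            render_rgbd(pose, candidate["db_entry"]), query_rgb).item()
    return pose.detach(), score

# Toy usage: refine every retrieved candidate and keep the best-verified one.
H, W = 120, 160
query = torch.rand(3, H, W)
candidates = [
    {"coarse_pose": torch.zeros(6), "db_entry": {"rgb": torch.rand(3, H, W)}}
    for _ in range(3)
]
results = [refine_and_verify(query, c) for c in candidates]
best_pose, best_score = min(results, key=lambda r: r[1])
print("verification score of best candidate:", best_score)
```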