FisheyeDistanceNet [1] proposed a self-supervised monocular depth estimation method for fisheye cameras with a large field of view (> 180°). To achieve scale-invariant depth estimation, FisheyeDistanceNet supervises depth map predictions over multiple scales during training.
However, this multi-scale supervision adds significant training overhead. To overcome this bottleneck, we incorporate self-attention layers and a robust loss function [2] into FisheyeDistanceNet. The general adaptive robust loss function (sketched below) helps obtain sharp depth maps without the need to train over multiple scales, and it allows the hyperparameters of the loss to be learned, aiding optimization in terms of both convergence speed and accuracy. We also ablate the importance of Instance Normalization over Batch Normalization in the network architecture. Finally, we generalize the network to be invariant to camera viewpoints by training on multiple views from the front,
rear, and side cameras. The proposed algorithmic improvements, FisheyeDistanceNet++, result in a 30% relative improvement in RMSE while reducing the training time by 25% on the WoodScape dataset. We also obtain state-of-the-art results on the KITTI dataset compared with other self-supervised monocular methods.
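For reference, and assuming [2] denotes Barron's general and adaptive robust loss, the penalty applied to a residual $x$ (e.g., the photometric reconstruction error) can be sketched as
\[
\rho(x, \alpha, c) \;=\; \frac{|\alpha - 2|}{\alpha}\left(\left(\frac{(x/c)^{2}}{|\alpha - 2|} + 1\right)^{\alpha/2} - 1\right),
\]
where $c > 0$ is a scale parameter and $\alpha$ is a shape parameter that can be learned jointly with the network weights; $\alpha = 2$ recovers a scaled L2 loss, while $\alpha = 1$ approximates the smooth L1 (Charbonnier) loss.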