GG-Net: Gaze Guided Network for Self-driving Cars

M. Abdelkarim; M.K. Abbas; Alaa Osama; Dalia Anwar; Mostafa Azzam; M. Abdelalim; H. Mostafa; Samah El-Tantawy; Ibrahim Sobh

doi:10.2352/ISSN.2470-1173.2021.17.AVM-171

Abstract

Imitation learning is widely used in autonomous driving to train networks that predict steering commands from camera frames, using annotated data collected from an expert driver. Assuming that frames from a front-facing camera fully mimic what the driver sees raises the question of how the eyes, and the attention mechanisms of the complex human visual system, actually perceive the scene. This paper proposes incorporating eye gaze information together with the frames into an end-to-end deep neural network for the lane-following task. The proposed novel architecture, GG-Net, is composed of a spatial transformer network (STN) and a multitask network that predicts both the steering angle and the gaze map for the input frame. Experimental results show a 36% improvement in steering angle prediction accuracy over the baseline, with an inference time of 0.015 seconds per frame (about 66 fps) on an NVIDIA K80 GPU, which enables the proposed model to operate in real time. We argue that incorporating gaze maps enhances the model's ability to generalize to unseen environments. Additionally, a novel course-steering angle conversion algorithm, with a supporting mathematical proof, is proposed.
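As a rough illustration of the multitask idea described in the abstract, the sketch below combines a steering-angle error with a gaze-map error into one training objective, and converts the reported per-frame latency into a frame rate. The abstract does not give the paper's actual loss, so the squared-error terms, the `gaze_weight` parameter, and all function names here are assumptions made for illustration only.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(
    steer_pred: float,
    steer_true: float,
    gaze_pred: Sequence[float],
    gaze_true: Sequence[float],
    gaze_weight: float = 1.0,  # hypothetical weighting, not taken from the paper
) -> float:
    """Weighted sum of a steering term and a gaze-map term (assumed form)."""
    steering_term = (steer_pred - steer_true) ** 2
    gaze_term = mse(gaze_pred, gaze_true)  # gaze map flattened to a 1-D sequence
    return steering_term + gaze_weight * gaze_term


def frames_per_second(seconds_per_frame: float) -> float:
    """Throughput implied by a per-frame inference time."""
    return 1.0 / seconds_per_frame


if __name__ == "__main__":
    loss = multitask_loss(0.5, 0.0, [0.0, 1.0], [0.0, 0.0], gaze_weight=0.5)
    print(f"example loss: {loss:.3f}")
    print(f"throughput at 0.015 s/frame: {frames_per_second(0.015):.1f} fps")
```

A single scalar objective of this kind lets one shared network learn both outputs at once. How the two terms are actually balanced in GG-Net is not stated in the abstract.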

Journal: Electronic Imaging (ISSN 2470-1173)

Publisher: Society for Imaging Science and Technology (IS&T), 7003 Kilworth Lane, Springfield, VA 22151 USA

Published: 18 January 2021

Pages: 171-1 to 171-8

Keywords: Eye Tracking; Imitation Learning; Autonomous Driving; STN; Multitask Learning