Volume: 34 | Article ID: IPAS-367
Deep reinforcement learning approach to predict head movement in 360° videos
DOI: 10.2352/EI.2022.34.10.IPAS-367 | Published Online: January 2022
Abstract

The popularity of 360° videos has grown immensely in the last few years. One probable reason is the availability of low-cost devices and the ease of capturing such videos. Additionally, users have shown interest in this type of media because it is inherently immersive, a property that is completely absent in traditional 2D videos. 360° videos now have many applications, such as generating content-specific videos (gaming, knowledge, travel, sports, educational, etc.), use by medical professionals during surgeries, and autonomous vehicles. A typical 360° video seen through a Head Mounted Display (HMD) gives an immersive feeling, where the viewer perceives standing within the real environment in a virtual platform. As in real life, at any point in time the viewer can view only a particular region and not the entire 360° content. Viewers use physical movement to explore the full 360° content. However, the large volume of 360° media makes transmission challenging. Adaptive compression techniques that follow the viewing behaviour of a viewer have been adopted to address this, and with the growing popularity and usage of 360° media, such methodologies are still in development. One important factor in adaptive compression is the estimation of the natural field-of-view (FOV) of a viewer watching 360° content using an HMD. The FOV estimation task becomes more challenging due to the spatial displacement of the viewer with respect to the dynamically changing video content. In this work, we propose a model to estimate the FOV of a user viewing a 360° video using an HMD. This task is popularly known as virtual cinematography. The proposed FOVSelectionNet is primarily based on a reinforcement learning framework. In addition, saliency estimation has proved to be a very powerful indicator for attention modelling. Therefore, in the proposed network we utilise a saliency indicator to drive the reward function of the reinforcement learning framework. Experiments are performed on the benchmark Pano2Vid 360° dataset, and the results are observed to be similar to human exploration.
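The abstract does not specify how the saliency indicator enters the reward, so the following is only an illustrative sketch, not the FOVSelectionNet method. It assumes an equirectangular saliency map stored as a list of rows, approximates the FOV as a rectangular window that wraps horizontally, and defines the reward as the fraction of total saliency inside that window. The names `fov_reward`, `greedy_step`, and `ACTIONS`, the action set, and the default FOV size are all hypothetical. The greedy one-step chooser stands in for a learned policy purely to show how such a reward could steer head movement.

```python
# Illustrative sketch only: reward shape, action set and FOV size are assumptions,
# not details taken from the paper.

# Hypothetical discrete head movements as (yaw change, pitch change) in degrees.
ACTIONS = {
    "stay": (0.0, 0.0),
    "left": (-45.0, 0.0),
    "right": (45.0, 0.0),
    "up": (0.0, 30.0),
    "down": (0.0, -30.0),
}


def fov_reward(saliency, yaw_deg, pitch_deg, fov_h_deg=90.0, fov_v_deg=60.0):
    """Return the fraction of total saliency inside a rectangular FOV window.

    saliency: equirectangular map as a list of rows (non-negative floats).
    yaw_deg in [-180, 180), pitch_deg in [-90, 90]; the window wraps in yaw.
    """
    height = len(saliency)
    width = len(saliency[0])
    total = sum(sum(row) for row in saliency)
    if total <= 0:
        return 0.0
    # Map viewing angles to pixel coordinates of the window centre.
    cx = (yaw_deg + 180.0) / 360.0 * width
    cy = (90.0 - pitch_deg) / 180.0 * height
    half_w = fov_h_deg / 360.0 * width / 2.0
    half_h = fov_v_deg / 180.0 * height / 2.0
    top = max(0, int(round(cy - half_h)))
    bottom = min(height, int(round(cy + half_h)))
    left = int(round(cx - half_w))
    right = int(round(cx + half_w))
    inside = 0.0
    for y in range(top, bottom):
        row = saliency[y]
        for x in range(left, right):
            inside += row[x % width]  # wrap around the 360-degree seam
    return inside / total


def greedy_step(saliency, yaw_deg, pitch_deg):
    """Pick the action whose resulting FOV has the highest saliency reward.

    This greedy rule is a stand-in for a learned reinforcement learning policy.
    """
    def reward_after(action):
        d_yaw, d_pitch = ACTIONS[action]
        new_yaw = (yaw_deg + d_yaw + 180.0) % 360.0 - 180.0  # wrap yaw
        new_pitch = max(-90.0, min(90.0, pitch_deg + d_pitch))  # clamp pitch
        return fov_reward(saliency, new_yaw, new_pitch)

    return max(ACTIONS, key=reward_after)


if __name__ == "__main__":
    # Toy 4x8 saliency map with a single salient pixel to the right of centre.
    toy = [[0.0] * 8 for _ in range(4)]
    toy[1][5] = 1.0
    print(greedy_step(toy, 0.0, 0.0))
```

In a reinforcement learning setting, a reward of this kind would be returned at each step after the agent moves the viewport, and the policy would be trained to maximise its cumulative value over the video rather than chosen greedily per frame.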

Views 468
Downloads 148
  Cite this article 

Tanmay Ambadkar, Pramit Mazumdar, "Deep reinforcement learning approach to predict head movement in 360° videos," in Proc. IS&T Int’l. Symp. on Electronic Imaging: Image Processing: Algorithms and Systems, 2022, pp 367-1 - 367-5, https://doi.org/10.2352/EI.2022.34.10.IPAS-367

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2022
Electronic Imaging
2470-1173
Society for Imaging Science and Technology
IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA