Recent advancements in 3D data capture have enabled the real-time acquisition of high-resolution 3D range data, even on mobile devices. However, this type of high bit-depth data remains difficult to transmit efficiently over a standard broadband connection. The most successful techniques for tackling this data problem thus far have been image-based depth encoding schemes that leverage modern image and video codecs. To our knowledge, no published work has directly optimized the end-to-end losses of a depth encoding scheme passing through a lossy image compression codec. To address this gap, our compression-resilient neural depth encoding method leverages deep learning to efficiently encode depth maps into 24-bit RGB representations that minimize end-to-end depth reconstruction errors when compressed with JPEG. Our approach employs a fully differentiable pipeline, including a differentiable approximation of JPEG, allowing it to be trained end-to-end on the FlyingThings3D dataset with randomized JPEG qualities. On a Microsoft Azure Kinect depth recording, the neural depth encoding method significantly outperformed an existing state-of-the-art depth encoding method in terms of both root-mean-square error (RMSE) and mean absolute error (MAE) across a wide range of image qualities, while producing average file sizes more than 20% smaller. Our method offers an efficient solution for emerging 3D streaming and 3D telepresence applications, enabling high-quality 3D depth data storage and transmission.
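For concreteness, the two error metrics named above, RMSE and MAE, can be computed between a reference depth map and its reconstruction as in the following minimal sketch. The function name `depth_errors` and the sample depth values are hypothetical and chosen only for illustration; they are not taken from the evaluation described here, and the sketch assumes both maps share the same shape and units.

```python
import numpy as np


def depth_errors(reference, reconstructed):
    """Return (RMSE, MAE) between two depth maps of equal shape.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    if ref.shape != rec.shape:
        raise ValueError("depth maps must have the same shape")
    diff = rec - ref
    rmse = float(np.sqrt(np.mean(diff ** 2)))  # root-mean-square error
    mae = float(np.mean(np.abs(diff)))         # mean absolute error
    return rmse, mae


# Made-up 2x2 depth maps in millimeters (illustrative values only).
reference = np.array([[1000.0, 1200.0], [1500.0, 2000.0]])
reconstructed = np.array([[1003.0, 1200.0], [1496.0, 2000.0]])
rmse, mae = depth_errors(reference, reconstructed)
```

Because RMSE squares each residual before averaging, it penalizes large isolated errors more heavily than MAE does, which is one reason for reporting both.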
Modern computing and imaging technologies have enabled many recent advances in the field of 3D range imaging: range data can now be acquired at speeds much faster than real-time, with sub-millimeter precision. However, these benefits come at the cost of an increased quantity of data being generated by 3D range imaging systems, potentially limiting the number of applications that can take advantage of this technology. One common approach to the compression of 3D range data is to encode it within the three color channels of a traditional 24-bit RGB image. This paper presents a novel method for the modification and compression of 3D range data such that the original depth information can be stored within, and recovered from, only two channels of a traditional 2D RGB image. Storage within a traditional image format allows for further compression to be realized via lossless or lossy image compression techniques. For example, when JPEG 80 was used to store the encoded output image, this method achieved an 18.2% reduction in file size when compared to a similar three-channel, image-based compression method, with a corresponding reduction of only 0.17% in global reconstruction accuracy.
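To illustrate the general idea of storing depth in only two channels of an RGB image, the following minimal sketch splits a 16-bit depth value into its high and low bytes. This is a naive baseline under stated assumptions, not the method presented in this paper: the function names `encode_two_channel` and `decode_two_channel` are hypothetical, the depth is assumed to be already quantized to unsigned 16-bit integers, and the round trip is exact only when the image is stored losslessly.

```python
import numpy as np


def encode_two_channel(depth_u16):
    """Pack a 16-bit depth map into the R and G channels of an RGB image.

    Naive byte split for illustration only; the B channel is left at zero.
    """
    depth = np.asarray(depth_u16, dtype=np.uint16)
    rgb = np.zeros(depth.shape + (3,), dtype=np.uint8)
    rgb[..., 0] = (depth >> 8).astype(np.uint8)    # high byte -> R
    rgb[..., 1] = (depth & 0xFF).astype(np.uint8)  # low byte  -> G
    return rgb


def decode_two_channel(rgb):
    """Recover the 16-bit depth map from the R and G channels."""
    high = rgb[..., 0].astype(np.uint16)
    low = rgb[..., 1].astype(np.uint16)
    return (high << 8) | low


# Made-up depth values covering the full 16-bit range.
depth = np.array([[0, 255], [256, 65535]], dtype=np.uint16)
encoded = encode_two_channel(depth)
decoded = decode_two_channel(encoded)
```

With this naive split, an error of one level in the high-byte channel shifts the decoded depth by 256 levels, so the scheme degrades badly under lossy codecs such as JPEG. Encodings designed for lossy storage therefore modify the depth data before packing it, rather than splitting raw bytes.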