
The spatial frequency response (SFR) has long been a crucial metric for evaluating imaging quality, particularly in camera performance assessment. However, conventional SFR measurement relies on test charts, which makes it difficult to evaluate resolution accurately in real-world environments. The development of the natural scene spatial frequency response (NS-SFR) has enabled resolution evaluation directly from natural scenes, extending its utility to diverse applications. Nevertheless, existing NS-SFR methods have been limited to two-dimensional analysis, neglecting depth-dependent behaviors such as variations in sharpness across focal planes. To address this limitation, we propose a depth-aware extension of NS-SFR that integrates the depth dimension into modulation transfer function (MTF) analysis, and we establish a model of the depth-MTF relationship that derives a representative MTF value for a single image's resolution. Our approach extends conventional planar NS-SFR analysis into a 3D depth-augmented framework that accounts for depth-dependent variations in MTF. Our results also suggest that this approach provides a more resilient and informative methodology for cross-sensor comparison, yielding predictions that correspond reasonably well with resolution tendencies observed in natural scenes while improving robustness under varying illumination.
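
For readers unfamiliar with how an MTF curve is obtained from an edge, the following is a minimal sketch of the generic edge-based calculation: differentiate the edge spread function (ESF) to get the line spread function (LSF), then take the normalized magnitude of its Fourier transform. It is not the NS-SFR pipeline or the depth-MTF model described above. The function names (`mtf_from_edge`, `mtf50`, `blurred_edge`) and the synthetic Gaussian-blurred edges are assumptions made purely for illustration.

```python
import numpy as np

def mtf_from_edge(esf, sample_spacing=1.0):
    """Return (frequencies, MTF) from a 1-D edge spread function (hypothetical helper)."""
    esf = np.asarray(esf, dtype=float)
    lsf = np.gradient(esf, sample_spacing)        # ESF -> line spread function
    spectrum = np.abs(np.fft.rfft(lsf))           # magnitude of the Fourier transform
    mtf = spectrum / spectrum[0]                  # normalize so that MTF(0) = 1
    freqs = np.fft.rfftfreq(lsf.size, d=sample_spacing)  # cycles per pixel
    return freqs, mtf

def mtf50(freqs, mtf):
    """First sampled frequency at which the MTF falls below 0.5, or None if it never does."""
    below = mtf < 0.5
    if not below.any():
        return None
    return float(freqs[np.argmax(below)])

def blurred_edge(sigma, n=128):
    """Synthetic step edge blurred by a Gaussian of width sigma (illustrative data only)."""
    x = np.arange(n) - n / 2
    gauss = np.exp(-0.5 * (x / sigma) ** 2)
    return np.cumsum(gauss) / gauss.sum()         # integrate the Gaussian LSF into an ESF

# Two edges standing in for an in-focus and a defocused region of a scene.
freqs, mtf_sharp = mtf_from_edge(blurred_edge(1.0))
_, mtf_soft = mtf_from_edge(blurred_edge(3.0))

sharp50 = mtf50(freqs, mtf_sharp)
soft50 = mtf50(freqs, mtf_soft)
```

In this sketch the more heavily blurred edge reaches MTF = 0.5 at a lower spatial frequency, which is the kind of sharpness difference across focal planes that a depth-aware analysis would need to account for.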

Monocular depth estimation is an important task in scene understanding, with applications to pose estimation, segmentation, and autonomous navigation. Deep learning methods currently rely on multilevel features to extract the local information needed to infer depth from a single RGB image. We present an efficient architecture that uses features from multiple levels with fewer connections than previous networks. Our model achieves comparable scores for monocular depth estimation while reducing memory requirements and computational cost.