Natural images are well known to have spatial frequency power spectra with 1/f^a behavior, with the exponent a typically stated as between 2 and 4. This indicates an invariance to scale. Further work has theorized how the visual system is tuned to such statistics in primary visual cortex (V1) [1]. Color image statistics also show an invariance to scale [2]. The luminance histogram is typically understood to be log-normal, although for HDR images a subcomponent skewed toward much higher luminances is observed. Color statistics were initially described at the simplest level via the gray world hypothesis [3], but more detailed descriptions are now available, even at the hyperspectral level [4]. For HDR images, the exponent a was found to increase from the lower values of 2 to more typical values of 4 and 5 [5]. Temporal statistics have been measured primarily for media, with 1/f^a behavior reported for scene cut statistics [6], and temporal frequency statistics studied with a focus on motion statistics via optical flow [7]. Statistics for purely natural as well as human-made environments (e.g., buildings and the resulting perspective geometry) have been studied, with each having different orientation statistics [8]. Image statistics have also been applied to the standardized assessment of television power consumption, where they replaced test targets that TVs often detected and used to lower their power consumption in well-known cheating schemes. To prevent this, a short test video with luminance statistics matching 48 hours of broadcast content was generated and used for TV power testing [9]. The highly adaptive nature of current TVs (power limiting, dual modulation, dynamic response) has motivated researchers to incorporate complex noise fields following natural image statistics into measurement targets [10,11]. One particular still image test target based on natural image statistics (dead leaves) is widely used in camera optics and sensor development.
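As a rough illustration of the 1/f^a model, the following is a minimal sketch, assuming NumPy, of how a single-channel noise field with a chosen exponent a could be synthesized by spectrally shaping white noise, and how the exponent could be recovered from a log-log fit of the power spectrum. The function names `power_law_noise` and `estimate_exponent` are hypothetical and are not taken from the cited works; the sketch covers only the spatial, achromatic case.

```python
import numpy as np


def power_law_noise(size, a, seed=0):
    """Return a size x size real noise field whose power spectrum falls as 1/f^a.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    # White Gaussian noise has a flat expected power spectrum (a = 0).
    white = rng.standard_normal((size, size))
    spectrum = np.fft.fft2(white)

    # Radial spatial frequency (cycles per sample) of every FFT bin.
    fx = np.fft.fftfreq(size)
    f = np.hypot(fx[:, None], fx[None, :])
    f[0, 0] = 1.0  # placeholder so the DC bin does not divide by zero

    # Power proportional to 1/f^a means amplitude proportional to 1/f^(a/2).
    shaped = spectrum / f ** (a / 2.0)
    shaped[0, 0] = 0.0  # force a zero-mean field

    # The filter is real and symmetric, so the inverse transform is real
    # up to rounding error.
    field = np.fft.ifft2(shaped).real
    return field / field.std()


def estimate_exponent(field):
    """Estimate a from a least-squares line fit to log(power) versus log(f)."""
    size = field.shape[0]
    power = np.abs(np.fft.fft2(field)) ** 2
    fx = np.fft.fftfreq(size)
    f = np.hypot(fx[:, None], fx[None, :])
    # Skip DC and the corner bins beyond the axial Nyquist frequency.
    mask = (f > 0) & (f <= 0.5)
    slope, _ = np.polyfit(np.log(f[mask]), np.log(power[mask]), 1)
    return -slope


if __name__ == "__main__":
    for a in (0.0, 2.0, 4.0):
        estimate = estimate_exponent(power_law_noise(128, a))
        print(f"target a = {a:.1f}, estimated a = {estimate:.2f}")
```

Setting a = 0 leaves the white noise unshaped, while larger values of a concentrate energy at low spatial frequencies and yield progressively smoother fields.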
Algorithm development and testing for image and video processing has almost always been ad hoc, relying on a mixture of geometric test targets and hand-selected test images, some chosen as corner cases and some not. More recently, large image data sets have been used to train various neural network models for tasks such as super-resolution, bit rate compression, and dynamic range mapping. However, images are not ergodic, and possibly not even wide-sense stationary. We propose the use of imagery based on noise following natural image statistics in the spatio-chromatic (and temporal) domains to compactly probe the wide variety of image possibilities for algorithm development, in addition to the existing uses for image capture and display analysis. While we do not suggest replacing actual practical imagery, we believe such noise fields can augment image algorithm analysis. To address the problem of non-ergodicity, we allow the basic power term a in the natural image statistics model to vary over a large range within a video, such that it includes the extremes of white noise and low-frequency gradients. We use color image statistics models that include decorrelated colors to generate the RGB video. We will present results for traditional adaptive data compression (with chromatic subsampling), as well as for a more contemporary neural network approach (Neural Fields [12]) as applied to upscaling and denoising. We analyze the results both visually and through several recent color image quality models.

[1] D. J. Field (1987) Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A 4:2379-2394.
[2] C. Parraga, T. Troscianko, and D. J. Tolhurst (2002) Spatiochromatic properties of natural images and human vision. Current Biology V 12.
[3] R. M. Evans (1951) Method for correcting photographic color prints. US Patent 2,571,697.
[4] A. Chakrabarti and T. Zickler (2011) Statistics of real-world hyperspectral images. CVPR.
[5] R. Dror, A. Willsky, and E. Adelson (2004) Statistical characterization of real-world illumination. JOV V 4.
[6] J. Cutting (2019) Sequences in popular cinema generate inconsistent event segmentation. Attn. Percept. and Psycho. V 81.
[7] D. Lee, H. Ko, J. Kim, and A. Bovik (2021) On the space-time statistics of motion pictures. JOSA A V 38 #7.
[8] A. Torralba and A. Oliva (2003) Statistics of natural image categories. Network: Computational Neural Systems 14:391-412.
[9] International Electrotechnical Commission, IEC 62087:2008(E), "Methods of measurement for the power consumption of audio, video, and related equipment."
[10] T. Kunkel and S. Daly (2020) 57-1: Spatiotemporal noise targets inspired by natural imagery statistics. SID Symposium Digest of Technical Papers 51:842-845.
[11] T. Kunkel and F. Friedrich (2022) Utilizing advanced spatio-temporal backgrounds with dynamic test signals for high dynamic range display metrology. J Soc Inf Display 30(5):423-432. https://doi.org/10.1002/jsid.1125
[12] Y. Xie, T. Takikawa, S. Saito, O. Litany, S. Yan, N. Khan, F. Tombari, J. Tompkin, V. Sitzmann, and S. Sridhar (2022) Neural fields in visual computing and beyond. Eurographics / CGF State-of-the-Art Report.
Scott Daly, Timo Kunkel, Guan-Ming Su, Anustup Choudhury, "Spatiochromatic and temporal natural image statistics modelling: Applications from display analysis to neural networks," in Electronic Imaging, 2023, pp. 188-1 to 188-7, https://doi.org/10.2352/EI.2023.35.15.COLOR-188