Naturalness is a complex appearance attribute that depends on multiple visual appearance attributes, such as color, gloss, and roughness, and on their interactions. It affects the perceived quality of an object and should therefore be reproduced correctly. In recent years, the use of color 3D printing technology has grown considerably in fields such as cultural heritage, medicine, entertainment, and fashion for producing 3D objects with the correct appearance. This paper investigates the reproduction of the naturalness attribute using a color 3D printing technology and the naturalness perception of the 3D printed objects. Results indicate that the naturalness perception of 3D printed objects is highly subjective but depends objectively mainly on a printed object’s surface elevation and roughness.
Evaluating perceptual image and video quality is crucial for multimedia technology development. This study investigated nation-based differences in quality assessment using three large-scale crowdsourced datasets (KonIQ-10k, KADID-10k, NIVD), analyzing responses from diverse countries including the US, Japan, India, Brazil, Venezuela, Russia, and Serbia. We hypothesized that cultural factors influence how observers interpret and apply rating scales like the Absolute Category Rating (ACR) and Degradation Category Rating (DCR). Our advanced statistical models, employing both frequentist and Bayesian approaches, incorporated country-specific components such as variable thresholds for rating categories and lapse rates to account for unintended errors. Our analysis revealed significant cross-cultural variations in rating behavior, particularly regarding extreme response styles. Notably, US observers showed a 35–39% higher propensity for extreme ratings compared to Japanese observers when evaluating the same video stimuli, aligning with established research on cultural differences in response styles. Furthermore, we identified distinct patterns in threshold placement for rating categories across nationalities, indicating culturally influenced variations in scale interpretation. These findings contribute to a more comprehensive understanding of image quality in a global context and have important implications for quality assessment dataset design, offering new opportunities to investigate cultural differences difficult to capture in laboratory environments.
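As a concrete illustration of the kind of country-specific ordinal model described above, the following is a minimal sketch of category probabilities under an ACR scale with per-country thresholds and a lapse rate; the latent Gaussian form, the parameter values, and the function names are illustrative assumptions rather than the study's fitted model.

```python
import numpy as np
from scipy.stats import norm

def acr_category_probs(mu, sigma, thresholds, lapse):
    """Probability of each ACR category (1..5) for a stimulus with latent
    quality mu, given country-specific category thresholds and a lapse rate.

    thresholds: four increasing cut points separating the five categories.
    lapse: probability of an unintended (uniformly random) response.
    """
    cuts = np.concatenate(([-np.inf], thresholds, [np.inf]))
    cdf = norm.cdf((cuts - mu) / sigma)
    ordered = np.diff(cdf)                      # P(category k | no lapse)
    k = len(ordered)
    return lapse / k + (1.0 - lapse) * ordered  # mix in uniform lapses

# Example: the same stimulus rated by two hypothetical countries with
# different threshold placement and lapse rate (values are illustrative only).
print(acr_category_probs(0.3, 1.0, [-1.5, -0.5, 0.5, 1.5], lapse=0.02))
print(acr_category_probs(0.3, 1.0, [-2.0, -0.8, 0.8, 2.0], lapse=0.05))
```

Under such a model, wider outer thresholds concentrate responses in the middle categories, while narrower ones produce the extreme-response style discussed above.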
This study presents a novel character-level writer verification framework for ancient manuscripts, employing a building-block approach that integrates decision strategies across multiple token levels, including characters, words, and sentences. The proposed system utilized edge-directional and hinge features along with machine learning techniques to verify the hands that wrote the Great Isaiah Scroll. A custom dataset containing over 12,000 samples of handwritten characters from the associated scribes was used for training and testing. The framework incorporated character-specific parameter tuning, resulting in 22 separate models, and demonstrated that each character has distinct features that enhance system performance. Evaluation was conducted through soft voting, comparing probability scores across different token levels, and contrasting the results with majority voting. This approach provides a detailed method for multi-scribe verification, bridging computational and paleographic methods for historical manuscript studies.
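To illustrate the voting stage, the sketch below contrasts soft voting (averaging character-level probability scores over a token) with majority voting; the candidate scores, tie-breaking rule, and function names are hypothetical and serve only to show how the two strategies can disagree.

```python
import numpy as np

def soft_vote(char_probs):
    """Combine per-character writer probabilities by averaging the scores.

    char_probs: array of shape (n_chars, n_writers); each row is one
    character-level model's probability distribution over candidate writers.
    Returns the index of the writer with the highest mean probability.
    """
    return int(np.argmax(np.asarray(char_probs).mean(axis=0)))

def majority_vote(char_probs):
    """Each character votes for its most probable writer; ties go to the
    lowest index (an illustrative tie-breaking rule only)."""
    votes = np.argmax(np.asarray(char_probs), axis=1)
    return int(np.bincount(votes).argmax())

# Hypothetical word of four characters scored against two candidate scribes.
word = [[0.90, 0.10], [0.45, 0.55], [0.48, 0.52], [0.49, 0.51]]
print(soft_vote(word), majority_vote(word))  # soft voting picks scribe 0, majority voting picks scribe 1
```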
An ideal archival storage system combines longevity, accessibility, low cost, high capacity, and human readability to ensure the persistence and future readability of stored data. At Archiving 2024 [B. M. Lunt, D. Kemp, M. R. Linford, and W. Chiang, “How long is long-term? An update,” Archiving (2024)], the authors’ research group presented a paper summarizing several efforts in this area, including magnetic tapes, optical disks, hard disk drives, solid-state drives, Project Silica (a Microsoft project), DNA, and the projects C-PROM, Nano Libris, and Mil Chispa (the last three being the authors’ own research). Each storage option offers its own advantages with respect to these desirable characteristics. This paper provides information on other efforts in this area, including the work by Cerabyte, Norsam Technologies, and Group 47 DOTS, and gives an update on the authors’ projects C-PROM, Nano Libris, and Mil Chispa.
Predicting the perceived brightness and lightness of image elements using color appearance models is important for the design and evaluation of HDR displays. This paper presents a series of experiments examining perceived brightness/lightness for displayed stimuli of differing sizes. The first pilot experiment had 7 observers, the second and third pilot experiments had 6, and the main experiment had 14. The target and test stimuli in the main experiment subtended 10° and 1° fields of view, respectively. The results indicate a small but consistent effect whereby brightness increases with stimulus size. The effect depends on the lightness level of the stimulus but not on its hue or saturation. A preliminary model is also introduced to extend models such as CIECAM16 with the capability of predicting brightness and lightness as a function of stimulus size. The proposed model yields good performance in terms of perceived brightness/lightness prediction.
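As a rough illustration of how a size correction could be layered onto a CIECAM16-style brightness prediction, the sketch below scales brightness logarithmically with field of view; the functional form and the coefficient are assumptions made for illustration, not the preliminary model's fitted parameters.

```python
import math

def size_adjusted_brightness(Q, fov_deg, ref_fov_deg=2.0, k=0.02):
    """Scale a CIECAM16-style brightness value Q by stimulus size.

    Assumes perceived brightness grows slowly (logarithmically) with the
    field of view; the form and the coefficient k are illustrative values,
    not the paper's fitted model.
    """
    return Q * (1.0 + k * math.log(fov_deg / ref_fov_deg))

# A 10-degree stimulus is predicted to appear slightly brighter than a 1-degree one.
print(size_adjusted_brightness(50.0, 10.0))  # ~51.6
print(size_adjusted_brightness(50.0, 1.0))   # ~49.3
```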
With the escalating demands on rendering technology, exclusive reliance on rendering programs is no longer sufficient, and the collection of surface information from real-world materials has become a crucial aspect of computer graphics. The acquisition of material surface information is pivotal in digital reconstruction and virtual reality. In this article, we introduce a material acquisition device that combines multiple cameras and lights to capture objects from various angles and under various lighting conditions, yielding more comprehensive and realistic material surface information. Four cameras are positioned at 10°, 35°, 60°, and 85° to capture different aspects of the object’s surface, while 24 point lights are placed in three hemispherical layers at 10°, 35°, and 60°, with eight lights per layer spaced 45° apart. This arrangement facilitates the acquisition of rich and diverse material surface information by integrating multiple perspectives and lighting conditions.
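The sampling geometry can be made concrete with a short sketch that computes the Cartesian positions of the four cameras and 24 lights on the hemisphere; the camera azimuth, the radius, and the function name are assumptions made only for illustration.

```python
import numpy as np

def hemisphere_positions(elevations_deg, azimuths_deg, radius=1.0):
    """Cartesian positions on a hemisphere for given elevation/azimuth angles.

    Elevation is measured from the horizontal plane; radius is the distance
    from the sample at the origin (units are arbitrary here).
    """
    pts = []
    for el in np.radians(elevations_deg):
        for az in np.radians(azimuths_deg):
            pts.append((radius * np.cos(el) * np.cos(az),
                        radius * np.cos(el) * np.sin(az),
                        radius * np.sin(el)))
    return np.array(pts)

# Four cameras at elevations 10, 35, 60, and 85 degrees (a single azimuth is
# assumed here), and 24 lights in three layers at 10, 35, and 60 degrees with
# eight lights per layer spaced 45 degrees apart.
cameras = hemisphere_positions([10, 35, 60, 85], [0])
lights = hemisphere_positions([10, 35, 60], np.arange(0, 360, 45))
print(cameras.shape, lights.shape)  # (4, 3) (24, 3)
```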
Low-light images often fail to accurately capture color and texture, limiting their practical applications in imaging technology. Low-light image enhancement technology can effectively restore the color and texture information contained in an image. However, current low-light image enhancement methods map directly from low-light to normal-light images, ignoring the basic principles of imaging, so the enhancement effect is limited. The Retinex model splits an image into illumination and reflection components and uses the decomposed components to achieve end-to-end enhancement of low-light images. Inspired by Retinex theory, this study proposes a low-light image enhancement method based on multispectral reconstruction. The method first uses a multispectral reconstruction algorithm to reconstruct a metameric multispectral image of a normal-light RGB image. It then uses a deep learning network to learn the end-to-end mapping from a low-light RGB image to a normal-light multispectral image, so that any low-light image can be reconstructed into a normal-light multispectral image. Finally, the corresponding normal-light RGB image is calculated according to colorimetry theory. To test the proposed method, the popular LOw-Light (LOL) dataset for low-light image enhancement is adopted to compare the proposed method with existing methods. During the test, a multispectral reconstruction method based on reversing the image signal processing of RGB imaging is used to reconstruct the corresponding metameric multispectral image of each normal-light RGB image in LOL. The deep learning architecture proposed by Zhang et al., with the convolutional block attention module added, is used to establish the mapping between the low-light RGB images and the corresponding reconstructed multispectral images. The proposed method is compared with existing methods such as self-supervised, RetinexNet, RRM, KinD, RUAS, and URetinex-Net. For the LOL dataset and an illuminant chosen for rendering, the results show that the low-light image enhancement method proposed in this study outperforms the existing methods.
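To make the final colorimetric step concrete, the sketch below renders a multispectral image to sRGB under a chosen illuminant by integrating against color-matching functions; the wavelength sampling, the D65 sRGB matrix, and the function name are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def spectra_to_srgb(reflectance, illuminant, cmfs):
    """Render a multispectral image to sRGB under a chosen illuminant.

    reflectance: (H, W, n_bands) spectral image; illuminant: (n_bands,);
    cmfs: (n_bands, 3) CIE color-matching functions sampled at the same
    wavelengths. All sampling details here are assumptions for illustration.
    """
    radiance = reflectance * illuminant            # per-band radiance
    xyz = radiance @ cmfs                          # integrate to tristimulus XYZ
    xyz /= (illuminant @ cmfs)[1]                  # normalize so the white point has Y = 1
    m = np.array([[ 3.2406, -1.5372, -0.4986],
                  [-0.9689,  1.8758,  0.0415],
                  [ 0.0557, -0.2040,  1.0570]])    # XYZ (D65) -> linear sRGB
    rgb = np.clip(xyz @ m.T, 0.0, 1.0)
    return np.where(rgb <= 0.0031308,              # sRGB gamma encoding
                    12.92 * rgb,
                    1.055 * rgb ** (1 / 2.4) - 0.055)
```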
In eye-tracking based 3D displays, system latency due to eye tracking and 3D rendering causes an error between the actual eye position and the tracked position that is proportional to the viewer’s movement. This discrepancy makes viewers see 3D content from a non-optimal position, thereby increasing 3D crosstalk and degrading the quality of 3D images under dynamic viewing conditions. In this paper, we investigate the latency issue, distinguish each source of system latency, and study the display margin of the eye-tracking based 3D display. To reduce 3D crosstalk during viewer motion, we propose a motion compensation method that predicts the viewer’s eye position. The effectiveness of our motion compensation method is validated by experiments on a previously implemented 3D display prototype; the results show that the prediction error decreased to 24.6%, meaning the accuracy of the eye pupil position became four times higher, and that crosstalk was reduced to a level similar to that of a system with one quarter of the latency.
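As an illustration of latency compensation by eye-position prediction, the sketch below extrapolates the tracked pupil position over the system latency with a constant-velocity model; the linear motion model, sample values, and function name are assumptions for illustration, not the method evaluated in the paper.

```python
def predict_eye_position(prev_pos, curr_pos, dt, latency):
    """Predict the eye position `latency` seconds ahead with a constant-velocity
    model, so that rendering can target where the eye will be rather than where
    it was last tracked. The linear model is an illustrative assumption.
    """
    velocity = [(c - p) / dt for p, c in zip(prev_pos, curr_pos)]
    return [c + v * latency for c, v in zip(curr_pos, velocity)]

# Two tracker samples 10 ms apart; compensate a 40 ms tracking + rendering latency.
print(predict_eye_position((120.0, 64.0, 600.0), (122.0, 64.5, 600.0),
                           dt=0.010, latency=0.040))  # [130.0, 66.5, 600.0]
```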
Object detection in varying traffic scenes presents significant challenges in real-world applications. Thermal imagery is acknowledged as a beneficial complement to RGB image detection, especially under suboptimal lighting conditions. However, harnessing the combined potential of RGB and thermal images remains a formidable task. We tackle this by implementing an illumination-guided adaptive information fusion technique across both data types. To this end, we propose the illumination-guided crossmodal attention transformer fusion (ICATF), a novel object detection framework that integrates features from RGB and thermal data. An illumination-guided module is developed to adapt features to the current lighting conditions, steering the learning process toward the most informative data fusion. We also incorporate frequency-domain convolutions in the network’s backbone to assimilate spectral context and derive more nuanced features. In addition, we fuse the differential modality features for multispectral pedestrian detection using illumination-guided feature weights and a transformer fusion architecture. Experimental results on multispectral detection datasets, including FLIR-aligned, LLVIP, and KAIST, show that our method achieves state-of-the-art performance.
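To sketch the idea of illumination-guided fusion, the following minimal PyTorch module predicts an illumination score from the RGB input and uses it to weight the RGB and thermal feature maps before merging them; the layer sizes, weighting rule, and class name are illustrative assumptions, not the ICATF architecture.

```python
import torch
import torch.nn as nn

class IlluminationGuidedFusion(nn.Module):
    """Minimal sketch of illumination-guided fusion of RGB and thermal features.

    A small network predicts an illumination score w from the RGB image and
    uses it to weight the two modality feature maps before a 1x1 convolution
    merges them. All layer sizes are illustrative assumptions.
    """
    def __init__(self, channels):
        super().__init__()
        self.illum = nn.Sequential(                  # day/night score from the RGB image
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_image, rgb_feat, thermal_feat):
        w = self.illum(rgb_image).view(-1, 1, 1, 1)  # ~1 in good light, ~0 in darkness
        fused = torch.cat([w * rgb_feat, (1 - w) * thermal_feat], dim=1)
        return self.merge(fused)

# Hypothetical usage with 64-channel feature maps from each backbone stream.
fusion = IlluminationGuidedFusion(channels=64)
out = fusion(torch.rand(2, 3, 256, 256),      # RGB image
             torch.rand(2, 64, 32, 32),       # RGB features
             torch.rand(2, 64, 32, 32))       # thermal features
```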