
As the automotive industry becomes increasingly digitalized, Camera Monitoring Systems (CMS) are replacing traditional mirrors, offering improved aerodynamics and wider fields of view. However, depth perception remains a challenge, and unclear overlays can reduce driver trust. This study investigates how different CMS augmentations affect distance estimation during overtaking. Thirty participants viewed video clips across three road scenes using one baseline and three augmented interface concepts. They estimated vehicle distances, rated certainty, and reported preferences. The Lines concept, using distance lines and vehicle outlines, produced the most favorable results with the lowest absolute errors and highest clarity and reliability ratings, although it introduced systematic overestimation. The Corner concept led to consistent underestimation but offered some perceived benefits, while the Dashed concept performed similarly to the baseline. The final design builds on the strengths of the Lines concept, with refinements inspired by Corner to enhance visibility and stability. Recommendations include using intuitive depth cues, accessible colors, and well‑timed visual elements. Future research should explore sound cues, symbolic warnings, and long‑term user acceptance.

Foveated rendering is a key technique for reducing computational load in immersive display systems by lowering rendered image quality in the peripheral visual field while preserving high fidelity in the fovea. While the impact of foveation on perceived spatial detail is well understood, its influence on other visual qualities, such as depth from motion parallax, remains unclear. In this work, we investigate how foveated rendering affects motion-based depth perception across the visual field. Building on previous work on binocular disparity, we use a comparable experimental setup to isolate motion parallax as the sole depth cue and measure depth discrimination thresholds under varying levels of foveation, modeled as varying intensities of spatial blur, and eccentricity. Our results show that depth from motion is immediately impaired by visible foveation, with stronger impairments at higher levels of blur. These findings suggest that motion-based depth cues may be more sensitive to foveated rendering than disparity cues, which were previously found to be largely unaffected.

Transformers, which have demonstrated remarkable performance improvements in natural language processing, have been increasingly adopted in computer vision tasks since the introduction of the Vision Transformer (ViT). In hyperspectral image (HSI) reconstruction, Transformer-based models have gained popularity due to their ability to capture global dependencies. While these models alleviate certain limitations of convolutional neural networks (CNNs), their computational complexity scales quadratically with spatial resolution, making ultra-high-resolution reconstruction infeasible. Spectral Transformer variants have been proposed to reduce the computational burden associated with high spatial resolution, yet they still face challenges in handling ultra-high-resolution imagery. In this work, we propose a “Patched Input Spatial-Spectral Transformer (PSST)” that efficiently reconstructs HSIs from ultra-high-resolution RGB images. The model integrates a spatial transformer before spectral processing, enabling global context awareness while maintaining computational efficiency through in-model patch partitioning. Although performance decreases slightly for low-resolution inputs compared to state-of-the-art (SOTA) models, our method achieves the highest reconstruction quality for ultra-high-resolution inputs, with higher PSNR and significantly lower memory consumption.
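
The quadratic scaling mentioned above, and the saving that patch partitioning can bring, can be illustrated with a small count of attention pairs. This is only a sketch of the general scaling argument, not the PSST architecture: it assumes one token per pixel and attention restricted to non-overlapping square patches, and the function names and the 4096 x 4096 / 128-pixel figures are hypothetical.

```python
def global_attention_pairs(height, width):
    """Token pairs scored by full self-attention over every pixel token."""
    tokens = height * width
    return tokens * tokens


def patched_attention_pairs(height, width, patch):
    """Token pairs when attention is limited to non-overlapping patch x patch tiles.

    Assumes (for illustration only) that the image divides evenly into tiles.
    """
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    num_patches = (height // patch) * (width // patch)
    tokens_per_patch = patch * patch
    return num_patches * tokens_per_patch * tokens_per_patch


# Hypothetical ultra-high-resolution input split into 128 x 128 tiles.
full = global_attention_pairs(4096, 4096)
tiled = patched_attention_pairs(4096, 4096, 128)
reduction = full // tiled  # equals the number of tiles under these assumptions
```

Under these assumptions the reduction factor equals the number of patches, which is why the saving grows with input resolution for a fixed patch size.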

In raw imaging workflows, metadata is as critical as pixel data. Raw files do not merely store sensor measurements; they also encode the information required to interpret those measurements as color, brightness, and dynamic range. When computational methods modify raw-domain image data, this interpretive metadata is often lost, invalidated, or omitted, producing substantial rendering inconsistencies in downstream software. PARSEK (the Probabilistic Alignment Raw Stitcher Experiment from Kentucky) exposes this problem clearly: although its super-resolution output preserves useful raw-domain image content, results saved without appropriate metadata can exhibit severe color and tonal shifts when opened in conventional raw processors. This paper presents KYDNG, a DNG repackaging pipeline designed to preserve perceptual consistency by embedding reconstructed raw image data together with metadata derived from the source capture. The implementation writes raw-specific structural tags, including CFA pattern and repeat dimensions, and propagates key camera-dependent rendering metadata, including ColorMatrix1, AsShotNeutral, and black and white levels, while also embedding a JPEG preview. The resulting files are intended to be interpreted by standard raw development tools using the same camera-consistent rendering assumptions as the original capture.

A closed‑loop color feedback algorithm that leverages post‑ISP statistics to improve camera color quality is presented. Unlike traditional approaches, which evaluate white balance and color early in the pipeline and tune individual modules in isolation, the proposed method assesses color near the end of the ISP pipeline, compares it against target perceptual colors, and feeds the resulting deviations back to upstream processing blocks. This enables dynamic adjustment of AWB and color‑related parameters to achieve desired perceptual color outcomes. The framework addresses key limitations of conventional color processing, including (1) evaluating AWB in the raw domain where perceived color cannot be reliably assessed, (2) the inability of fixed color‑tuning parameters to compensate for deviations introduced by other ISP blocks, and (3) the lack of coordinated color evaluation across modules. We further demonstrate an application of this framework for skin‑tone improvement. The system takes face regions, filters non‑skin pixels, computes representative skin color statistics, compares them with target skin colors, and derives adjustment parameters that update color tunings for the current or subsequent frame. This example illustrates the flexibility and effectiveness of the proposed closed‑loop approach for perceptually guided color enhancement or accurate color reproduction.
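
The skin-tone loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the RGB threshold rule in `is_skin`, the per-channel gain model, the damping `step`, and every function name are hypothetical, and the "ISP" is reduced to a per-channel multiply.

```python
def is_skin(rgb):
    """Crude, hypothetical skin test on 8-bit RGB; a real system would use a tuned classifier."""
    r, g, b = rgb
    return r > 95 and g > 40 and b > 20 and r > g and r > b and (r - min(g, b)) > 15


def mean_skin_color(face_pixels):
    """Average color of the pixels that pass the skin test, or None if none do."""
    skin = [p for p in face_pixels if is_skin(p)]
    if not skin:
        return None
    return tuple(sum(p[c] for p in skin) / len(skin) for c in range(3))


def apply_gains(pixels, gains):
    """Stand-in for the ISP color stage: per-channel gain with clipping at 255."""
    return [tuple(min(255.0, c * g) for c, g in zip(p, gains)) for p in pixels]


def update_gains(gains, measured, target, step=0.5):
    """Move each gain part of the way toward the ratio target / measured (damped feedback)."""
    return tuple(g * (1.0 + step * (t / m - 1.0)) for g, m, t in zip(gains, measured, target))


def run_feedback(face_pixels, target, frames=12):
    """Measure post-ISP skin color each frame and feed the deviation back to the gains."""
    gains = (1.0, 1.0, 1.0)
    for _ in range(frames):
        rendered = apply_gains(face_pixels, gains)
        measured = mean_skin_color(rendered)
        if measured is None:
            continue  # no skin detected in this frame; keep the current tuning
        gains = update_gains(gains, measured, target)
    return gains


# Hypothetical face region: two skin pixels and one non-skin (green) pixel.
face = [(180.0, 120.0, 100.0), (200.0, 140.0, 120.0), (30.0, 200.0, 40.0)]
target = (210.0, 150.0, 125.0)
gains = run_feedback(face, target)
final = mean_skin_color(apply_gains(face, gains))
```

The damping factor is one simple way to keep the update stable from frame to frame; the abstract does not state how the adjustment parameters are actually derived.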

This study builds upon our previous work, where we analyzed the range of real-world colors and identified images containing colors that exceed the boundaries of legacy color gamuts such as sRGB and DCI-P3, making them difficult for traditional displays to render accurately. In our current research, we conducted a series of visual experiments to evaluate perceptual differences and viewer preferences when such images are displayed on ultra-wide-color-gamut (ultra-WCG) displays compared to standard-gamut displays. Our findings indicate that observers could consistently distinguish between images shown on an ultra-WCG display and the same images calibrated to sRGB. The perceptual difference between DCI-P3 and ultra-WCG was notably smaller, resulting in lower detection rates that were more content-dependent. Overall, observers showed a strong preference for the ultra-WCG display, regardless of the viewing condition or the image content.

It is well established that a small percentage of the color-deficient population has a monocular deficiency, that is, a deficiency in only one eye (Judd, 1948; Broackes, 2010). On the other hand, it is also well known that under certain conditions, binocular fusion of colors (including brightness) does occur. Considering the two sides together, human color sensation is bi-monocular. Furthermore, relating the La Hire phenomenon about the physiological blind spot to the neuroanatomical finding that the blind spot is represented in V1-L4, we can infer that V1-L4 is the neural substrate for color sensations in the human brain. This neural substrate is bi-monocular in that the excitatory neurons there are monocular in terms of thalamic inputs, but the two eyes' monocular neurons sit side by side there, so together they can represent binocular information. In short, bi-monocularity is a prominent attribute of color vision worthy of further investigation. This quality of color sensation has an obvious and important implication for devices that contain eye-based displays. For example, no commercially available VR headset (e.g., Apple's Vision Pro and Meta's Quest products) currently offers separate color filter settings for the two eyes of an individual user. Adding this capability would benefit the uniocular color-deficient population.

Metamer mismatch bodies (MMBs) quantify the extent of metamer mismatching for a given color stimulus with a change in color mechanism (i.e. change in illuminant and/or observer). Prior work has shown that the MMB boundary can be efficiently approximated by spherical sampling of unit directions in the 6D joint color-mechanism space, and for each sampled direction, maximizing the boundary point subject to the metameric cross-section constraints. Many sampled directions map to the same boundary vertex, so the number of recovered vertices is typically far smaller than the number of sampled directions. This produces a plausible approximation, but the resulting boundary vertices, expressed in sensor-response spaces (for example XYZ, LMS, or RGB), are often distributed in a highly non-uniform manner. Increasing the number of sampled directions increases the number of recovered vertices but does not improve boundary uniformity. We explored a simple post-processing workflow that builds a larger candidate pool of vertices and then selects a fixed-size subset using a spacing-driven sampling algorithm, improving vertex uniformity as measured by a nearest-neighbor metric. This approach substantially improves vertex uniformity in sensor space, but it can discard boundary-defining extreme vertices, potentially altering hull volume and other distinguishing boundary features. We therefore argue that any practical workflow for improving MMB vertex uniformity should include an explicit mechanism for retaining boundary-critical extremes prior to applying spacing-driven selection.
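
The recommended workflow (retain boundary-critical extremes first, then apply spacing-driven selection) can be sketched as below. This is an illustration under assumptions, not the authors' algorithm: per-axis minima and maxima stand in for "boundary-critical extremes", greedy farthest-point sampling stands in for the spacing-driven selection, and the coefficient of variation of nearest-neighbor distances stands in for the nearest-neighbor uniformity metric. All function names are hypothetical.

```python
import math
import statistics


def extreme_indices(points):
    """Indices of the points attaining the minimum or maximum along each axis."""
    keep = set()
    for axis in range(len(points[0])):
        values = [p[axis] for p in points]
        keep.add(values.index(min(values)))
        keep.add(values.index(max(values)))
    return sorted(keep)


def select_subset(points, k):
    """Pick k indices: axis extremes first, then greedy farthest-point sampling."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of candidate points")
    selected = extreme_indices(points)[:k]
    # Distance from every candidate to its nearest already-selected point.
    nearest = [min(math.dist(p, points[s]) for s in selected) for p in points]
    while len(selected) < k:
        chosen = set(selected)
        remaining = (i for i in range(len(points)) if i not in chosen)
        nxt = max(remaining, key=nearest.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return selected


def nn_spacing_cv(points):
    """Coefficient of variation of nearest-neighbor distances (lower means more uniform)."""
    nn = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return statistics.pstdev(nn) / statistics.mean(nn)


# Hypothetical 2D candidate pool: square corners plus two clustered interior points.
pool = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.05, 0.05)]
subset_idx = select_subset(pool, 4)
subset = [pool[i] for i in subset_idx]
```

Seeding the selection with the extremes guarantees they survive regardless of spacing. For a real MMB, a better notion of "boundary-critical" (for example, convex-hull membership) would likely be needed, since axis extremes alone do not preserve hull volume.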

Digital camera-based rear‑view systems are increasingly introduced as alternatives to traditional mirrors, offering potential benefits such as improved aerodynamics, reduced blind spots, and enhanced visibility. However, these systems alter the visual cues available to drivers by presenting a fixed monocular image, which might affect how distance and approach speed are judged. This study examines how driver age influences distance estimation and decisions related to overtaking, while also assessing the impact of camera field of view and camera height. Fifty‑eight licensed car drivers viewed thirty‑six high‑resolution driving clips showing forward‑road scenes and digital rear‑view perspectives with systematically varied camera settings. They completed two tasks: judging the distance of an approaching vehicle and indicating the last moment at which a lane change would be considered safe. Age influenced both perceptual judgements and lane‑change decisions, though not always in the expected direction. Older drivers showed smaller overall errors, yet at wider fields of view they often shifted into overestimation, while younger drivers maintained conservative underestimation. Older drivers nevertheless selected more cautious lane‑change timings in certain conditions. Apparent accuracy advantages may reflect reduced bias rather than consistently safer perception. The results highlight the importance of accounting for user diversity when evaluating camera-based rear‑view systems and developing age‑inclusive design strategies.

Imagery from optical see-through (OST) head-mounted displays (HMDs) is perceived as a blending of light emitted by the display added to the light from the user’s physical environment, which can result in color distortions and desaturation of the virtual imagery. Due to these limitations, the user’s ability to distinguish between colors shown on the display may be reduced compared to more traditional types of displays, which may impact the interpretation of the symbology, and potentially reduce performance. Further, individual variation in color perception may also impact the utility of color symbology in OST HMDs. In this paper, we present a user study that investigates the utility of color-coded symbology displayed on an OST Augmented Reality (AR) display within a flight simulator. We compare performance between participants with normal color vision and participants with color vision deficiencies in a dynamic flight simulator and investigate effects of symbology contrast and symbology color set on participant response times, accuracy, and eye behavior. Our results suggest that for the color sets tested, increasing the size of the set beyond a monochrome color results in reduced performance for both color normal and color deficient subjects. It’s possible that custom color sets specific to OST displays are needed to achieve performance benefits.