Pages 226-1 - 226-7, © 2025 Society for Imaging Science and Technology
Volume 37
Issue 10
Abstract

In this paper, we present a computationally efficient gamut mapping algorithm designed for tone-mapped images, focusing on preserving hue fidelity while providing the flexibility to retain either luminance or saturation for visually consistent results. The algorithm operates in both RGB and YUV color spaces, enabling practical implementation in hardware and software for real-time systems. We demonstrate that the proposed method effectively mitigates hue shifts during gamut mapping, offering a computationally viable alternative to more complex methods based on perceptually uniform color spaces.
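
As a rough, generic illustration of the hue-preserving idea (a sketch only, not the algorithm proposed in the paper), an out-of-gamut RGB value can be desaturated toward its luma along a constant-hue line until it fits the target gamut; the Rec. 709 luma weights and the [0, 1] gamut below are assumptions.

    import numpy as np

    # Illustrative sketch only: a generic hue-preserving gamut clamp, not the
    # paper's algorithm. Out-of-gamut RGB values are pulled toward their luma
    # along a constant-hue line (this variant keeps luminance and sacrifices
    # saturation) until all channels fit in [0, 1].

    REC709_LUMA = np.array([0.2126, 0.7152, 0.0722])  # assumed luma weights

    def hue_preserving_clamp(rgb):
        """Clamp an RGB triplet into [0, 1] by desaturating toward its luma."""
        y = float(REC709_LUMA @ rgb)          # scalar luma of the input color
        y = min(max(y, 0.0), 1.0)             # keep the gray anchor in gamut
        lo, hi = rgb.min(), rgb.max()
        # Largest t in [0, 1] such that y + t * (rgb - y) stays inside [0, 1].
        t = 1.0
        if hi > 1.0:
            t = min(t, (1.0 - y) / (hi - y))
        if lo < 0.0:
            t = min(t, (0.0 - y) / (lo - y))
        return y + t * (rgb - y)

    print(hue_preserving_clamp(np.array([1.4, 0.2, -0.1])))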

Digital Library: EI
Published Online: February  2025
Pages 227-1 - 227-6, © 2025 Society for Imaging Science and Technology
Volume 37
Issue 10
Abstract

Image denoising is a crucial task in image processing, aiming to enhance image quality by effectively eliminating noise while preserving essential structural and textural details. In this paper, we introduce a novel denoising algorithm that integrates residual Swin transformer blocks (RSTB) with the concept of classical non-local means (NLM) filtering. The proposed solution aims to strike a balance between performance and computational complexity and is structured into three main components: (1) feature extraction, which uses RSTBs in a multi-scale approach to capture diverse image features; (2) multi-scale feature matching inspired by NLM, which computes pixel similarity through learned embeddings, enabling accurate noise reduction even in high-noise scenarios; and (3) residual detail enhancement using a Swin transformer block to recover high-frequency details lost during denoising. Our extensive experiments demonstrate that the proposed model, with 743k parameters, achieves the best or competitive performance among state-of-the-art models with a comparable number of parameters. This makes the proposed solution a preferred option for applications prioritizing detail preservation under limited compute resources. Furthermore, the proposed solution is flexible enough to adapt to other image restoration problems such as deblurring and super-resolution.
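
As a toy illustration of the NLM-inspired matching step (similarity computed over learned embeddings rather than raw patches), a minimal sketch is given below with assumed tensor shapes; it does not reproduce the RSTB-based architecture described in the paper.

    import torch
    import torch.nn.functional as F

    # Sketch of non-local-means-style aggregation over learned embeddings,
    # assuming features of shape (B, C, H, W). This only illustrates the
    # matching idea in spirit; the actual multi-scale RSTB network is not shown.

    def nlm_over_embeddings(feats, values, temperature=0.1):
        b, c, h, w = feats.shape
        q = feats.flatten(2).transpose(1, 2)        # (B, H*W, C) query embeddings
        k = q                                       # keys = same embeddings
        v = values.flatten(2).transpose(1, 2)       # (B, H*W, C) noisy values
        sim = torch.cdist(q, k) ** 2                # squared embedding distances
        weights = F.softmax(-sim / temperature, dim=-1)  # NLM-style similarity weights
        out = weights @ v                           # weighted average of values
        return out.transpose(1, 2).reshape(b, -1, h, w)

    x = torch.randn(1, 16, 8, 8)
    print(nlm_over_embeddings(x, x).shape)          # torch.Size([1, 16, 8, 8])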

Digital Library: EI
Published Online: February  2025
Pages 233-1 - 233-6, © 2025 Society for Imaging Science and Technology
Volume 37
Issue 10
Abstract

While conventional video fingerprinting methods act in the uncompressed domain (pixels and/or representations directly derived from pixels), the present paper establishes a proof of concept for compressed-domain video fingerprinting. Visual content is thus processed at the level of compressed-stream syntax elements (luma/chroma coefficients and intra prediction modes) by an in-house neural-network-based solution built on conventional CNN models (ResNet and MobileNet). The experimental validation, obtained by processing a state-of-the-art and an in-house HEVC compressed video database, yields accuracy, precision, and recall values larger than 0.9.
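
A minimal sketch of how such a compressed-domain pipeline could look, assuming the syntax elements are rasterized into fixed-size planes and fed to a standard ResNet backbone; the three-channel input layout, backbone choice, and output size are illustrative assumptions, not the authors' exact configuration.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    # Hedged sketch: feeding compressed-domain syntax elements to a CNN backbone.
    # The 3-plane input (luma coefficients, chroma coefficients, intra modes)
    # and the fingerprint dimension are assumptions, not the paper's setup.

    class CompressedDomainFingerprinter(nn.Module):
        def __init__(self, num_fingerprints=128):
            super().__init__()
            self.backbone = resnet18(weights=None)
            # Replace the classification head with a fingerprint embedding layer.
            self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_fingerprints)

        def forward(self, syntax_planes):
            # syntax_planes: (B, 3, H, W) maps built from stream syntax elements.
            return self.backbone(syntax_planes)

    model = CompressedDomainFingerprinter()
    dummy = torch.randn(2, 3, 224, 224)
    print(model(dummy).shape)   # torch.Size([2, 128])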

Digital Library: EI
Published Online: February  2025
Pages 234-1 - 234-6, © 2025 Society for Imaging Science and Technology
Volume 37
Issue 10
Abstract

Video streaming accounts for more than 80% of the carbon emissions generated by worldwide digital technology consumption, which in turn accounts for 5% of worldwide carbon emissions. Hence, green video encoding has emerged as a research field devoted to reducing the size of video streams and the complexity of encoding/decoding operations while maintaining a pre-established visual quality. With the specific aim of tracking green-encoded video streams, the present paper studies the possibility of identifying the last video encoder used in multiple re-encoding distribution scenarios. To this end, classification solutions built on the VGG, ResNet, and MobileNet families are considered to discriminate among MPEG-4 AVC stream syntax elements, such as luma/chroma coefficients or intra prediction modes. The video content amounts to 2 hours and is structured in two databases. Three encoders are alternatively studied: a proprietary green-encoder solution and the two default encoders available on a large video sharing platform and on a popular social media platform, respectively. The quantitative results show classification accuracy ranging from 75% to 100%, depending on the specific architecture, subset of classified elements, and dataset.
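
For illustration only, a three-way encoder classifier over rasterized syntax-element maps could be set up with a MobileNet backbone as below; the input representation, image size, and class count mapping are assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn
    from torchvision.models import mobilenet_v2

    # Hedged sketch of the encoder-identification setup: a MobileNet backbone
    # classifying syntax-element maps into 3 assumed encoder classes (green
    # encoder, sharing-platform default, social-media default).

    def build_encoder_classifier(num_encoders=3):
        net = mobilenet_v2(weights=None)
        # Swap the final linear layer for a 3-class encoder-identification head.
        net.classifier[-1] = nn.Linear(net.classifier[-1].in_features, num_encoders)
        return net

    clf = build_encoder_classifier()
    maps = torch.randn(4, 3, 224, 224)      # batch of syntax-element maps
    print(clf(maps).shape)                  # torch.Size([4, 3])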

Digital Library: EI
Published Online: February  2025
Pages 237-1 - 237-6, © 2025 Society for Imaging Science and Technology
Volume 37
Issue 10
Abstract

Assessing distances between images and between image datasets is a fundamental task in vision-based research. It remains a challenging open problem, and despite the criticism it receives, the most ubiquitous method remains the Fréchet Inception Distance. The Inception network is trained on a specific labeled dataset, ImageNet, which is at the core of the criticism it has received in recent research. Improvements were shown by moving to self-supervised learning over ImageNet, leaving the choice of training data domain as an open question. We take that last leap and provide the first analysis of domain-specific feature training and its effect on feature distances, on the widely researched facial image domain. We report our findings and insights on this domain specialization for Fréchet distance and image neighborhoods, supported by extensive experiments and in-depth user studies.
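
For reference, the Fréchet distance underlying FID compares two Gaussians fitted to the feature sets; a minimal sketch follows, with the feature extractor (Inception, a self-supervised model, or a domain-specialized one as studied here) assumed to be given and the feature dimension chosen arbitrarily.

    import numpy as np
    from scipy import linalg

    # Standard Fréchet distance between Gaussians fitted to two feature sets,
    # as used in FID: ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)).
    # The feature extractor producing feats_a / feats_b is assumed given.

    def frechet_distance(feats_a, feats_b):
        mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
        cov_a = np.cov(feats_a, rowvar=False)
        cov_b = np.cov(feats_b, rowvar=False)
        covmean = linalg.sqrtm(cov_a @ cov_b)       # matrix square root
        if np.iscomplexobj(covmean):
            covmean = covmean.real                  # drop tiny imaginary parts
        diff = mu_a - mu_b
        return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

    a = np.random.randn(1000, 64)                   # features of dataset A
    b = np.random.randn(1000, 64) + 0.5             # features of dataset B
    print(frechet_distance(a, b))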

Digital Library: EI
Published Online: February  2025
