Visibility of image artifacts depends on the viewing conditions, such as display brightness and the distance to the display. However, most image and video quality metrics operate under the assumption of a single standard viewing condition, without considering luminance or viewing distance. To address this limitation, we isolate brightness and distance as the components impacting the visibility of artifacts and collect a new dataset for visually lossless image compression. The dataset includes images encoded with JPEG and WebP at the quality level that makes compression artifacts imperceptible to an average observer. The visibility thresholds are collected under two luminance conditions: 10 cd/m², simulating a dimmed mobile phone, and 220 cd/m², a typical peak luminance of modern computer displays; and two distance conditions: 30 and 60 pixels per visual degree. The dataset was used to evaluate how well existing image quality and visibility metrics account for display brightness and viewing distance. Our experiments also include two deep neural network architectures proposed for controlling image compression for visually lossless coding.
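The two distance conditions above are expressed in pixels per visual degree (ppd), which depends only on viewing distance and pixel pitch. A minimal sketch of the conversion (the function name and the 0.25 mm pixel pitch in the comment are illustrative choices, not values from the paper):

```python
import math

def pixels_per_degree(distance_mm: float, pixel_pitch_mm: float) -> float:
    """Number of display pixels subtended by one degree of visual angle."""
    # One degree of visual angle spans 2 * d * tan(0.5 deg) on the display plane.
    span_mm = 2.0 * distance_mm * math.tan(math.radians(0.5))
    return span_mm / pixel_pitch_mm

# For a display with a 0.25 mm pixel pitch (~100 dpi), 30 ppd corresponds to
# viewing from roughly 43 cm, and 60 ppd to roughly 86 cm.
```

Because distance enters linearly, doubling the viewing distance doubles the ppd, which is exactly the relation between the paper's 30 and 60 ppd conditions.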
Finding a point in the intersection of two closed convex sets is a common problem in image processing and other areas. Projections onto convex sets (POCS) is a standard algorithm for finding such a point. Dykstra's projection algorithm is a well-known alternative that finds the point in the intersection closest to a given point. Yet another, lesser-known alternative is the alternating direction method of multipliers (ADMM), which can be used for both purposes. In this paper we discuss the differences in the convergence of these algorithms on image processing problems. ADMM applied to finding an arbitrary point in the intersection is much faster than POCS and than any of the algorithms for finding the nearest point in the intersection.
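The difference between the two classical feasibility algorithms can be illustrated on a toy problem with two sets whose projections are known in closed form (the unit ball and a halfspace here are illustrative choices, not the sets used in the paper):

```python
import numpy as np

def proj_ball(x, r=1.0):
    """Project onto the closed ball of radius r centered at the origin."""
    n = np.linalg.norm(x)
    return x if n <= r else r * x / n

def proj_halfspace(x, a=np.array([1.0, 1.0]), b=1.0):
    """Project onto the halfspace {x : a.x >= b}."""
    v = a @ x - b
    return x if v >= 0 else x - v * a / (a @ a)

def pocs(x0, iters=500):
    """Alternating projections: converges to SOME point of the intersection."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        x = proj_halfspace(proj_ball(x))
    return x

def dykstra(x0, iters=500):
    """Dykstra's algorithm: converges to the intersection point NEAREST x0."""
    x = np.asarray(x0, float)
    p = np.zeros_like(x)
    q = np.zeros_like(x)
    for _ in range(iters):
        y = proj_ball(x + p)        # project the corrected iterate onto set A
        p = x + p - y               # update the correction for set A
        x = proj_halfspace(y + q)   # project the corrected iterate onto set B
        q = y + q - x               # update the correction for set B
    return x
```

The only difference from plain POCS is the pair of correction terms `p` and `q`, which is what buys Dykstra's algorithm the nearest-point property at the price of slower convergence in practice.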
In images, the representation of the glossiness, translucency, and roughness of material objects (Shitsukan) is essential for realistic image reproduction. To date, image coding has been developed considering various indices of the quality of the encoded image, for example, the peak signal-to-noise ratio. Consequently, image coding methods that preserve subjective impressions of qualities such as Shitsukan have not been studied. In this study, the authors focus on the property of glossiness and propose a method of glossiness-aware image coding. Their purpose is to develop an encoding algorithm that produces images decodable by standard JPEG decoders, which are commonly used worldwide. The proposed method consists of three procedures: block classification, glossiness enhancement, and non-glossiness information reduction. In block classification, each block of the target image is classified by the type of glossiness it contains. In glossiness enhancement, the glossiness in each type of block is emphasized to reduce its degradation during JPEG encoding. The third procedure, non-glossiness information reduction, further compresses the image while maintaining the glossiness by reducing the information in each block that does not represent glossiness. To test the effectiveness of the proposed method, the authors conducted a subjective evaluation experiment using paired comparisons of images coded by the proposed method and JPEG images of the same data size. The glossiness was found to be better preserved in images coded by the proposed method than in the JPEG images.
Practical steganalysis inevitably involves the necessity to deal with a diverse cover source. In the JPEG domain, one key element of the diversification is the JPEG quality factor or, more generally, the JPEG quantization table used for compression. This paper investigates experimentally the scalability of various steganalysis detectors w.r.t. JPEG quality. In particular, we report that CNN detectors as well as older feature-based detectors have the capacity to contain the complexity of multiple JPEG quality factors within a single model when the quality factors are properly grouped based on their quantization tables. Detectors trained on multiple JPEG qualities show no loss of detection accuracy when compared with dedicated detectors trained for a specific JPEG quality factor. We also demonstrate that CNNs (but much less so feature-based classifiers) trained on multiple qualities generalize to unseen custom quantization tables better than detectors trained for specific JPEG qualities. Generalizing to very different quantization tables, however, remains challenging. A semi-metric comparing quantization tables is introduced and used to interpret our results.
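The paper's semi-metric on quantization tables is not reproduced here, but the idea can be illustrated with a simple symmetric dissimilarity between two 8x8 tables (this particular formula is our illustrative choice, not the one introduced in the paper):

```python
import numpy as np

def qtable_dissimilarity(q1, q2):
    """Mean squared log-ratio of quantization steps.

    Symmetric and zero iff the tables are equal, but squared differences
    do not guarantee the triangle inequality -- hence only a semi-metric.
    """
    q1 = np.asarray(q1, float)
    q2 = np.asarray(q2, float)
    return float(np.mean((np.log(q1) - np.log(q2)) ** 2))
```

Working in the log domain makes the comparison scale-aware: a table with all steps doubled is at the same distance from the original regardless of the original's overall coarseness.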
A new rule for modulating costs in side-informed steganography is proposed. The modulation factor of each cost is determined by the minimum perturbation of the precover needed to quantize to the desired stego value. This new rule is contrasted with the established way of weighting costs by the difference between the rounding errors to the cover and stego values. Experiments demonstrate that the new rule improves the security of ternary side-informed UNIWARD in the JPEG domain. The new rule arises naturally as the correct cost modulation for JPEG side-informed steganography with the “trunc” quantizer used in many portable digital imaging devices.
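On our reading of the rule described above (the formulas below are a hedged reconstruction from the abstract, not taken from the paper): for a precover DCT value x with cover value c = round(x) and rounding error e = x - c, the established modulation weights both ternary embedding directions by the difference of the rounding errors to the cover and stego values, 1 - 2|e|, while the new rule weights each direction by the minimum perturbation of x needed to quantize to that stego value, i.e. 0.5 - e for c + 1 and 0.5 + e for c - 1:

```python
def modulation_factors(x):
    """Cost-modulation factors for the two ternary embedding directions.

    Hedged reconstruction based on the abstract, not the paper's code.
    """
    c = round(x)
    e = x - c                        # rounding error, in [-0.5, 0.5]
    established = 1 - 2 * abs(e)     # same factor for both directions
    new_plus = 0.5 - e               # min perturbation so x quantizes to c + 1
    new_minus = 0.5 + e              # min perturbation so x quantizes to c - 1
    return established, new_plus, new_minus
```

Note that the two new factors always sum to 1, so unlike the established rule they are direction-dependent: the stego value that the precover is already closer to receives the smaller (cheaper) modulation factor.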
The task of additional lossless compression of JPEG images is considered. We propose to decode the JPEG image and recompress it with the lossy BPG (Better Portable Graphics) codec, which is based on a subset of the open HEVC video compression standard. The decompressed and smoothed BPG image is then used to calculate and quantize DCT coefficients in 8x8 image blocks using the quantization tables of the source JPEG image. The difference between the obtained quantized DCT coefficients and the quantized DCT coefficients of the source JPEG image (the prediction error) is calculated and losslessly compressed by the proposed context modeling and arithmetic coding. In this way, the source JPEG image is replaced by two files: the compressed BPG image and the compressed difference, which is needed for lossless restoration of the source JPEG image. It is shown that the proposed approach provides compression ratios comparable with those of the state-of-the-art PAQ8, WinZip, and STUFFIT file archivers. At the same time, the BPG images may be used for fast preview of the compressed JPEG images.
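The prediction-error step can be sketched as follows: quantize the DCT coefficients of both the decoded BPG approximation and the source pixels with the source JPEG quantization table, then subtract. This is a minimal illustration with an orthonormal 8x8 DCT; in the actual scheme the source quantized coefficients would be read straight from the JPEG file rather than recomputed:

```python
import numpy as np

def dct2_8x8(block):
    """Orthonormal 8x8 2-D DCT-II computed as C @ block @ C.T."""
    k = np.arange(8)
    C = np.sqrt(2.0 / 8.0) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / 16)
    C[0] /= np.sqrt(2.0)
    return C @ block @ C.T

def quantized_coeffs(block, qtable):
    """Level-shift, transform, and quantize one 8x8 pixel block."""
    return np.round(dct2_8x8(block - 128.0) / qtable).astype(int)

def prediction_error(source_block, bpg_block, qtable):
    """Residual to be entropy coded; it is zero wherever the BPG image
    predicts the source JPEG coefficient exactly."""
    return quantized_coeffs(source_block, qtable) - quantized_coeffs(bpg_block, qtable)
```

The better BPG approximates the decoded JPEG, the sparser this residual becomes, which is what makes the subsequent context modeling and arithmetic coding effective.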
One of the main approaches to additional lossless compression of JPEG images is to decode the quantized values of the discrete cosine transform (DCT) coefficients and then recompress the coefficients more effectively. The amplitudes of DCT coefficients are highly correlated and can be compressed effectively. At the same time, the signs of DCT coefficients, which occupy up to 20% of the compressed image, are often considered unpredictable. In this paper, a new and effective method for compressing the signs of quantized DCT coefficients is proposed. The proposed method takes into account both the correlation between DCT coefficients of the same block and the correlation between DCT coefficients of neighboring blocks. For each of the 64 DCT coefficients, the positions of 3 reference coefficients inside the block are determined and stored in the compressed file; 4 reference coefficients with fixed positions are taken from the neighboring blocks. For these reference coefficients, 15 frequency models are used to predict the sign of a given coefficient. The resulting 7 probabilities (that the sign is negative) are mixed by logistic mixing. For a test set of JPEG images, we show that the proposed method compresses the signs of DCT coefficients by a factor of 1.1 to 1.3, significantly outperforming its nearest analogues.
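The final mixing step can be sketched as follows: each reference coefficient yields an estimate of the probability that the sign is negative, and the estimates are combined in the logit domain. This is a minimal sketch; in practice the mixing weights would be adapted online as in PAQ-style context mixing, which is omitted here:

```python
import math

def logistic_mix(probs, weights):
    """Combine probability estimates by a weighted sum of their logits."""
    t = sum(w * math.log(p / (1.0 - p)) for p, w in zip(probs, weights))
    return 1.0 / (1.0 + math.exp(-t))

# Seven estimates (3 in-block + 4 neighbor-block references), equal weights:
p = logistic_mix([0.6, 0.7, 0.55, 0.5, 0.65, 0.4, 0.8], [1.0] * 7)
```

A useful property of logit-domain mixing is that a non-committal estimate of 0.5 contributes a zero logit and therefore has no influence on the mixture, while confident estimates dominate it.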
In natural steganography, the secret message is hidden by adding to the cover image a noise signal that mimics the heteroscedastic noise introduced naturally during acquisition. The method requires the cover image to be available in its RAW form (the sensor capture). To bring this idea closer to a practical embedding method, in this paper we embed the message in the quantized DCT coefficients of a JPEG file by adding independent realizations of the heteroscedastic noise to the pixels, making the embedding resemble the same cover image acquired at a larger sensor ISO setting (the so-called cover-source switch). To demonstrate the feasibility and practicality of the proposed method and to validate our simplifying assumptions, we work with two digital cameras, one using a monochrome sensor and the other equipped with a color sensor. We then explore several versions of the embedding algorithm depending on the model of the added noise in the DCT domain and the possible use of demosaicking to convert the raw image values. These experiments indicate that the demosaicking step has a significant impact on statistical detectability for high JPEG quality factors when making independent embedding changes to DCT coefficients. Additionally, for monochrome sensors or low JPEG quality factors, very large payloads can be embedded with high empirical security.
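The noise being mimicked can be illustrated with the standard heteroscedastic sensor model, in which the noise variance grows linearly with the signal. To simulate the cover-source switch, one adds Gaussian noise whose variance is the difference between the high-ISO and low-ISO models (the model parameters below are illustrative, not camera measurements):

```python
import numpy as np

def embedding_noise(raw, a_low, b_low, a_high, b_high, rng):
    """Noise that, added to a low-ISO capture, mimics the extra
    heteroscedastic noise of the same scene at a higher ISO setting.

    Per-pixel variance model: var(raw) = a * raw + b.
    """
    extra_var = np.maximum((a_high - a_low) * raw + (b_high - b_low), 0.0)
    return rng.normal(0.0, np.sqrt(extra_var))

rng = np.random.default_rng(0)
raw = np.full(100_000, 500.0)   # flat raw patch, illustrative values
n = embedding_noise(raw, a_low=0.1, b_low=1.0, a_high=0.3, b_high=4.0, rng=rng)
# extra variance per pixel: (0.3 - 0.1) * 500 + (4 - 1) = 103
```

Because the variances of independent Gaussians add, a low-ISO capture plus this noise has the same per-pixel variance as a capture taken directly at the higher ISO, which is exactly the statistical indistinguishability the embedding relies on.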
In this work, an image scrambling algorithm is proposed that is robust to transcoding, so that a scrambled image can still be successfully descrambled after it has been transcoded. Transcoding schemes such as resizing, color space conversion, and color format conversion are considered, as they are commonly applied by multimedia-sharing Instant Messaging (IM) applications. To withstand such transcoding, the following methods are applied: MCU block shuffle scrambling, restricted DCT coefficient sign randomization, restricted DCT coefficient swapping within an MCU block, and adaptive DCT coefficient scaling and clipping depending on the image, followed by a post-update. The resulting scrambled image exhibits a strong visual scrambling effect with only a small increase in bitstream size compared to the original image. The scramble key, on which the scrambling algorithm depends, is encrypted with the authorized user's information and embedded in the scrambled image prior to transmission, so that the image can be descrambled without additional servers for key management. These results allow scrambled images to be shared via IMs and descrambled successfully without key management servers, potentially enabling new multimedia services for smartphones.
This paper presents a new method for recompressing a JPEG crypto-compressed image. We propose a crypto-compression method that allows recompression without any information about the encryption key. The recompression can be executed directly on the JPEG bitstream by removing the last bit of the code representation of a non-null DCT coefficient and adapting its Huffman code part. To make this possible, the quantization table is slightly modified to compensate for these changes. The method recompresses a JPEG crypto-compressed image efficiently in terms of compression ratio. Moreover, since the encryption is fully reversible, decrypting the recompressed image produces an image with a visual quality similar to that of the original compressed image.
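The bit-removal step can be sketched as follows: JPEG codes a non-null coefficient as a Huffman-coded size category followed by that many amplitude bits, so removing the last amplitude bit maps the magnitude |v| to |v| >> 1 and shrinks the size category by one; doubling the corresponding quantization step keeps the dequantized value approximately unchanged. This is a hedged illustration of our reading of the abstract, not the paper's implementation:

```python
def drop_last_amplitude_bit(value):
    """Halve the magnitude of a non-zero quantized DCT coefficient,
    as if its last amplitude bit were removed from the bitstream."""
    sign = 1 if value > 0 else -1
    return sign * (abs(value) >> 1)

def adapt_qstep(qstep):
    """Double the quantization step so dequantized values stay comparable."""
    return 2 * qstep
```

For example, a coefficient 13 (size category 4) becomes 6 (category 3); with the quantization step q doubled, dequantization gives 6 * 2q = 12q, which approximates the original 13q, and the operation needs no knowledge of how the coefficient signs or amplitudes were encrypted.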