Invertible embedding allows both the original cover and the embedded data to be perfectly reconstructed. Conventional methods rely on a well-designed predictor to fully exploit the characteristics of the carrier. However, given the diversity of covers, it is hard to model arbitrary content accurately, which limits the practical use of methods that depend heavily on content characteristics. This has motivated us to revisit invertible embedding operations and to propose a general graph matching model that both generalizes them and further reduces the embedding distortion. In this model, the rate-distortion optimization task of invertible embedding is cast as a weighted bipartite graph matching problem. In the bipartite graph, the nodes represent the values of cover elements and the edges indicate candidate modifications; each edge carries a weight equal to the embedding distortion incurred for the connected nodes. By solving the minimum-weight maximum matching problem, we can find the optimal embedding strategy under the given constraint. Since the proposed model is general, it can be incorporated into existing schemes to improve their performance or used to design new invertible embedding systems. We incorporate it into several state-of-the-art methods, and experiments show that it significantly improves their rate-distortion performance. To the best of the authors' knowledge, this is the first work to study the rate-distortion optimization of invertible embedding from the perspective of graph matching.
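As a purely illustrative sketch (not the paper's implementation), the assignment at the core of such a model can be handed to an off-the-shelf minimum-weight bipartite matching solver; the toy distortion matrix below is our own assumption.

```python
# Minimum-weight maximum matching on a bipartite graph via SciPy's
# assignment solver. Rows: cover element values; columns: candidate
# modified values. cost[i, j] is the (toy) embedding distortion of
# mapping value i to value j.
import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 1.0],
    [4.0, 1.0, 0.0],
])

rows, cols = linear_sum_assignment(cost)  # optimal assignment
print(list(zip(rows, cols)), cost[rows, cols].sum())
```

The solver returns the pairing of cover values to modified values with minimum total distortion, which is exactly the structure of the optimization the abstract describes.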
Recent advances in deep learning (DL) have led to great success in computer vision and pattern recognition tasks. Sharing pre-trained DL models has become an important means of accelerating the progress of the research community and the development of DL-based systems. However, it also raises challenges for model authentication: it is necessary to protect the ownership of DL models before they are released. In this paper, we present a digital watermarking technique for deep neural networks (DNNs). We propose to mark a DNN by inserting an independent neural network that allows us to use selected weights for watermarking. This independent network is used only in the training and watermark verification phases and is never released publicly. Experiments show that the performance of the marked DNN on its original task is not significantly degraded. Meanwhile, the watermark can be successfully embedded and extracted with low loss even under common attacks such as model fine-tuning and compression, demonstrating the superiority and applicability of the proposed work.
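The fragment below is a hedged sketch of the general idea of coupling selected host weights to a private extractor network; the layer choice, sizes, and the binary cross-entropy regularizer are illustrative assumptions, not the authors' exact construction.

```python
# Sketch: a private "extractor" network maps a chosen subset of host-model
# weights to watermark bits; it is trained jointly with the host but never
# released. All names and dimensions here are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
wm_bits = torch.randint(0, 2, (64,)).float()    # watermark payload
host = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
extractor = nn.Linear(512, 64)                  # kept private

def watermark_loss():
    # Project a selected slice of host weights onto watermark logits.
    selected = host[0].weight.flatten()[:512]
    logits = extractor(selected)
    return nn.functional.binary_cross_entropy_with_logits(logits, wm_bits)

# Training:     total = task_loss + lambda_wm * watermark_loss()
# Verification: bits = torch.sigmoid(extractor(host[0].weight.flatten()[:512])) > 0.5
```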
Signal-rich art is an application of watermarking technology that communicates information through visible artistic patterns. In this paper we present new methods for generating signal-carrying patterns, simplifications of earlier methods, a way to embed a vector watermark signal in applications, and a way to use signal symmetries to expand the detection envelope of a watermark reader.
Reading a digital watermark from printed images requires that the watermarking system read correctly after affine distortions. One way to recover from affine distortion is to add a synchronization signal in the Fourier frequency domain and use it to estimate the applied distortion. If the synchronization signal consists of a collection of frequency impulses, then a least-squares match of impulse locations yields a reasonably accurate estimate of the linear transform. Nearest-neighbor peak location estimation provides a good rough estimate of the linear transform, and a more accurate refinement of the least-squares estimate is obtained with partial-pixel peak location estimates. In this paper we show how to estimate peak locations to any desired accuracy using only the complex frequencies computed by the standard DFT, and we show that these improved peak location estimates result in a more accurate linear transform estimate. We conclude with an assessment of the detector robustness gained from this improved linear transform accuracy.
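To illustrate the least-squares step, the sketch below recovers a 2x2 linear transform from matched synchronization-peak locations; the peak coordinates, transform, and noise level are hypothetical.

```python
# Least-squares estimation of a linear transform A from matched
# frequency-peak locations: obs ≈ ref @ A.T, solved for A.
import numpy as np

ref = np.array([[32.0, 0.0], [0.0, 32.0], [24.0, 24.0], [-16.0, 40.0]])
A_true = np.array([[1.05, 0.08], [-0.04, 0.97]])          # unknown distortion
obs = ref @ A_true.T + 0.05 * np.random.default_rng(2).standard_normal(ref.shape)

# Solve ref @ X = obs in the least-squares sense; then A_est = X.T.
X, *_ = np.linalg.lstsq(ref, obs, rcond=None)
print(X.T)                                                 # ≈ A_true
```

More accurate peak locations (the paper's contribution) directly shrink the residuals in this fit, which is why they translate into a better transform estimate.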
Practical steganalysis inevitably involves dealing with a diverse cover source. In the JPEG domain, one key element of this diversity is the JPEG quality factor or, more generally, the quantization table used for compression. This paper experimentally investigates the scalability of various steganalysis detectors with respect to JPEG quality. In particular, we report that CNN detectors, as well as older feature-based detectors, can contain the complexity of multiple JPEG quality factors within a single model when the quality factors are properly grouped according to their quantization tables. Detectors trained on multiple JPEG qualities show no loss of detection accuracy compared with dedicated detectors trained for a specific quality factor. We also demonstrate that CNNs trained on multiple qualities (but not so much feature-based classifiers) generalize to unseen custom quantization tables better than detectors trained for a single JPEG quality. Generalizing to very different quantization tables, however, remains challenging. A semi-metric for comparing quantization tables is introduced and used to interpret our results.
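The paper's semi-metric is not reproduced here; purely as a hypothetical illustration of what comparing quantization tables can look like, one could use a mean absolute log-ratio between corresponding table entries.

```python
# Hypothetical table-comparison measure (NOT the paper's semi-metric):
# mean absolute log-ratio between two 8x8 quantization tables.
import numpy as np

def qtable_semimetric(q1, q2):
    q1 = np.asarray(q1, dtype=float)
    q2 = np.asarray(q2, dtype=float)
    return float(np.mean(np.abs(np.log(q1 / q2))))

# Identical tables give 0; uniformly doubled quantization gives log(2).
q = np.full((8, 8), 16.0)
print(qtable_semimetric(q, q), qtable_semimetric(q, 2 * q))
```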
Image steganography has legitimate uses, for example augmenting an image with a watermark for copyright reasons, but it can also serve malicious purposes. We investigate the detection of malicious steganography using neural network-based classification when images are transmitted through a noisy channel. Noise makes detection harder because the classifier must not only detect perturbations in the image but also decide whether they stem from malicious steganographic modifications or from natural noise. Our results show that reliable detection is possible even for state-of-the-art steganographic algorithms that insert stego bits without affecting an image's visual quality. The detection accuracy is high (above 85%) when the payload, i.e., the amount of steganographic content in an image, exceeds a certain threshold. At the same time, noise critically affects the steganographic information being transmitted, both through desynchronization (destroying the information about which bits of the image carry the steganographic payload) and by flipping those bits themselves. This forces the adversary to use redundant encoding with a substantial number of error-correction bits for reliable transmission, making detection feasible even for small payloads.
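The redundancy argument can be made concrete with a back-of-the-envelope calculation (our illustration, not from the paper): over a binary symmetric channel with crossover probability p, reliable transmission requires a code rate below the Shannon capacity 1 - H(p), so the embedded payload must grow by at least a factor of 1 / (1 - H(p)).

```python
# Minimum redundancy factor forced on the adversary by a binary
# symmetric channel with bit-flip probability p.
import math

def bsc_capacity(p):
    """Capacity 1 - H(p) of a binary symmetric channel."""
    if p in (0.0, 1.0):
        return 1.0
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 1.0 - h

for p in (0.01, 0.05, 0.10):
    print(f"p={p:.2f}  min redundancy factor ≈ {1 / bsc_capacity(p):.2f}")
```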
Camera sensor fingerprints for digital camera forensics are formed by the Photo-Response Non-Uniformity (PRNU) or, more precisely, by an estimate of the PRNU obtained from a set of images taken with the camera. These images must be aligned with each other to establish pixel-to-pixel correspondence with the sensor. If some of the images have been resized and cropped, these transformations need to be reversed. In this work we address the estimation of the resizing factor given a single reference image from the same camera, a problem for which we coin the term semi-blind estimation of the resizing factor. We pose two requirements that any solution of this problem should meet: it must be reasonably fast and exhibit very low estimation error. Our work shows that this problem can be solved with established Fourier-Mellin image matching applied to the vertical and horizontal projections of the noise residuals (also called linear patterns).
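A stripped-down illustration of the log-domain reduction underlying Fourier-Mellin matching: resizing a 1-D projection becomes a shift on a logarithmically resampled axis, recoverable by cross-correlation. The signal, grid size, and scale value below are illustrative assumptions, and the actual pipeline in the paper is more involved.

```python
# Semi-blind scale estimation on a toy 1-D "projection": resample both
# signals on a common log-spaced grid, where resizing by factor s shows
# up as a shift of log(s)/d samples (d = log step of the grid).
import numpy as np

rng = np.random.default_rng(0)
proj = np.convolve(rng.standard_normal(4096), np.ones(5) / 5, "same")
scale = 0.8
xs = np.arange(int(len(proj) * scale)) / scale
resized = np.interp(xs, np.arange(len(proj)), proj)    # resized copy

n = 2048
L = min(len(proj), len(resized))
grid = np.exp(np.linspace(0.0, np.log(L - 1.0), n))    # common log grid
d = np.log(L - 1.0) / (n - 1)
a = np.interp(grid, np.arange(len(proj)), proj)
b = np.interp(grid, np.arange(len(resized)), resized)

lag = np.argmax(np.correlate(b - b.mean(), a - a.mean(), "full")) - (n - 1)
print(np.exp(lag * d))                                 # ≈ 0.8
```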
For PRNU-based forensic detectors, the fundamental test statistic is the normalized correlation between the camera fingerprint and the noise residual extracted from the image in successive overlapping analysis windows. The correlation predictor plays a crucial role in the performance of all such detectors. The traditional correlation predictor is based on predefined hand-crafted features that capture the intensity, texture, and saturation characteristics of the inspected image block, and the performance of such an approach depends largely on the training and test data. We propose a convolutional neural network (CNN) architecture that predicts the correlation directly from image patches of suitable size. Our empirical findings suggest that the CNN generalizes much better than the classical correlation predictor. With the CNN, we could operate a common network architecture across various digital camera devices, as well as a single network that universally predicted correlations for content from all cameras in our experiments, including ones not used in training. Integrating the CNN with our forensic detector gave state-of-the-art results.
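The test statistic itself is standard; a minimal sketch with toy arrays (the names K and W, and the synthetic data, are our assumptions):

```python
# Normalized correlation between a fingerprint block K and a noise
# residual W from the matching analysis window.
import numpy as np

def normalized_correlation(K, W):
    K = K - K.mean()
    W = W - W.mean()
    return float(np.sum(K * W) / (np.linalg.norm(K) * np.linalg.norm(W)))

rng = np.random.default_rng(1)
K = rng.standard_normal((64, 64))              # fingerprint estimate
W = 0.1 * K + rng.standard_normal((64, 64))    # residual containing PRNU
print(normalized_correlation(K, W))            # noticeably above 0
```

The correlation predictor's job is to estimate this value for a given image block without access to the true match, which is what the proposed CNN learns from patches directly.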
In recent years, the number of forged videos circulating on the Internet has increased immensely, and the software and services for creating such forgeries have become ever more accessible to the public. The risk of malicious use of forged videos has risen accordingly. This work proposes an approach based on the ghost effect known from image forensics for detecting video forgeries that replace faces in video sequences or alter facial expressions. The experimental results show that the proposed approach is able to identify forgeries in high-quality encoded video content.
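For orientation, the image-forensics ghost effect the approach builds on can be sketched as a recompression-difference curve (a Farid-style JPEG ghost; the function name and parameters below are our own illustrative choices, not the paper's video pipeline).

```python
# JPEG-ghost sketch: recompress an RGB image at a range of qualities and
# record the mean squared difference. Regions previously compressed at a
# given quality produce a local minimum of the curve at that quality.
import io
import numpy as np
from PIL import Image

def jpeg_ghost_curve(img, qualities=range(30, 96, 5)):
    """Mean squared recompression difference per JPEG quality (PIL RGB input)."""
    ref = np.asarray(img, dtype=float)
    curve = {}
    for q in qualities:
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=q)
        rec = np.asarray(Image.open(buf), dtype=float)
        curve[q] = float(np.mean((ref - rec) ** 2))
    return curve

# Applying the curve per block and locating its minimum exposes regions
# that were recompressed at a different quality than the rest of the frame.
```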