For forensic analysis of digital images or videos, the photo-response non-uniformity (PRNU), or camera fingerprint, is the most important characteristic for source attribution and manipulation localization. Typically, a good estimate of the PRNU is obtained by computing its maximum likelihood estimate from the noise residuals of a large number of flat-field images captured by the camera. In this paper, we propose a novel approach to estimating the fingerprint of a camera with a Generative Adversarial Network (GAN). The idea is to let the generator network learn a distribution from which PRNU samples are drawn after the two adversarial networks have been trained. Experimental results indicate that the GAN-generated PRNU yields state-of-the-art camera identification and manipulation localization results.
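The conventional baseline mentioned above can be sketched in a few lines. The sketch assumes the commonly used multiplicative model, in which the residual of image I_i is W_i = I_i * K + noise, so that the per-pixel maximum likelihood estimate is K_hat = sum_i(W_i * I_i) / sum_i(I_i^2). The function name `ml_fingerprint` and the flattened-list image representation are illustrative choices, and the denoising step that produces the residuals is taken as given.

```python
def ml_fingerprint(images, residuals):
    """Per-pixel ML estimate K_hat = sum_i(W_i * I_i) / sum_i(I_i ** 2).

    images, residuals: lists of equally sized, flattened images
    (each a list of floats). residuals[i] is the noise residual of images[i].
    """
    n_pixels = len(images[0])
    numerator = [0.0] * n_pixels
    denominator = [0.0] * n_pixels
    for image, residual in zip(images, residuals):
        for p in range(n_pixels):
            numerator[p] += residual[p] * image[p]
            denominator[p] += image[p] * image[p]
    # Assumption: pixels that are black in every image carry no information,
    # so their estimate is set to zero instead of dividing by zero.
    return [n / d if d > 0.0 else 0.0 for n, d in zip(numerator, denominator)]
```

Because each image is weighted by its own intensity, bright flat-field images contribute most to the estimate, which is consistent with the preference for flat-field captures stated above.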
Camera sensor fingerprints for digital camera forensics are formed by Photo-Response Non-Uniformity (PRNU), or more precisely, by estimating the PRNU from a set of images taken with a camera. These images must be aligned with each other to establish pixel-to-pixel correspondence of sensor locations. If some of these images have been resized and cropped, the transformations need to be reversed. In this work, we deal with estimating the resizing factor when one reference image from the same camera is available. For this problem, we coin the term semi-blind estimation of the resizing factor. We pose two requirements that any solution to this problem should meet: it needs to be reasonably fast, and it must exhibit very low estimation error. Our work shows that this problem can be solved using established image matching based on the Fourier-Mellin transform, applied to vertical and horizontal projections of noise residuals (also called linear patterns).
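To make the notion of linear patterns concrete, the following minimal sketch computes the two one-dimensional projections of a noise residual. It assumes the projections are plain averages along each axis; the function name `linear_patterns` and the list-of-rows representation are illustrative. The Fourier-Mellin matching step itself is not shown.

```python
def linear_patterns(residual):
    """Return (column_pattern, row_pattern) of a 2-D noise residual.

    residual: list of rows, each a list of floats.
    """
    rows, cols = len(residual), len(residual[0])
    # Projection along the vertical direction: one average per column.
    column_pattern = [
        sum(residual[r][c] for r in range(rows)) / rows for c in range(cols)
    ]
    # Projection along the horizontal direction: one average per row.
    row_pattern = [sum(row) / cols for row in residual]
    return column_pattern, row_pattern
```

Working with two short one-dimensional signals instead of the full two-dimensional residual is one plausible reason such an approach can satisfy the speed requirement.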
For PRNU-based forensic detectors, the fundamental test statistic is the normalized correlation between the camera fingerprint and the noise residual extracted from the image, computed in successive overlapping analysis windows. The correlation predictor plays a crucial role in the performance of all such detectors. The traditional correlation predictor is based on predefined hand-crafted features representing the intensity, texture, and saturation characteristics of the image block under inspection. The performance of such an approach depends largely on the training and test data. We propose a convolutional neural network (CNN) architecture that predicts the correlation from image patches of suitable size fed as input. Our empirical findings suggest that the CNN generalizes much better than the classical correlation predictor. With the CNN, we could use a common network architecture for various digital camera devices, as well as a single network that universally predicts correlations for content from all cameras we experimented with, including ones that were not used in training the network. Integrating the CNN with our forensic detector gave state-of-the-art results.
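The test statistic described above can be illustrated with a short sketch. It computes the normalized correlation between the noise residual and the fingerprint in overlapping square windows. The names `normalized_correlation` and `windowed_correlation`, as well as the default window size and step, are illustrative assumptions and not values from this work.

```python
import math


def normalized_correlation(a, b):
    """Normalized correlation of two equally long sequences of floats."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A constant block has zero norm; report zero correlation in that case.
    return sum(x * y for x, y in zip(da, db)) / norm if norm > 0.0 else 0.0


def windowed_correlation(residual, fingerprint, window=8, step=4):
    """Correlation map over window x window blocks; step < window gives overlap."""
    rows, cols = len(residual), len(residual[0])
    corr_map = []
    for top in range(0, rows - window + 1, step):
        out_row = []
        for left in range(0, cols - window + 1, step):
            block = [(r, c) for r in range(top, top + window)
                     for c in range(left, left + window)]
            w = [residual[r][c] for r, c in block]
            k = [fingerprint[r][c] for r, c in block]
            out_row.append(normalized_correlation(w, k))
        corr_map.append(out_row)
    return corr_map
```

A correlation predictor, whether feature-based or a CNN, would supply the value each entry of this map is expected to take when the block is untouched.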
We formulate PRNU-based image manipulation localization as a probabilistic binary labeling task in a flexible discriminative random field (DRF) framework. A novel local discriminator, based on the deviation of the measured correlation from the expected local correlation as estimated by a correlation predictor, is paired with an explicit pairwise model for dependencies between local decisions. Experimental results on the Dresden Image Database indicate that the DRF outperforms prior art that relies on Markov random field label priors.
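As a rough illustration of a deviation-based local decision, the sketch below maps the shortfall of the measured correlation relative to the predicted one to a tampering probability. The logistic form, the function names, and the parameters `sigma` and `bias` are all hypothetical choices made for this example; they are not the discriminator proposed here, and the pairwise dependency model is omitted entirely.

```python
import math


def tamper_probability(measured, predicted, sigma=0.02, bias=2.0):
    """Hypothetical local discriminator: logistic in the standardized shortfall.

    sigma scales the deviation and bias shifts the decision point; both are
    placeholder values for illustration only.
    """
    z = (predicted - measured) / sigma - bias
    z = max(-50.0, min(50.0, z))  # clamp to keep exp() well within range
    return 1.0 / (1.0 + math.exp(-z))


def local_decisions(measured_map, predicted_map, threshold=0.5):
    """Independent per-block labels: 1 = likely manipulated, 0 = likely intact."""
    return [
        [1 if tamper_probability(m, p) > threshold else 0
         for m, p in zip(m_row, p_row)]
        for m_row, p_row in zip(measured_map, predicted_map)
    ]
```

Independent per-block decisions of this kind tend to be noisy, which is the motivation for pairing a local discriminator with a model of dependencies between neighboring decisions.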