In this paper, we aim to address the image recapturing detection problem with convolutional and recurrent neural networks. With the advances of image display and acquisition techniques, recaptured images are nowadays of satisfactory quality. This creates an ever stronger demand for sophisticated image recapturing detection algorithms that can efficiently prevent unauthorized image distribution and forgery. In this paper, we propose a hierarchical feature learning strategy that leverages intra-block information and inter-block dependency for image recapturing detection. In particular, image blocks are first employed as the input of a convolutional neural network (CNN), and a recurrent neural network (RNN) is subsequently adopted to extract block dependencies. The CNN and RNN serve as effective tools to extract discriminative and meaningful features regarding both intra- and inter-block information. As such, the inherent properties within local blocks and the correlations between non-local neighbouring blocks are all exploited to identify recaptured images. Experimental results on three databases show that significantly better performance can be achieved with our proposed framework compared to traditional handcrafted-feature and deep learning based approaches.
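The following is a minimal sketch of such a block-based CNN-plus-RNN pipeline, written in PyTorch. The block size, channel widths, GRU dimensionality, and class names are illustrative assumptions made here for clarity; they are not the authors' architecture or hyperparameters.

```python
import torch
import torch.nn as nn

class BlockCNN(nn.Module):
    """Extracts an intra-block feature vector from each 64x64 RGB block (sizes assumed)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, x):                      # x: (B*N, 3, 64, 64)
        return self.proj(self.features(x).flatten(1))

class RecaptureNet(nn.Module):
    """CNN per block, then a GRU over the block sequence to model inter-block dependency."""
    def __init__(self, feat_dim=128, hidden=128):
        super().__init__()
        self.cnn = BlockCNN(feat_dim)
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)       # recaptured vs. original

    def forward(self, blocks):                 # blocks: (B, N, 3, 64, 64)
        B, N = blocks.shape[:2]
        feats = self.cnn(blocks.flatten(0, 1)).view(B, N, -1)
        _, h = self.rnn(feats)                 # final hidden state summarizes the block sequence
        return self.head(h[-1])

# Usage: logits = RecaptureNet()(torch.randn(4, 16, 3, 64, 64))  # 4 images, 16 blocks each
```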
Nowadays, the devices most commonly employed for recording videos and capturing images are undoubtedly smartphones. Our work investigates the application of source camera identification to mobile phones. We present a dataset entirely collected with mobile phones. The dataset contains both still images and videos collected by 67 different smartphones. Part of the images consists of photos of uniform backgrounds, collected specifically for the computation of the reference sensor pattern noise (RSPN). Identifying the source camera of a video is particularly challenging due to the strong video compression. The experiments reported in this paper show the large variation in performance when a technique that is highly accurate on still images is tested on videos.
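As a rough illustration of what the uniform-background photos are used for, the sketch below estimates a reference sensor pattern noise fingerprint by averaging noise residuals of flat-field images. The Gaussian denoiser and plain averaging are simplifying assumptions; practical fingerprint estimators typically use a wavelet-based denoiser and maximum-likelihood weighting.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img, sigma=1.0):
    """Residual W = I - denoise(I); a wavelet denoiser is normally used in practice."""
    img = img.astype(np.float64)
    return img - gaussian_filter(img, sigma)

def reference_spn(flat_images):
    """Average the residuals of many flat-field images so scene content cancels out."""
    residuals = [noise_residual(im) for im in flat_images]
    return np.mean(residuals, axis=0)

# Usage: fingerprint = reference_spn([img1, img2, ...])  # grayscale arrays of equal size
```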
We study linear filter kernel estimation from processed digital images under the assumption that the image's source camera is known. By leveraging easy-to-obtain camera-specific sensor noise fingerprints as a proxy, we identify the linear cross-correlation between a pre-computed camera fingerprint estimate and a noise residual extracted from the filtered query image as a viable domain in which to perform filter estimation. The result is a simple yet accurate filter kernel estimation technique that is relatively independent of image content and that does not rely on hand-crafted parameter settings. Experimental results obtained from both uncompressed and JPEG-compressed images suggest performance on par with highly developed iterative constrained minimization techniques.
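A minimal sketch of the cross-correlation domain referred to above is given below: correlating the camera fingerprint with the noise residual of the filtered query image concentrates the filter's footprint around zero lag, since the fingerprint is close to white noise. The FFT-based correlation, the crop size, and the normalization are illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def circular_cross_correlation(K, W):
    """Circular cross-correlation of fingerprint K and residual W via the FFT."""
    return np.real(np.fft.ifft2(np.conj(np.fft.fft2(K)) * np.fft.fft2(W)))

def estimate_kernel(K, W, size=5):
    """Crop the (size x size) neighbourhood around zero lag; normalization assumes a
    low-pass kernel with positive sum."""
    c = np.fft.fftshift(circular_cross_correlation(K, W))
    cy, cx = np.array(c.shape) // 2
    h = size // 2
    kernel = c[cy - h:cy + h + 1, cx - h:cx + h + 1]
    return kernel / kernel.sum()
```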
We formulate PRNU-based image manipulation localization as a probabilistic binary labeling task in a flexible discriminative random field (DRF) framework. A novel local discriminator based on the deviation of the measured correlation from the expected local correlation as estimated by a correlation predictor is paired with an explicit pairwise model for dependencies between local decisions. Experimental results from the Dresden Image Database indicate that the DRF outperforms prior art with Markov random field label priors.
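The sketch below illustrates the flavor of such a pairwise labeling model: a unary cost derived from the deviation between measured and predicted correlation, an Ising-style pairwise term encouraging neighbouring blocks to agree, and a few ICM sweeps for inference. The weights, threshold, and the simple ICM solver are assumptions made for illustration, not the paper's exact DRF or inference scheme.

```python
import numpy as np

def unary_cost(rho, rho_hat, tau=0.05):
    """Per-block costs for labels 0 (pristine) and 1 (manipulated) from the correlation
    deviation; a large positive deviation rho_hat - rho suggests manipulation."""
    dev = rho_hat - rho
    return np.stack([np.maximum(dev - tau, 0),        # cost of label 0
                     np.maximum(tau - dev, 0)], -1)   # cost of label 1

def icm(unaries, beta=0.5, sweeps=5):
    """Iterated conditional modes on a 4-connected grid of blocks."""
    labels = unaries.argmin(-1)
    H, W = labels.shape
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                nbrs = [labels[p, q] for p, q in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                        if 0 <= p < H and 0 <= q < W]
                costs = [unaries[i, j, l] + beta * sum(n != l for n in nbrs) for l in (0, 1)]
                labels[i, j] = int(np.argmin(costs))
    return labels
```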
Knowing the history of global processing applied to an image can be very important for the forensic analyst to correctly establish the image's pedigree, trustworthiness, and integrity. Global edits have also been used in the past to "launder" manipulated content because they can negatively affect the reliability of many forensic techniques. In this paper, we focus on the more difficult and less addressed case in which the processed image is JPEG compressed. First, a bank of binary linear classifiers with rich media models is built to distinguish between unprocessed images and images subjected to a specific processing class. For better scalability, the detector is not built in the rich feature space but in the space of projections of the features onto the weight vectors of the linear classifiers. This decreases the computational complexity of the detector and, most importantly, allows the distribution of the projections to be estimated by fitting a multivariate Gaussian model to each processing class, so that the final classifier can be constructed as a maximum-likelihood detector. Well-fitting analytic models permit a more rigorous construction of the detector that is unachievable in the original high-dimensional rich feature space. Experiments on grayscale as well as color images with a range of JPEG quality factors and four processing classes are used to show the merit of the proposed methodology.
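The projection-space maximum-likelihood step can be sketched as follows: project the rich features onto the weight vectors of the per-class binary classifiers, fit one multivariate Gaussian to the projections of each processing class, and assign a test image to the class with the highest likelihood. The weight matrix is assumed to come from previously trained linear classifiers; variable names and the use of scipy are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

def project(features, W):
    """Map d-dimensional rich features to the low-dimensional space of classifier projections."""
    return features @ W                       # (n, d) @ (d, k) -> (n, k)

def fit_class_models(projections_by_class):
    """Fit one multivariate Gaussian to the projections of each processing class."""
    return [multivariate_normal(p.mean(0), np.cov(p, rowvar=False))
            for p in projections_by_class]

def ml_classify(features, W, models):
    """Assign each sample to the class whose Gaussian gives the highest log-likelihood."""
    z = project(features, W)
    scores = np.stack([m.logpdf(z) for m in models], axis=-1)
    return scores.argmax(-1)
```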