Recent advances in Generative Adversarial Networks (GANs) have led to the creation of realistic-looking digital images that pose a major challenge to their detection by humans or computers. GANs are used in a wide range of tasks, from modifying small attributes of an image (StarGAN [14]) and transferring attributes between image pairs (CycleGAN [92]) to generating entirely new images (ProGAN [37], StyleGAN [38], SPADE/GauGAN [65]). In this paper, we propose a novel approach to detect, attribute and localize GAN generated images that combines image features with deep learning methods. For every image, co-occurrence matrices are computed on neighboring pixels of the RGB channels in the horizontal, vertical and diagonal directions. A deep learning network is then trained on these features to detect, attribute and localize these GAN generated/manipulated images. A large-scale evaluation of our approach on five GAN datasets (ProGAN, StarGAN, CycleGAN, StyleGAN and SPADE/GauGAN) comprising over 2.76 million images shows promising results in detecting GAN generated images.
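To make the feature extraction step described above concrete, here is a minimal NumPy sketch of pairwise pixel co-occurrence matrices computed per RGB channel for horizontal, vertical and diagonal offsets; the specific offsets, normalization and channel stacking are illustrative assumptions rather than the exact configuration used in the paper.

```python
import numpy as np

def cooccurrence_matrix(channel, dy, dx):
    """256x256 co-occurrence matrix of pixel pairs (p[y, x], p[y+dy, x+dx]).

    channel is a 2-D uint8 array; dy and dx are non-negative offsets.
    """
    h, w = channel.shape
    ref = channel[:h - dy, :w - dx].ravel()
    nbr = channel[dy:, dx:].ravel()
    mat = np.zeros((256, 256), dtype=np.float64)
    np.add.at(mat, (ref, nbr), 1)        # histogram of co-occurring intensity pairs
    return mat / mat.sum()               # normalize to a joint distribution

def cooccurrence_features(image_rgb):
    """Stack co-occurrence matrices over the R, G, B channels and the
    horizontal, vertical and diagonal neighborhoods (the network input)."""
    offsets = [(0, 1), (1, 0), (1, 1)]   # horizontal, vertical, diagonal
    mats = [cooccurrence_matrix(image_rgb[..., c], dy, dx)
            for c in range(3) for dy, dx in offsets]
    return np.stack(mats, axis=-1)       # shape (256, 256, 9)
```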
Digital image forensics aims to detect images that have been digitally manipulated. Realistic image forgeries involve a combination of splicing, resampling, region removal, smoothing and other manipulation methods. While most detection methods in the literature focus on detecting a particular type of manipulation, it is challenging to identify doctored images that involve a host of manipulations. In this paper, we propose a novel approach to holistically detect tampered images using a combination of pixel co-occurrence matrices and deep learning. We extract horizontal and vertical co-occurrence matrices on three color channels in the pixel domain and train a model using a deep convolutional neural network (CNN) framework. Our method is agnostic to the type of manipulation and classifies an image as tampered or untampered. We train and validate our model on a dataset of more than 86,000 images. Experimental results show that our approach is promising and achieves an area under the curve (AUC) of more than 0.99 on the training and validation subsets. Further, our approach also generalizes well and achieves around 0.81 AUC on an unseen test dataset comprising more than 19,740 images released as part of the Media Forensics Challenge (MFC) 2020. Our score was the highest among all participating teams at the time the challenge results were announced.
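As a hedged illustration of the CNN stage described above, the sketch below defines a small PyTorch classifier over the stacked horizontal and vertical co-occurrence matrices (six input channels); the layer configuration, input size and training details are assumptions, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class CooccurrenceCNN(nn.Module):
    """Convolutional classifier over stacked co-occurrence matrices.

    Input: (batch, 6, 256, 256) tensors built from horizontal and vertical
    co-occurrence matrices of the three color channels; output: a single
    tampered-vs-untampered logit. Layer sizes are illustrative only.
    """
    def __init__(self, in_channels=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, 1)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Training would use a binary cross-entropy loss on the logits, e.g.
# nn.BCEWithLogitsLoss(), with AUC computed on held-out predictions.
```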
Camera sensor fingerprints for digital camera forensics are formed by the Photo-Response Non-Uniformity (PRNU), or more precisely, by estimating the PRNU from a set of images taken with the camera. These images must be aligned with each other to establish pixel-to-pixel correspondence between sensor locations. If some of these images have been resized and cropped, the transformations need to be reversed. In this work, we deal with estimating the resizing factor in the presence of one reference image from the same camera. For this problem we coin the term semi-blind estimation of the resizing factor. We pose two requirements that any solution of this problem should meet: it needs to be reasonably fast and exhibit very low estimation error. Our work shows that this problem can be solved using established image matching in the Fourier-Mellin domain, applied to the vertical and horizontal projections of noise residuals (also called linear patterns).
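A minimal sketch of the underlying idea for 1-D projections, assuming NumPy: resizing rescales the frequency axis of a projection's magnitude spectrum, which becomes a pure shift on a logarithmic axis that phase correlation can recover. The FFT length, number of log-frequency bins and shift-to-factor mapping below are assumptions, not the paper's implementation.

```python
import numpy as np

def _log_spectrum(signal, nfft, n_bins):
    """Magnitude spectrum of a 1-D projection, resampled on a log-frequency axis."""
    spec = np.abs(np.fft.rfft(signal, n=nfft))
    lin_freq = np.arange(1, spec.size)                        # skip the DC bin
    log_freq = np.logspace(0, np.log10(lin_freq[-1]), n_bins)
    return np.interp(log_freq, lin_freq, spec[1:])

def estimate_resize_factor(query_proj, ref_proj, nfft=4096, n_bins=1024):
    """Estimate the resizing factor between two 1-D projections of noise residuals."""
    a = _log_spectrum(query_proj, nfft, n_bins)
    b = _log_spectrum(ref_proj, nfft, n_bins)
    fa, fb = np.fft.rfft(a), np.fft.rfft(b)
    cross = fa * np.conj(fb)
    cross /= np.abs(cross) + 1e-12                            # phase correlation
    corr = np.fft.irfft(cross, n=n_bins)
    shift = int(np.argmax(corr))
    if shift > n_bins // 2:                                   # wrap negative shifts
        shift -= n_bins
    bin_step = np.log10(nfft // 2) / (n_bins - 1)             # log10 units per bin
    # The sign convention (factor > 1 vs. < 1) depends on which signal was
    # resized; this mapping is an assumption of the sketch.
    return 10.0 ** (shift * bin_step)
```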
The advent of Generative Adversarial Networks (GANs) has brought about completely novel ways of transforming and manipulating pixels in digital images. GAN-based techniques such as image-to-image translation, DeepFakes, and other automated methods have become increasingly popular in creating fake images. In this paper, we propose a novel approach to detect GAN generated fake images using a combination of co-occurrence matrices and deep learning. We extract co-occurrence matrices on three color channels in the pixel domain and train a model using a deep convolutional neural network (CNN) framework. Experimental results on two diverse and challenging GAN datasets comprising more than 56,000 images based on unpaired image-to-image translations (cycleGAN [1]) and facial attributes/expressions (StarGAN [2]) show that our approach is promising and achieves more than 99% classification accuracy on both datasets. Further, our approach also generalizes well and achieves good results when trained on one dataset and tested on the other.
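The generalization experiment mentioned above (train on one GAN dataset, test on the other) can be sketched as follows; the data loaders, optimizer settings and decision threshold are hypothetical assumptions for illustration.

```python
import torch

def cross_dataset_accuracy(model, train_loader, test_loader, epochs=10, lr=1e-4):
    """Train on one GAN dataset and evaluate on another.

    train_loader / test_loader are assumed to yield (features, label) batches,
    where features are stacked co-occurrence tensors and label is 1 for GAN
    generated images; both names are hypothetical placeholders.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x).squeeze(1), y.float())
            loss.backward()
            optimizer.step()
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in test_loader:
            pred = (torch.sigmoid(model(x).squeeze(1)) > 0.5).long()
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total        # generalization accuracy on the unseen dataset
```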
In this paper, we present a new reference dataset simulating digital evidence for image (photographic) steganography. Steganography detection is a digital image forensic topic that is relatively unknown in practical forensics, although stego app use in the wild is on the rise. This paper introduces the first database consisting of mobile phone photographs and stego images produced with mobile stego apps, including a rich set of side information, offering simulated digital evidence. StegoAppDB, a steganography apps forensics image database, contains over 810,000 innocent and stego images acquired from 24 distinct devices spanning a minimum of 10 different phone models, with detailed provenance data comprising a wide range of ISO and exposure settings, EXIF data, message information, embedding rates, etc. We develop a camera app, Cameraw, specifically for data acquisition; it captures multiple images per scene and saves them simultaneously in both DNG and high-quality JPEG formats. Stego images are created from these original images using selected mobile stego apps through a careful process of reverse engineering. StegoAppDB contains cover-stego image pairs, including pairs for apps that resize the stego image dimensions. We retain the original devices and continue to enlarge the database, and we encourage the image forensics community to use StegoAppDB. While the database is designed for steganography, we discuss uses of this publicly available resource for other digital image forensic topics.
The ease of counterfeiting both the origin and the content of a video necessitates the search for a reliable method to identify the source of a media file, a crucial part of forensic investigation. One of the most accepted solutions to identify the source of a digital image involves comparing its photo-response non-uniformity (PRNU) fingerprint against a camera reference. However, for videos, prevalent methods are not as effective as image source identification techniques, because the fingerprint is affected by the post-processing steps applied when the video is generated. In this paper, we answer affirmatively the question of whether images can be used to generate the reference fingerprint pattern for identifying a video's source. We introduce an approach called “Hybrid G-PRNU” that provides a scale-invariant solution for video source identification by matching the video's fingerprint with one extracted from images. Another goal of our work is to find the parameters that maximize the identification rate. Experiments over several test cases demonstrate a higher identification rate when the video PRNU is compared asymmetrically with the reference pattern generated from images. Further, the fingerprint extractor used in this paper is being made freely available to scholars and researchers in the domain.
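For context, the standard way to score a PRNU match is the Peak-to-Correlation-Energy (PCE) ratio, sketched below with NumPy; the Hybrid G-PRNU step of rescaling the video fingerprint to the image-based reference is not shown, and the exclusion-window size is an assumption.

```python
import numpy as np

def pce(residual, fingerprint, exclude=5):
    """Peak-to-Correlation-Energy between a test noise residual and a PRNU
    reference fingerprint of the same size; a high value indicates that the
    media was captured by the sensor that produced the reference."""
    a = residual - residual.mean()
    b = fingerprint - fingerprint.mean()
    # Circular cross-correlation surface computed via the FFT.
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    peak_idx = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
    peak = corr[peak_idx]
    # Energy of the surface excluding a small window around the peak
    # (window wrap-around at the borders is ignored in this sketch).
    y, x = peak_idx
    mask = np.ones(corr.shape, dtype=bool)
    mask[max(y - exclude, 0):y + exclude + 1,
         max(x - exclude, 0):x + exclude + 1] = False
    return peak ** 2 / np.mean(corr[mask] ** 2)
```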
Realistic image forgeries involve a combination of splicing, resampling, cloning, region removal and other methods. While resampling detection algorithms are effective in detecting splicing and resampling, copy-move detection algorithms excel in detecting cloning and region removal. In this paper, we combine these complementary approaches in a way that boosts the overall accuracy of image manipulation detection. We use the copy-move detection method as a pre-filtering step and pass those images that are classified as untampered to a deep learning based resampling detection framework. Experimental results on various datasets, including the 2017 NIST Nimble Challenge Evaluation dataset comprising nearly 10,000 pristine and tampered images, show a consistent increase of 8%-10% in detection rates when the copy-move algorithm is combined with different resampling detection algorithms.
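The pre-filtering idea above reduces to a short cascade, sketched here with hypothetical detector callables standing in for the actual copy-move and resampling detectors.

```python
def detect_manipulation(image, copy_move_detector, resampling_detector):
    """Two-stage detection: the copy-move detector runs first, and only images
    it classifies as untampered are passed to the resampling detector.

    copy_move_detector and resampling_detector are hypothetical placeholders
    for detectors that return True when they flag the image as tampered.
    """
    if copy_move_detector(image):
        return True                        # cloning / region removal detected
    return resampling_detector(image)      # splicing / resampling detected
```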
In this work, we introduce a new method for localizing image manipulations in a single digital image, such as identifying added (spliced), removed (in-painted), or deformed objects. The method utilizes the so-called Linear Pattern (LP) of digital images as a global template whose integrity can be assessed in a localized manner. The consistency of the linear pattern estimated from the image noise residual is evaluated in overlapping blocks of pixels. The manipulated region is identified by the lack of similarity in terms of the correlation coefficient computed between the power spectral density (PSD) of the LP in that region and the PSD averaged over the entire image. The method is potentially applicable to all images of sufficient resolution as long as the LP in the unmodified parts of the image has different spectral properties from that in the tampered area. No side information, such as the EXIF header or the camera model, is needed to make the method work. Experiments show the capabilities and limitations of the proposed method, which is robust to mild JPEG compression.
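A simplified sketch of the block-wise consistency check described above, assuming the image noise residual is already extracted; the LP estimate (row/column means), block size, step and threshold are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def linear_pattern_psd(block):
    """PSD of a block's linear pattern, approximated here by its row- and
    column-averaged noise residual (a simplified LP estimate)."""
    row_lp = block.mean(axis=0)
    col_lp = block.mean(axis=1)
    return np.concatenate([np.abs(np.fft.rfft(row_lp)) ** 2,
                           np.abs(np.fft.rfft(col_lp)) ** 2])

def localize_by_lp(residual, block=64, step=32, threshold=0.5):
    """Flag blocks whose LP power spectral density correlates poorly with the
    PSD averaged over the whole image."""
    h, w = residual.shape
    coords = [(y, x) for y in range(0, h - block + 1, step)
                     for x in range(0, w - block + 1, step)]
    psds = np.array([linear_pattern_psd(residual[y:y + block, x:x + block])
                     for y, x in coords])
    global_psd = psds.mean(axis=0)
    heatmap = np.zeros((h, w))
    for (y, x), psd in zip(coords, psds):
        rho = np.corrcoef(psd, global_psd)[0, 1]     # similarity to the global LP
        if rho < threshold:
            heatmap[y:y + block, x:x + block] = 1.0  # likely manipulated region
    return heatmap
```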
The amount of digital imagery recorded has grown exponentially in recent years, and with the advancement of software such as Photoshop or GIMP, it has become easier to manipulate images. However, most images on the internet have not been manipulated, so any automated manipulation detection algorithm must carefully control the false alarm rate. In this paper, we discuss a method to automatically detect local resampling using deep learning while controlling the false alarm rate using a-contrario analysis. The automated procedure consists of three primary steps. First, resampling features are calculated for image blocks. A deep learning classifier is then used to generate a heatmap that indicates whether each image block has been resampled. We expect some of these blocks to be falsely identified as resampled. Finally, we use a-contrario hypothesis testing both to determine whether the pattern of flagged blocks indicates that the image has been tampered with and to localize the manipulation. We demonstrate that this strategy is effective in indicating whether an image has been manipulated and in localizing the manipulations.
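The a-contrario step can be made concrete with a binomial Number-of-False-Alarms (NFA) test, sketched below; the per-block false-alarm probability, the set of candidate regions and the significance level epsilon are assumptions of this illustration, not the exact test used in the paper.

```python
from scipy.stats import binom

def number_of_false_alarms(k, n, p, n_tests):
    """NFA of observing at least k flagged blocks out of n in a region, when a
    block is falsely flagged with probability p under the untampered background
    model and n_tests candidate regions are examined."""
    return n_tests * binom.sf(k - 1, n, p)       # n_tests * P(X >= k)

def is_meaningful_detection(k, n, p, n_tests, epsilon=1.0):
    """Declare a region manipulated when its NFA falls below the level epsilon."""
    return number_of_false_alarms(k, n, p, n_tests) < epsilon
```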