In recent years, the detection of deepfakes has become a substantial topic in image and video forensics. State-of-the-art blind detection methods can detect deepfakes from synthetic datasets with high accuracy. However, they struggle to classify deepfake material that has undergone adversarial post-processing, or they fail to generalize to unseen video data. In this paper, a refined detection pipeline that takes advantage of a semi-blind detection scheme is proposed. It combines background matching with a state-of-the-art CNN classifier. On videos from the Deepfake Detection Challenge Dataset, on which the CNN classifier had previously been trained, the new detection scheme did not improve performance. However, the approach achieved superior results on unseen data from the FaceForensics++ Dataset.
We provide a method for montage recognition that allows background images and inserted objects to be re-identified. It can also be used as a highly robust method for image re-identification beyond montages. To achieve this, we combine three mechanisms: segmentation, orientation detection, and robust hashing. We show that this approach provides detection rates similar to those of more complex algorithms based on feature matching, but is significantly more efficient with respect to storage requirements and computation time. We provide test results for various attacks such as rotation, scaling, cropping, addition of noise, and brightness changes.
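To illustrate the robust hashing component only, the following is a minimal sketch of a block-mean hash compared by Hamming distance. It is not the method of the paper: the segmentation and orientation detection steps are omitted, and the function names `block_means`, `robust_hash`, and `hamming`, the 4x4 grid, and the list-of-lists grayscale image format are all assumptions made for illustration.

```python
from statistics import median


def block_means(pixels, grid=4):
    """Average the grayscale values over a grid x grid partition of the image."""
    height, width = len(pixels), len(pixels[0])
    means = []
    for by in range(grid):
        for bx in range(grid):
            rows = range(by * height // grid, (by + 1) * height // grid)
            cols = range(bx * width // grid, (bx + 1) * width // grid)
            values = [pixels[y][x] for y in rows for x in cols]
            means.append(sum(values) / len(values))
    return means


def robust_hash(pixels, grid=4):
    """One bit per block: 1 if the block mean lies above the median of all block means."""
    means = block_means(pixels, grid)
    threshold = median(means)
    return [1 if m > threshold else 0 for m in means]


def hamming(a, b):
    """Number of bit positions in which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 8x8 test image and a uniformly brightened copy (no clipping).
image = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
brighter = [[p + 10 for p in row] for row in image]
distance = hamming(robust_hash(image), robust_hash(brighter))
```

In this sketch, a uniform brightness shift moves every block mean and the median by the same amount, so the hash bits stay the same and `distance` is 0. Attacks such as rotation or cropping change the block contents themselves, which is why an approach like the one described above adds further mechanisms before hashing.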
Images can be recognized by cryptographic or robust hashes during forensic investigations or content filtering. Cryptographic methods tend to be too fragile, while robust methods may leak information about the hashed images. Combining robust and cryptographic methods can solve both problems, but requires a good prediction of the hash bit positions most likely to break. Previous research shows the potential of this approach, but evaluation results still show rather high error rates, in particular many false negatives. In this work, we take a detailed look at the behavior of robust hashes under attacks and at the potential of predictions derived from the distance from the median and from learning from attacks.
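The idea of predicting fragile bits from their distance to the median can be sketched as follows. This is a simplified illustration under stated assumptions, not the scheme evaluated in the work: the feature values are assumed to be block means of a robust hash, the number of bits treated as fragile is fixed, and storing the fragile positions next to the SHA-256 digest is a design choice made only for this example. All function names are hypothetical.

```python
import hashlib
from statistics import median


def bits_with_margins(features):
    """Threshold each feature at the median; the margin is its distance to the median."""
    threshold = median(features)
    bits = [1 if f > threshold else 0 for f in features]
    margins = [abs(f - threshold) for f in features]
    return bits, margins


def predict_fragile(margins, count):
    """Predict the `count` positions closest to the median as most likely to flip."""
    order = sorted(range(len(margins)), key=lambda i: margins[i])
    return set(order[:count])


def digest_of_stable_bits(bits, fragile):
    """Cryptographic hash over the bits outside the predicted fragile positions."""
    stable = "".join(str(b) for i, b in enumerate(bits) if i not in fragile)
    return hashlib.sha256(stable.encode("ascii")).hexdigest()


def enroll(features, fragile_count=2):
    """Reference side: keep only the digest and the fragile positions (assumption)."""
    bits, margins = bits_with_margins(features)
    fragile = predict_fragile(margins, fragile_count)
    return digest_of_stable_bits(bits, fragile), fragile


def matches(features, digest, fragile):
    """Query side: recompute the bits, skip the stored fragile positions, compare digests."""
    bits, _ = bits_with_margins(features)
    return digest_of_stable_bits(bits, fragile) == digest


# Hypothetical block means of an original image and of a slightly attacked copy.
original = [10, 52, 90, 49, 20, 80, 30, 70]
attacked = [11, 50, 89, 53, 21, 79, 31, 71]
digest, fragile = enroll(original)
same_image = matches(attacked, digest, fragile)
```

Here the two values nearest the median (positions 1 and 3) flip under the attack, but because they were predicted as fragile and excluded from the digest, the attacked copy still matches. A wrong prediction would change the digest and produce a false negative, which is the kind of error the prediction step aims to reduce.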