The ease of capturing, manipulating, distributing, and consuming digital media (e.g. images, audio, video, graphics, and text) has motivated new applications and brought a number of important security challenges to the forefront. These applications and challenges have prompted significant research and development activity in the areas of digital watermarking, steganography, data hiding, forensics, media identification, and encryption to protect the authenticity, security, and ownership of media objects. Research results in these areas have translated into new paradigms and applications for monetizing media objects without violating their ownership rights. The Media Watermarking, Security, and Forensics conference is a premier destination for disseminating high-quality, cutting-edge research in these areas. The conference provides an excellent venue for researchers and practitioners to present their innovative work and to keep abreast of the latest developments in watermarking, security, and forensics. The technical program is complemented by keynote talks, panel sessions, and short demos involving both academic and industrial researchers and practitioners. This strong focus on how research results are applied in practice by industry gives the conference its unique flavor.
Face recognition systems are used in high-security applications for identification, authentication, and authorization. They must therefore be robust not only to people wearing face accessories and masks, as during the COVID-19 pandemic, but also to adversarial attacks. We have identified three inconspicuous facial areas where adversarial examples can be worn to attack face recognition: the mouth-nose section, the forehead, and the eye area. In this paper, we address the question of how much of a face needs to be present for successful identification and whether removing the critical regions is a viable countermeasure against adversarial examples.
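The countermeasure alluded to above, removing a critical facial region before recognition, amounts to occluding a rectangular patch of the input. The sketch below is a hypothetical illustration of that preprocessing step, not the paper's actual procedure; the region coordinates and fill value are assumptions.

```python
def mask_region(image, top, left, height, width, fill=0):
    """Occlude a rectangular facial region (e.g. eyes or mouth-nose)
    before passing the image to a recognizer. `image` is a 2D list of
    grayscale values; the original image is left unmodified."""
    out = [row[:] for row in image]  # copy so the input is untouched
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = fill
    return out
```

Running the recognizer on several such masked variants (one per candidate region) would indicate how much of the face is actually needed for identification.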
Images posted online present a privacy concern in that they may be used as reference examples for a facial recognition system. Such abuse of images violates privacy rights but is difficult to counter. It is well established that adversarial example images can be created for recognition systems based on deep neural networks. These adversarial examples can be used to disrupt the utility of the images as reference examples or training data. In this work we use a Generative Adversarial Network (GAN) to create adversarial examples that deceive facial recognition, and we achieve an acceptable success rate in fooling the recognizer. Our approach reduces the training time for the GAN by removing the discriminator component. Furthermore, our results show that knowledge distillation can drastically reduce the size of the resulting model without impacting performance, indicating that our contribution could run comfortably on a smartphone.
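The knowledge distillation mentioned above is, in its standard Hinton-style formulation, a cross-entropy between temperature-softened teacher and student output distributions. The sketch below shows only that generic loss, under the assumption that the paper follows the standard recipe; the temperature value is an arbitrary choice for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperatures soften the distribution."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the softened teacher distribution (target)
    and the softened student distribution: the core distillation term."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is smallest when the student reproduces the teacher's softened outputs, which is what lets a much smaller student model approach the teacher's behavior.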
The use of general-purpose computers as touch-screen voting machines has created several difficult auditing problems. If voting machines are compromised by malware, they can adapt their behavior to evade testing and auditing, and paper trails are achieved through printing devices under the untrusted machine’s control. In this paper we outline and exhibit a prototype of a device that audits a voting machine through screen capture, sampling the HDMI signal passed from the computer to the display. This is achieved through a standard that requires a compliant voting machine to display signal markers on the summary pages before a vote is cast; compliance is enforced via alerts to the voter with a visual and audible signal while the screen is captured and archived. This direct feedback to the voter prevents a compromised machine from failing to invoke the device. We discuss the design and prototype of this system and possible avenues for attack on it.
The recent development of AI systems and their frequent use for classification problems pose a challenge from a forensic perspective. In many application fields, such as DeepFake detection, black-box approaches like neural networks are commonly used. As a result, the underlying classification models usually lack explainability and interpretability. To increase the traceability of AI decisions and move a crucial step closer to precise and reproducible analysis descriptions and certifiable investigation procedures, this paper introduces a domain-adapted forensic data model for media forensic investigations, focusing on the detection of media object manipulations such as DeepFakes.
This work discusses document security, the use of OCR, and integrity verification for printed documents. Since the underlying applications usually involve documents containing sensitive personal data, a solution that does not require all the data to be stored in a database is preferable. To allow verification to be performed by anyone, all the data required for it must be contained in the document itself. The approach must also be able to cope with different layouts, so that the layout does not have to be adapted for each document. In the following, we present a concept and its implementation that allow any smartphone user to verify the authenticity and integrity of a document.
For PRNU-based image manipulation localization, the correlation predictor plays a crucial role in considerably reducing false positives and in increasing the accuracy of manipulation localization. In this paper, we propose a novel correlation predictor based on a non-parametric learning algorithm, Locally Weighted Regression. Instead of fitting a global set of model parameters, a non-parametric learning algorithm fits a model dynamically by sampling the training set based on the pixel in the query image at which the correlation is to be predicted. Our experimental results suggest that building a model dynamically, based on the distance of training examples from the query pixel in feature space, helps predict the correlation more accurately. Experiments on benchmark data indicate that integrating the new predictor significantly improves both the accuracy of the predicted correlation and the image manipulation localization performance of PRNU-based forensic detectors.
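The core idea of Locally Weighted Regression, fitting a fresh model per query with training examples weighted by their distance to that query, can be sketched in a minimal one-feature form. This is a textbook illustration under simplifying assumptions (a single scalar feature and a Gaussian kernel), not the paper's multi-feature predictor.

```python
import math

def lwr_predict(x_train, y_train, x_query, tau=1.0):
    """Locally Weighted Regression for one feature: fit a weighted
    linear model y = a + b*x per query, weighting each training example
    by a Gaussian kernel of its distance to the query."""
    w = [math.exp(-((x - x_query) ** 2) / (2 * tau ** 2)) for x in x_train]
    # Weighted least squares via the 2x2 normal equations.
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x_train))
    swy = sum(wi * yi for wi, yi in zip(w, y_train))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x_train))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x_train, y_train))
    denom = sw * swxx - swx * swx
    b = (sw * swxy - swx * swy) / denom
    a = (swy - b * swx) / sw
    return a + b * x_query
```

Because the weights are recomputed for every query, nearby training examples dominate the fit, which is what lets the predictor adapt to local structure instead of committing to one global parameter set.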
Camera identification is an important topic in the field of digital image forensics. There are three levels of classification: brand, model, and device. Studies in the literature focus mainly on camera model identification and are increasingly based on deep learning (DL). The DL-based methods address three main classification goals: basic (model only), triple (brand, model, and device), and open-set (known and unknown cameras). Unlike other areas of image processing such as face recognition, most of these methods are evaluated only on a single database (Dresden), even though several others are publicly available. Each available database has its own diversity in camera content and distribution, which makes reliance on a single database questionable. We therefore conducted extensive tests with different public databases (Dresden, SOCRatES, and Forchheim) that together combine enough features to allow a viable comparison of DL-based methods for camera model identification. In addition, the different classification goals (basic, triple, open-set) create a disparity that prevents direct comparisons, so we focus only on basic camera model identification. We also use transfer learning (specifically fine-tuning) to perform our comparative study across databases.
To counter the ever-increasing flood of image forgeries in the form of spliced images in social media and on the web in general, we propose NoiseSeg, a novel CNN for image splicing localization. NoiseSeg fuses statistical and CNN-based splicing localization methods in separate branches to leverage the benefits of both. Splicing anomalies identified by its coarse noise separation branch, fine-grained noise feature branch, and error level analysis branch are combined in a segmentation fusion head to predict a precise localization of the spliced regions. Experiments on the DSO-1, CASIAv2, DEFACTO, IMD2020, and WildWeb image splicing datasets show that NoiseSeg significantly outperforms most other state-of-the-art methods, by a margin of up to 46.8%.
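Error level analysis, one of the branches named above, rests on the observation that regions with a different compression history leave a different residual when the image is compressed again. The toy sketch below substitutes a simple intensity re-quantization for actual JPEG recompression, purely to show the residual-map idea; it is not NoiseSeg's branch, and the quantization step is an arbitrary assumption.

```python
def error_level_map(pixels, step=16):
    """Toy error-level analysis on a flat list of grayscale values:
    re-quantize intensities (a crude stand-in for JPEG recompression)
    and return the per-pixel residual. Spliced regions tend to show
    residual statistics that differ from the rest of the image."""
    requantized = [round(p / step) * step for p in pixels]
    return [abs(p - r) for p, r in zip(pixels, requantized)]
```

In real ELA the image is re-saved as JPEG at a fixed quality and the absolute difference to the original is inspected; the principle of "already-compressed regions change less" is the same.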
We present a method for montage recognition that re-identifies both background images and inserted objects. It can also serve as a highly robust method for image re-identification beyond montages. To achieve this, we combine three mechanisms: segmentation, orientation detection, and robust hashing. We show that this approach provides detection rates similar to those of more complex algorithms based on feature matching while being significantly more efficient in terms of storage requirements and computational time. We provide test results for various attacks, including rotation, scaling, cropping, addition of noise, and brightness changes.
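Robust (perceptual) hashing, the third mechanism above, maps perceptually similar images to identical or near-identical bit strings, with Hamming distance measuring similarity. The sketch below is a generic average-hash illustration, not the paper's actual hash function; it assumes the image has already been downscaled to a small grayscale block.

```python
def average_hash(pixels):
    """Robust hash of a downscaled grayscale image (flat list of values):
    bit i is 1 iff pixel i lies above the mean intensity. A uniform
    brightness shift moves the mean by the same amount, so the hash is
    invariant to such changes."""
    mean = sum(pixels) / len(pixels)
    return ''.join('1' if p > mean else '0' for p in pixels)

def hamming(h1, h2):
    """Number of differing bits; small distances indicate similar images."""
    return sum(c1 != c2 for c1, c2 in zip(h1, h2))
```

Matching then reduces to comparing short bit strings, which is why hash-based re-identification needs far less storage and computation than feature matching.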