During the pandemic, the use of video platforms skyrocketed among office workers and students, and even today, when more and more events are held on-site again, their use remains at an all-time high. However, the many advantages of these platforms cannot hide some problems. In the professional field, the publication of audio recordings without the author's consent can get them into trouble. In education, another problem is bullying: the distance from the victim lowers the inhibition threshold, which means that platforms need tools to combat it. In this work, we present a system that can not only identify the person leaking the footage, but also identify all other persons present in it. This system can be used in both of the described scenarios.
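One plausible way to realize such attribution (a minimal sketch under our own assumptions, not necessarily the mechanism of the presented system; the participant identifiers and the LSB embedding scheme are purely hypothetical) is to embed a per-participant watermark into each participant's copy of the stream, so that a leaked recording can be traced back to its source:

```python
import hashlib
import numpy as np

def participant_watermark(participant_id: str, n_bits: int = 64) -> np.ndarray:
    """Derive a fixed-length bit pattern from a participant ID (hypothetical scheme)."""
    digest = hashlib.sha256(participant_id.encode()).digest()
    return np.unpackbits(np.frombuffer(digest, dtype=np.uint8))[:n_bits]

def embed(samples: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Embed the bit pattern into the least significant bits of 16-bit PCM audio."""
    marked = samples.copy()
    idx = np.arange(len(bits))
    marked[idx] = (marked[idx] & ~1) | bits
    return marked

def extract(samples: np.ndarray, n_bits: int = 64) -> np.ndarray:
    """Recover the embedded bit pattern from a (leaked) recording."""
    return samples[:n_bits] & 1

# Usage: mark each participant's received copy, then match a leak against known IDs.
audio = np.zeros(48000, dtype=np.int16)                    # one second of silent 16-bit PCM
leak = embed(audio, participant_watermark("alice@example.org"))
assert np.array_equal(extract(leak), participant_watermark("alice@example.org"))
```

A deployed system would require a watermark that survives re-encoding and cropping; plain LSB embedding is used here only to keep the tracing idea visible.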
DeepFakes are a recent trend in computer vision, posing a threat to the authenticity of digital media. Neural-network-based approaches are most prominently used for DeepFake detection, but due to their black-box nature, such detectors often lack explanatory power regarding why a given decision was made. Furthermore, from a social, ethical and legal perspective (e.g. the upcoming Artificial Intelligence Act of the European Commission), black-box decision methods should be avoided and Human Oversight should be guaranteed. Regarding the explainability of AI systems, many approaches rely on post-hoc visualization methods (e.g. via back-propagation) or on reducing model complexity. In our paper, a different approach is used: hand-crafted and neural-network-based components analyzing the same phenomenon are combined to aim for explainability. The semantic phenomenon chosen here as an example is the eye blinking behavior in genuine and DeepFake videos. Furthermore, the impact of video duration on the classification result is evaluated empirically, so that a minimum duration threshold can be set to reasonably detect DeepFakes.
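To illustrate what a hand-crafted component for this phenomenon could look like (a minimal sketch under our own assumptions, not the paper's implementation; the landmark layout, thresholds and frame counts are assumed), blinks can be counted by thresholding the eye aspect ratio (EAR) computed from six eye landmarks per frame:

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """EAR from six 2D eye landmarks p1..p6 (Soukupova & Cech, 2016);
    the ratio drops sharply while the eye is closed."""
    vertical = np.linalg.norm(eye[1] - eye[5]) + np.linalg.norm(eye[2] - eye[4])
    horizontal = np.linalg.norm(eye[0] - eye[3])
    return vertical / (2.0 * horizontal)

def count_blinks(ear_per_frame, closed_threshold: float = 0.21, min_frames: int = 2) -> int:
    """Count blinks as runs of at least `min_frames` consecutive frames below the threshold."""
    blinks, run = 0, 0
    for ear in ear_per_frame:
        if ear < closed_threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    return blinks + (1 if run >= min_frames else 0)

def blink_rate(ear_per_frame, fps: float) -> float:
    """Blinks per minute; an implausibly low rate over a long enough clip is a DeepFake cue."""
    minutes = len(ear_per_frame) / (fps * 60.0)
    return count_blinks(ear_per_frame) / minutes if minutes > 0 else 0.0
```

Such an explicit measure also makes the duration question tangible: over very short clips the estimated blink rate is statistically unreliable, which is why a minimum video duration has to be determined empirically.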
Human-in-control is a principle that has long been established in forensics as a strict requirement and nowadays also receives more and more attention in many other fields of application where artificial intelligence (AI) is used. This renewed interest is due to the fact that many regulations (among others the EU Artificial Intelligence Act (AIA)) emphasize it as a necessity for any critical AI application scenario. In this paper, human-in-control and quality assurance aspects of a benchmarking framework to be used in media forensics are discussed, and their usage is illustrated in the context of the media forensics sub-discipline of DeepFake detection.
The recent development of AI systems and their frequent use for classification problems pose a challenge from a forensic perspective. In many application fields, such as DeepFake detection, black-box approaches like neural networks are commonly used. As a result, the underlying classification models usually lack explainability and interpretability. In order to increase the traceability of AI decisions and move a crucial step further towards precise and reproducible analysis descriptions and certifiable investigation procedures, in this paper a domain-adapted forensic data model is introduced for media forensic investigations, focusing on the detection of manipulated media objects such as DeepFakes.