Conventional image forgery relies heavily on digital image processing techniques, which inevitably introduce artifacts and inconsistencies. To cast suspicion on the integrity of a genuine picture P, we propose an ambiguity attack that employs no digital image processing at all. It works by deliberately producing a second picture Pamb containing a target region of interest (ROI) that closely resembles the ROI in P. Outside the target ROI, the contents of P and Pamb may differ dramatically, so that Pamb tells a rather different story from P. Since Pamb involves no forgery in the digital domain, it passes generic digital image forensic tests. Furthermore, several measures can be taken to make the ROI in Pamb look more 'original' than its counterpart in P, inducing people to believe that Pamb is genuine and that P is merely a forgery derived from Pamb. The ambiguity created between P and Pamb is hard to resolve for three reasons: first, no digital forensic tool can identify any artifacts or inconsistencies in Pamb; second, passing all digital forensic tests still does not establish that P is genuine; and last, determining the chronological order of P and Pamb is very hard in general.
We study characteristics of the second significant digits of block-DCT coefficients computed from digital images. Following previous work on compression forensics based on first significant digits, we examine the merits of stepping towards significant digits beyond the first. Our empirical findings indicate that certain block-DCT modes follow Benford's law of second significant digits extremely well, which allows us to distinguish between never-compressed images and decompressed JPEG images even for the highest JPEG compression quality of 100. As for multiple-compression forensics, we report that second significant digit histograms are highly informative on their own, yet cannot further improve the already good performance of classification schemes that work with first significant digits alone.
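As a sketch of the statistic involved, the snippet below extracts second significant digits from block-DCT coefficients and compares their histogram to the generalized Benford law for second digits. The synthetic image, block size, and pooling of all DCT modes into one histogram are illustrative assumptions; the paper analyzes individual block-DCT modes.

```python
import numpy as np

def benford_second_digit_law():
    """Theoretical second-significant-digit probabilities (generalized Benford):
    P(d) = sum_{k=1..9} log10(1 + 1/(10k + d)), d = 0..9."""
    return np.array([np.sum(np.log10(1 + 1.0 / (10 * np.arange(1, 10) + d)))
                     for d in range(10)])

def second_digit(x):
    """Second significant decimal digit of a nonzero number."""
    m = abs(x)
    mant = m / 10.0 ** np.floor(np.log10(m))   # mantissa in [1, 10)
    return int(mant * 10) % 10

def dct2_blocks(img, bs=8):
    """Block-wise 2-D DCT-II (orthonormal), numpy only."""
    k, i = np.meshgrid(np.arange(bs), np.arange(bs), indexing="ij")
    C = np.sqrt(2.0 / bs) * np.cos(np.pi * (2 * i + 1) * k / (2 * bs))
    C[0, :] = np.sqrt(1.0 / bs)
    out = np.empty_like(img, dtype=float)
    for r in range(0, img.shape[0], bs):
        for c in range(0, img.shape[1], bs):
            out[r:r+bs, c:c+bs] = C @ img[r:r+bs, c:c+bs] @ C.T
    return out

rng = np.random.default_rng(0)
img = rng.normal(128, 40, size=(64, 64))           # toy never-compressed image
coefs = dct2_blocks(img).ravel()
coefs = coefs[np.abs(coefs) > 1e-6]                # drop (near-)zero coefficients
hist = np.bincount([second_digit(c) for c in coefs], minlength=10) / len(coefs)
```

Unlike the first-digit law, the second-digit distribution includes digit 0 and is only mildly non-uniform (about 0.120 for digit 0 down to about 0.085 for digit 9), which is one reason second-digit histograms carry complementary rather than dominant evidence.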
The detection of copy–move forgeries has been studied extensively; however, all known methods were designed and evaluated for digital images depicting natural scenes. In this paper, we address the problem of detecting and localizing copy–move forgeries in images of scanned text documents. The purpose of our analysis is to study how block-based detection of near-duplicates performs in this application scenario, considering that even authentic scanned text contains multiple similar-looking glyphs (letters, numbers, and punctuation marks). A series of experiments on scanned documents examines how feature representations proposed in the literature perform with respect to the correct detection of copied image segments and the minimization of false positives. Our findings indicate that, subject to specific threshold and parameter values, block-based methods show modest performance in detecting copy–move forgery in scanned documents. We explore strategies to further adapt block-based copy–move forgery detection approaches to this relevant application scenario.
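The block-based pipeline discussed above can be sketched as follows: overlapping blocks are mapped to a feature, identical features are grouped, and matching block pairs vote on a displacement vector. The coarse quantization used as the block feature, the block size, and the distance threshold are simplifying assumptions, not the specific feature representations evaluated in the paper.

```python
import numpy as np
from collections import Counter

def copy_move_candidates(img, bs=8, step=1):
    """Block-based near-duplicate search: bucket blocks by a quantized
    feature, then let matching pairs vote on their displacement vector."""
    h, w = img.shape
    buckets = {}
    for r in range(0, h - bs + 1, step):
        for c in range(0, w - bs + 1, step):
            key = np.round(img[r:r+bs, c:c+bs] / 8).astype(int).tobytes()
            buckets.setdefault(key, []).append((r, c))
    votes = Counter()
    for pos in buckets.values():
        for i in range(len(pos)):
            for j in range(i + 1, len(pos)):
                dr, dc = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
                if dr * dr + dc * dc >= bs * bs:   # skip near/overlapping pairs
                    votes[(dr, dc)] += 1
    return votes

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(48, 48)).astype(float)
img[30:40, 30:40] = img[5:15, 5:15]        # plant a copied 10x10 region
votes = copy_move_candidates(img)
shift, count = votes.most_common(1)[0]     # dominant displacement vector
```

On scanned text the difficulty is precisely that repeated glyphs produce many legitimate matches with inconsistent displacement vectors, which is why the displacement-voting step and the thresholds become critical.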
Fujitsu is working on the use of video watermarking for digital marketing. In advertising services in Japan, TV viewers can automatically access an e-commerce site synchronized with a home-shopping program and easily order a product of their choice by detecting, with a smart-device application, the watermark embedded in the video they are watching. Since watermark signals are generally deteriorated by video compression, the watermark strength needs to be adjusted. In TV broadcasting, the degree of deterioration depends on the form of broadcasting (e.g., digital terrestrial or satellite) because of the associated differences in bitrate and compression format. Adjusting the watermark strength to each broadcasting form for each video is too complex for real operation, while if the strength chosen to overcome the largest deterioration is used in all other cases as well, the impact on the quality of video compressed at a low compression rate may be greater than necessary. To reduce this inconvenience in practical use, we have developed a method that standardizes the strength across more applications than before by improving the trade-off between video compression tolerance and impact on video quality.
This paper describes a technique that can invisibly embed information into images captured with a video camera. It uses illumination that contains invisible patterns: because the illumination is patterned, a captured image of a real object under such light contains the same patterns. The luminance-modulated pattern has an amplitude too small to be perceived, and four frames form one modulation cycle. The difference images between every other frame of the captured video are added up over a certain period: the brightness changes caused by the modulation accumulate across frames, while the object image cancels out because difference images are used. This makes it possible to read out the embedded patterns. Experimental results demonstrate that the embedded pattern is invisible and can be read out when the conditions are chosen properly, although a small amount of noise results from the residual object image.
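The accumulation scheme described above can be simulated numerically. The four-frame sign sequence, modulation amplitude, noise level, and static scene below are assumed values for illustration; the point is that the object term cancels in every-other-frame differences while the modulated pattern adds up coherently.

```python
import numpy as np

rng = np.random.default_rng(2)
obj = rng.uniform(50, 200, size=(32, 32))                  # static scene (assumed)
pattern = np.where(rng.random((32, 32)) < 0.5, 1.0, -1.0)  # hidden binary pattern
amp = 0.5                                                  # sub-perceptual amplitude
signs = [1, 1, -1, -1]                                     # four-frame modulation cycle

frames = []
for t in range(80):                                        # 20 modulation cycles
    noise = rng.normal(0, 1.0, size=obj.shape)             # camera noise
    frames.append(obj + signs[t % 4] * amp * pattern + noise)

# Accumulate differences of every other frame: the static object cancels,
# each cycle contributes (+amp) - (-amp) = 2*amp of pattern signal.
acc = np.zeros_like(obj)
for t in range(0, 80, 4):
    acc += frames[t] - frames[t + 2]

recovered = np.sign(acc)
accuracy = np.mean(recovered == pattern)
```

After 20 cycles the per-pixel signal grows linearly (20 × 2 × amp) while the noise grows only with the square root of the number of differences, so the binary pattern is recovered nearly perfectly despite being invisible in any single frame.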
This paper presents a video watermarking algorithm designed to resist the analog gap in addition to other known attacks. The analog gap, i.e., re-recording a screen, e.g., with a camcorder, poses a huge challenge for digital video watermarking applications, and no satisfactory solution is known yet. In this work we propose a novel transparent and blind video watermarking algorithm that uses so-called maximally stable extremal regions (MSERs) to identify regions of the video in which a watermark is capable of surviving many attacks, even the analog gap. For both embedding and detection, each frame of the video is analyzed and stable regions are found. For embedding, all selected regions are approximated by circles; using the orientation of the MSER region, the preprocessed patterns are adjusted, scaled, and added to the content. The MSER itself is amplified to improve recognition on the detector side. To keep transparency high, the Laplacian matrix is used as a psychovisual model, along with scene detection to reduce the flickering effect.
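The geometric step (approximating a detected region by a circle and estimating its orientation) can be sketched with image moments. The binary mask below is a toy stand-in for an actual MSER detector, and the moment-based formulas are a generic choice, not necessarily the paper's exact procedure.

```python
import numpy as np

def circle_from_region(mask):
    """Approximate a binary region by a circle: centroid + equal-area radius."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    r = np.sqrt(mask.sum() / np.pi)        # radius of a disk with the same area
    return (cy, cx), r

def region_orientation(mask):
    """Principal-axis angle (radians) from central second-order moments."""
    ys, xs = np.nonzero(mask)
    y, x = ys - ys.mean(), xs - xs.mean()
    mu20, mu02, mu11 = (x * x).mean(), (y * y).mean(), (x * y).mean()
    return 0.5 * np.arctan2(2 * mu11, mu20 - mu02)

# toy "stable region": a filled disk of radius 10 centered at (20, 24)
yy, xx = np.mgrid[0:48, 0:48]
mask = ((yy - 20) ** 2 + (xx - 24) ** 2) <= 10 ** 2
(cy, cx), r = circle_from_region(mask)
```

Normalizing the embedded pattern to the region's circle and orientation is what makes detection robust to the scaling and rotation introduced by camcorder re-recording.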
Telltale watermarks make it possible to infer how a watermarked signal was altered on the channel by analyzing the distortion applied to the watermark. We propose a new kind of telltale watermark for digital images that tracks the number of JPEG (re-)compressions. Our watermarks leverage peculiarities in the convergence behavior of JPEG images. We show that it is possible to generate or find combinations of pixel values that form so-called "counter blocks". Counter blocks cycle predictably through a fixed number of states under subsequent JPEG compression and decompression operations. By combining counter blocks with different cycle lengths in one image, we can track the number of JPEG (re-)compressions over extended ranges. We evaluate the accuracy of counter blocks, discuss pitfalls when embedding them, and study the construction of counter blocks with specified cycle lengths.
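The underlying observation can be illustrated with a toy model: a JPEG compress/decompress round trip is a deterministic map on a finite set of pixel blocks, so iterating it must eventually enter a cycle. The flat quantization table and random starting block below are simplifying assumptions; the paper constructs specific blocks with chosen cycle lengths.

```python
import numpy as np

N = 8
k, i = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * i + 1) * k / (2 * N))
C[0, :] = np.sqrt(1.0 / N)                 # orthonormal 8-point DCT-II matrix
Q = 16                                     # flat table: toy stand-in for JPEG's

def jpeg_roundtrip(block):
    """One toy JPEG compress/decompress cycle on an 8x8 pixel block."""
    coef = C @ (block - 128.0) @ C.T
    coef = np.round(coef / Q) * Q          # quantize / dequantize
    pix = C.T @ coef @ C + 128.0
    return np.clip(np.round(pix), 0, 255)  # integer pixels cause convergence

def cycle_length(block, max_iter=200):
    """Iterate the round trip until a state repeats; return the cycle length."""
    seen, b = {}, block
    for t in range(max_iter):
        key = b.tobytes()
        if key in seen:
            return t - seen[key]
        seen[key] = t
        b = jpeg_roundtrip(b)
    return None

rng = np.random.default_rng(3)
block = rng.integers(0, 256, size=(8, 8)).astype(float)
L = cycle_length(block)
```

Most blocks converge to a fixed point (cycle length 1); counter blocks are precisely the rarer states with longer cycles, whose current phase reveals how many round trips the image has undergone.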
Speech processing is used to translate human speech to text and to identify speakers in biometric systems. Speaker verification requires robust algorithms that prevent an adversary from impersonating another speaker. Previous research has demonstrated that specially crafted additive noise can cause a speaker to be misclassified as a specific target. In this paper, we study whether targeted additive noise can thwart speaker verification without affecting speech-to-text decoding. Mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs) are commonly used as encoding schemes in both applications. We attempt to induce a desired change in the probability of one speaker model used for speaker classification, while preserving the likelihood under another speech model used for speech decoding.
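The quantity being manipulated is a GMM log-likelihood, which can be sketched as follows. The two toy diagonal-covariance models, the feature dimension, and the "nudge toward a component mean" perturbation are illustrative assumptions standing in for trained speaker/speech models and the paper's actual noise-crafting procedure.

```python
import numpy as np

def gmm_loglik(x, weights, means, vars_):
    """Log-likelihood of a feature vector x under a diagonal-covariance GMM,
    computed with the log-sum-exp trick for numerical stability."""
    comp = (np.log(weights)
            - 0.5 * np.sum(np.log(2 * np.pi * vars_), axis=1)
            - 0.5 * np.sum((x - means) ** 2 / vars_, axis=1))
    m = comp.max()
    return m + np.log(np.exp(comp - m).sum())

rng = np.random.default_rng(4)
d = 12                                         # MFCC-like feature dimension
w = np.array([0.5, 0.5])                       # toy untrained model parameters
means_a = rng.normal(0, 1, (2, d)); vars_a = np.full((2, d), 1.0)
means_b = rng.normal(0, 1, (2, d)); vars_b = np.full((2, d), 1.0)

x = rng.normal(0, 1, d)                        # one feature vector
# nudge x toward its nearest component mean of model A: a crude stand-in
# for targeted additive noise in feature space
target = means_a[np.argmin(np.sum((x - means_a) ** 2, axis=1))]
x_adv = x + 0.1 * (target - x)

gain_a = gmm_loglik(x_adv, w, means_a, vars_a) - gmm_loglik(x, w, means_a, vars_a)
drift_b = gmm_loglik(x_adv, w, means_b, vars_b) - gmm_loglik(x, w, means_b, vars_b)
```

The attack's core tension is visible even in this sketch: any perturbation that raises `gain_a` also moves the vector under model B, so the noise must be shaped to keep `drift_b` small.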
Any serious steganography system should make use of coding. Here, we investigate the performance of our prior linguistic steganographic method for tweets when combined with perfect coding. We propose distortion measures for linguistic steganography, the first of their kind, and investigate the best embedding strategy for the steganographer. These distortion measures are tested on fully automatically generated stego objects as well as on stego tweets filtered by a human operator. We also observe a square-root law of capacity in this linguistic stegosystem.
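As background on why coding matters for steganography, the classic example is matrix embedding with the [7,4] Hamming code: 3 message bits are embedded into 7 cover symbols while changing at most one of them. This is a standard textbook construction given for illustration; it is not claimed to be the paper's "perfect coding", and the bit vectors stand in for whatever binary cover features a linguistic stegosystem exposes.

```python
import numpy as np

# Parity-check matrix of the [7,4] Hamming code: column j is the binary
# representation of j+1, so a nonzero syndrome directly addresses the bit to flip.
H = np.array([[(j + 1) >> b & 1 for j in range(7)] for b in range(3)])

def embed(cover_bits, msg_bits):
    """Matrix embedding: force the syndrome of 7 cover bits to equal the
    3 message bits by flipping at most one cover bit."""
    x = cover_bits.copy()
    s = (H @ x) % 2
    e = s ^ msg_bits                        # required syndrome change
    idx = e[0] * 1 + e[1] * 2 + e[2] * 4    # column with value e sits at idx-1
    if idx:
        x[idx - 1] ^= 1
    return x

def extract(stego_bits):
    """The receiver recovers the message as the syndrome of the stego bits."""
    return (H @ stego_bits) % 2

rng = np.random.default_rng(5)
cover = rng.integers(0, 2, 7)
msg = rng.integers(0, 2, 3)
stego = embed(cover, msg)
```

The payoff is the embedding efficiency: 3 bits are conveyed per at most one change, versus 3 expected changes per 6 bits for naive LSB-style replacement, which is why coding is considered a prerequisite for any serious stegosystem.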