Convolutional neural networks offer much more accurate detection of steganography than the outgoing paradigm: classifiers trained on rich representations of images. While training a CNN is scalable with respect to the size of the training set, one cannot directly train on images that are too large due to the memory limitations of current GPUs. Most leading network architectures for steganalysis today require the input image to be a small tile of 256 × 256 or 512 × 512 pixels. Because detecting the presence of steganographic embedding changes really means detecting a very weak noise signal added to the cover image, resizing an image before presenting it to a CNN would be highly suboptimal. Applying the tile detector to disjoint segments of a larger image and fusing the results brings a plethora of new problems concerning how to properly fuse the outputs. In this paper, we propose a different solution to this problem based on modifying an existing leading network architecture for steganalysis in the spatial domain, YeNet, to output statistical moments of its feature maps to the fully-connected classifier part of the network. In experiments in which we adjust the payload with image size according to the square root law for constant statistical detectability, we demonstrate that the proposed architecture can be trained to steganalyze images of various sizes with no loss, or only a small loss, with respect to detectors trained for a fixed image size.
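The key architectural idea, replacing global pooling with moments of each feature map so that the input to the fully-connected layers has a fixed length regardless of image size, can be sketched as follows. This is an illustrative numpy sketch only, not the authors' implementation; the choice of four moments (mean, variance, skewness, kurtosis) and the channel count are assumptions made here for concreteness.

```python
import numpy as np

def moment_pooling(feature_map, eps=1e-8):
    """Collapse each channel's spatial dimensions into statistical moments.

    feature_map: array of shape (channels, H, W).
    Returns a vector of length 4 * channels whose size is independent
    of the spatial dimensions H and W.
    """
    c = feature_map.shape[0]
    flat = feature_map.reshape(c, -1)          # (channels, H*W)
    mean = flat.mean(axis=1)
    var = flat.var(axis=1)
    std = np.sqrt(var) + eps                    # eps avoids division by zero
    centered = flat - mean[:, None]
    skew = (centered ** 3).mean(axis=1) / std ** 3
    kurt = (centered ** 4).mean(axis=1) / std ** 4
    # Hypothetical choice of moments; the descriptor length depends only
    # on the number of feature maps, not on the image size.
    return np.concatenate([mean, var, skew, kurt])

# The pooled descriptor has the same length for tiles of different sizes,
# so the same fully-connected classifier can be applied to both:
small = moment_pooling(np.random.rand(30, 256, 256))
large = moment_pooling(np.random.rand(30, 512, 512))
assert small.shape == large.shape == (120,)
```

Because the moments are global statistics over each feature map, a network pooled this way can in principle be trained and evaluated on images of arbitrary size, which is what makes the approach compatible with the square-root-law payload scaling used in the experiments.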
Clement Fuji Tsang and Jessica Fridrich, "Steganalyzing Images of Arbitrary Size with CNNs," in Proc. IS&T Int'l. Symp. on Electronic Imaging: Media Watermarking, Security, and Forensics, 2018, pp. 121-1–121-8, https://doi.org/10.2352/ISSN.2470-1173.2018.07.MWSF-121