Recently, a new deep learning architecture, the Vision Transformer, has emerged as the new standard for image classification tasks, overtaking conventional Convolutional Neural Network (CNN) models. However, these state-of-the-art models require large amounts of data, typically over 100 million images, to achieve optimal performance through transfer learning. This requirement is usually met with proprietary datasets such as JFT-300M or JFT-3B, which are not publicly available. To overcome these challenges and address privacy concerns, Formula-Driven Supervised Learning (FDSL) has been introduced. FDSL trains deep learning models on synthetic images generated from mathematical formulas, such as Fractal and Radial Contour images. A main objective of this approach is to reduce the I/O bottleneck that arises when training on large datasets. Our implementation of FDSL generates instances in real time during training and uses a custom data loader based on EGL (the Native Platform Graphics Interface) for fast shader-based rendering. Evaluated on the FractalDB-100k dataset, comprising 100 million images, our custom data loader achieved loading times three times faster than the PyTorch Vision loader.
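To make the idea of formula-driven image generation concrete, the sketch below draws points of a fractal with the chaos game on an iterated function system (IFS), the kind of construction behind fractal datasets such as FractalDB. This is an illustrative minimal example, not the authors' EGL/shader implementation; the function name `ifs_points` and its parameters are hypothetical.

```python
import random

def ifs_points(transforms, n_points=10000, seed=0, burn_in=20):
    """Generate 2D fractal points via the chaos game: repeatedly apply a
    randomly chosen affine map (a, b, c, d, e, f), where
    (x, y) -> (a*x + b*y + e, c*x + d*y + f)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    pts = []
    for i in range(n_points + burn_in):
        a, b, c, d, e, f = rng.choice(transforms)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= burn_in:  # discard the first iterates before convergence
            pts.append((x, y))
    return pts

# Example IFS: the Sierpinski triangle, three contractions by 1/2
# toward each vertex of the unit triangle.
SIERPINSKI = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]

points = ifs_points(SIERPINSKI, n_points=5000)
```

Rasterizing such point sets (and randomizing the affine coefficients per class) yields labeled synthetic images without any natural data, which is the premise FDSL builds on.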
Transfer Learning is an important strategy in Computer Vision for tackling problems with limited training data. However, this strategy still depends heavily on the amount of available data, which is a challenge for small heritage institutions. This paper investigates various ways of enriching smaller digital heritage collections to boost the performance of deep learning models, using the identification of musical instruments as a case study. We apply traditional data augmentation techniques as well as an external, photorealistic collection distorted by Style Transfer. Style Transfer techniques can artistically stylize images, reusing the style of any other given image; collections can therefore easily be augmented with artificially generated images. We introduce the distinction between inner and outer style transfer and show that artificially augmented images in both scenarios consistently improve classification results, on top of traditional data augmentation techniques. Counter-intuitively, however, such artificially generated artistic depictions of works are surprisingly hard to classify. In addition, we discuss an example of negative transfer within the non-photorealistic domain.
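As an illustration of the "traditional data augmentation" the abstract refers to, the sketch below applies two common label-preserving transforms (random horizontal flip and a padded random crop, a CIFAR-style recipe) with plain NumPy. This is a generic example under assumed conventions, not the paper's pipeline; the function name `augment` and the `pad` parameter are hypothetical.

```python
import numpy as np

def augment(image, rng, pad=4):
    """Apply simple label-preserving augmentations to an H x W x C array:
    a random horizontal flip, then a random crop of the original size
    taken from a reflect-padded copy of the image."""
    if rng.random() < 0.5:
        image = image[:, ::-1]  # flip left-right
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    h, w = image.shape[:2]
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))   # stand-in for a collection image
out = augment(img, rng)         # same shape as the input
```

Augmentations like these expand a small collection cheaply; the style-transferred images discussed in the paper serve as a further, stronger form of enrichment on top of them.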