Towards real-time formula driven dataset feed for large scale deep learning training

Edgar Josafat Martinez-Noriega; Rio Yokota

doi:10.2352/EI.2023.35.11.HPCI-243

Abstract

Recently, a new deep learning architecture, the Vision Transformer, has emerged as the new standard for classification tasks, overtaking conventional Convolutional Neural Network (CNN) models. However, these state-of-the-art models require large amounts of data, typically over 100 million images, to achieve optimal performance through transfer learning. This requirement is met by using proprietary datasets such as JFT-300M or JFT-3B, which are not publicly available. To overcome these challenges and address privacy concerns, Formula-Driven Supervised Learning (FDSL) has been introduced. FDSL trains deep learning models on synthetic images generated from mathematical formulas, such as fractals and radial contour images. The main objective of this approach is to reduce the I/O bottleneck that occurs during training with large datasets. Our implementation of FDSL generates instances in real time during training and uses a custom data loader based on EGL (Native Platform Graphics Interface) for fast rendering via shaders. An evaluation of our custom data loader on the FractalDB-100k dataset, comprising 100 million images, showed loading that is three times faster than with the PyTorch Vision loader.
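The idea of generating training instances on the fly, rather than reading them from disk, can be illustrated with a minimal CPU sketch. It is not the EGL shader loader described in the abstract: it assumes fractals are drawn from an iterated function system with the chaos game, and the names `SIERPINSKI`, `render_fractal`, and `fractal_stream`, as well as the coefficients and image size, are hypothetical choices made only for illustration.

```python
import random

# Hypothetical affine maps (a, b, c, d, e, f): (x, y) -> (a*x + b*y + e, c*x + d*y + f).
# These are the classic Sierpinski-triangle coefficients, chosen only for illustration.
SIERPINSKI = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]


def render_fractal(maps, size=64, n_points=5000, seed=0):
    """Render a binary size x size image of an IFS attractor with the chaos game."""
    rng = random.Random(seed)  # seeded, so the same seed always gives the same image
    image = [[0] * size for _ in range(size)]
    x, y = 0.0, 0.0
    for i in range(n_points):
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i < 20:  # skip a short warm-up before plotting
            continue
        # Map the unit square to pixel indices, clamped to the image bounds.
        col = min(size - 1, max(0, int(x * size)))
        row = min(size - 1, max(0, int(y * size)))
        image[row][col] = 1
    return image


def fractal_stream(class_maps, n_samples, size=64):
    """Yield (image, label) pairs generated on the fly instead of read from disk."""
    for i in range(n_samples):
        label = i % len(class_maps)
        yield render_fractal(class_maps[label], size=size, seed=i), label
```

In this sketch each class is defined by its set of affine maps, so a data loader only needs to keep the coefficients in memory and can regenerate any image from its seed, which is what removes the disk reads.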

Electronic Imaging

2470-1173

Society for Imaging Science and Technology

IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA

HPCI-243

Article

Edgar Josafat Martinez-Noriega

National Institute of Advanced Industrial Science and Technology, Japan

Rio Yokota

Tokyo Institute of Technology, Japan

16 January 2023

HPCI

High Performance Computing for Imaging 2023

pp. 243-1 to 243-6

2023

Real-Time Rendering; Large Scale Training; Transfer Learning; Vision Transformer