This paper presents a method for synthesizing 2D and 3D sensor data for various machine vision tasks. Depending on the task, different processing steps can be applied to a 3D model of an object. For object detection, segmentation, and pose estimation, random object arrangements are generated automatically. In addition, objects can be virtually deformed in order to create realistic images of non-rigid objects. For automatic visual inspection, synthetic defects are introduced into the objects. Thus, sensor-realistic datasets with typical object defects for quality-control applications can be created even in the absence of defective parts. The simulation of realistic images uses physically based rendering techniques; material properties and different lighting situations are taken into account in the 3D models. The resulting tuples of 2D images and their ground-truth annotations can be used to train a machine learning model, which is subsequently applied to real data. To minimize the reality gap, a random parameter set is selected for each image, resulting in images with high variety. For the use cases of damage detection and object detection, it is shown that a machine learning model trained only on synthetic data can also achieve very good results on real data.
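The abstract gives no implementation details; the following minimal Python sketch only illustrates the per-image randomization it describes. All parameter names and value ranges are illustrative assumptions, and render_image is a hypothetical stand-in for the physically based renderer.

    import random

    def sample_render_params():
        """Draw a fresh random parameter set for one synthetic image."""
        return {
            "light_intensity": random.uniform(0.3, 3.0),   # illustrative range
            "light_azimuth_deg": random.uniform(0.0, 360.0),
            "camera_distance_m": random.uniform(0.4, 1.2),
            "material_roughness": random.uniform(0.05, 0.9),
            "num_synthetic_defects": random.randint(0, 5),
        }

    # Each image/annotation pair gets its own parameter set, which is what
    # produces the high variety intended to narrow the reality gap.
    for i in range(3):
        params = sample_render_params()
        # image, annotations = render_image(scene, params)  # hypothetical PBR call
        print(f"image {i:03d}: {params}")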
Biometric authentication takes many forms; among the most researched are fingerprint and facial authentication. Due to the amount of research in these areas, benchmark datasets are readily available for new researchers to use when evaluating new systems. A newer, less researched biometric method is lip motion authentication. In such systems, a user produces a lip motion password to authenticate, meaning they must utter the same word or phrase to gain access. Because this method is less researched, there is no large-scale dataset that can be used to compare methods or to determine the actual level of security they provide. We propose an automated dataset collection pipeline that extracts a lip motion authentication dataset from collections of videos. This pipeline will enable the collection of large-scale datasets for this problem, thus advancing the capability of lip motion authentication systems.
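As a rough illustration of what the first stage of such a pipeline might look like, the sketch below crops a mouth region from each video frame. It assumes OpenCV's bundled Haar face cascade and a lower-third-of-face heuristic; the detector and landmarking actually used in the paper are not specified here, and the file name is a placeholder.

    import cv2

    # OpenCV's bundled frontal-face Haar cascade (an assumption, not the
    # paper's detector).
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def extract_lip_crops(video_path):
        """Yield a rough mouth-region crop for each frame of a video."""
        cap = cv2.VideoCapture(video_path)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = face_cascade.detectMultiScale(gray, 1.1, 5)
            for (x, y, w, h) in faces:
                # Heuristic: the mouth lies in the lower third of the face box.
                yield frame[y + 2 * h // 3 : y + h, x : x + w]
        cap.release()

    # Usage: collect mouth crops from one source video.
    crops = list(extract_lip_crops("speaker01_phrase.mp4"))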
This paper proposes a novel method for automatic real-time defect detection and classification on wood surfaces. Our method uses a deep convolutional neural network (CNN) approach with Faster R-CNN (Region-based CNN) as the detector and MobileNetV3 as the backbone network for feature extraction. The key difference of our approach from existing methods is that it detects knots and other types of defects efficiently and performs the classification in real time on the input video frames. Speed and accuracy are the main focus of our work. In industrial quality control and inspection tasks such as defect detection, detection and classification need to be done in real time on computationally limited processing units or commodity processors. Our trained model is lightweight and can even be deployed on systems such as mobile and edge devices. We pre-trained MobileNetV3 on a large image dataset for feature extraction and use Faster R-CNN for the detection and classification of defects. The system performs real-time detection and classification at an average of 37 frames per second on input video frames, using a low-cost, low-memory GPU (Graphics Processing Unit). Our method achieves an overall accuracy of 99% in detecting and classifying defects.
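The abstract names the architecture but not the framework. torchvision (≥ 0.13) happens to ship exactly this combination, so a minimal sketch could look like the following; the number and names of defect classes are assumptions, and this is not the authors' own training code.

    import torch
    from torchvision.models.detection import fasterrcnn_mobilenet_v3_large_fpn
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    # Hypothetical label set: background + e.g. knot, crack, hole, stain.
    NUM_CLASSES = 1 + 4

    # Faster R-CNN with a MobileNetV3-Large FPN backbone, pre-trained weights.
    model = fasterrcnn_mobilenet_v3_large_fpn(weights="DEFAULT")

    # Replace the box predictor head to match the defect classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)
    model.eval()

    # Run detection on a single dummy frame (3x480x640, values in [0, 1]).
    frame = torch.rand(3, 480, 640)
    with torch.no_grad():
        predictions = model([frame])[0]
    print(predictions["boxes"].shape, predictions["labels"], predictions["scores"])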
In this paper, text recognition on variably curved cardboard pharmaceutical packages is studied from the photometric stereo imaging point of view, with a focus on developing a method for binarizing the expiration date and batch code texts. Adaptive filtering, more specifically a Wiener filter, is used together with a haze removal algorithm and a fusion of LoG edge-detected sub-images, resulting in an Otsu-thresholded binary image of the expiration date and batch code texts for further analysis. Some results are presented, and they appear promising for text binarization. Successful binarization is crucial for text character segmentation and subsequent automatic reading. Furthermore, some new ideas that will be used in our future research work are presented.
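A condensed Python sketch of such a binarization chain is given below, assuming SciPy's Wiener filter, a Laplacian-of-Gaussian response, and OpenCV's Otsu threshold. The paper's haze removal and sub-image fusion steps are omitted, and the file names are placeholders.

    import cv2
    import numpy as np
    from scipy.signal import wiener
    from scipy.ndimage import gaussian_laplace

    def binarize_text(gray):
        """Simplified chain: Wiener denoising -> LoG edges -> Otsu threshold."""
        # Adaptive (Wiener) filtering to suppress sensor noise.
        denoised = wiener(gray.astype(np.float64), mysize=5)
        # Laplacian-of-Gaussian response to emphasize text strokes.
        log_edges = gaussian_laplace(denoised, sigma=2.0)
        # Rescale to 8-bit and apply Otsu's global threshold.
        norm = cv2.normalize(log_edges, None, 0, 255, cv2.NORM_MINMAX)
        _, binary = cv2.threshold(norm.astype(np.uint8), 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return binary

    # Example: binarize a grayscale crop of the package print.
    img = cv2.imread("package_crop.png", cv2.IMREAD_GRAYSCALE)
    if img is not None:
        cv2.imwrite("binary_text.png", binarize_text(img))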