Naturalistic driving studies, in which drivers use their personal vehicles, provide valuable real-world data, but privacy issues must be handled carefully. Drivers sign a consent form when they elect to participate, but passengers do not, for a variety of practical reasons. However, their privacy must still be protected. One large study includes a blurred image of the entire cabin, which allows reviewers to find passengers in the vehicle; this protects their privacy while still providing a means of answering questions about the impact of passengers on driver behavior. A method for automatically counting passengers would have scientific value for transportation researchers. We investigated several image analysis methods for automatically locating and counting non-drivers, including simple face detection, fine-tuned image classification, and a published object detection method. We also compared image classification using convolutional neural network and vision transformer backbones. Our studies show that the image classification method performs best in absolute terms, although we note that the closed nature of our dataset and the nature of the imagery make the application somewhat niche, and object detection methods have advantages of their own. We present analysis to support our conclusions.
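As a hedged illustration of the backbone comparison described above (not the study's actual code), the sketch below fine-tunes an image classifier to predict a passenger count from cabin images, swapping between a CNN and a vision transformer backbone via torchvision. The number of count classes and all hyperparameters are assumptions.

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 4  # assumed count bins: 0, 1, 2, 3+ passengers

def build_classifier(backbone: str = "cnn") -> nn.Module:
    if backbone == "cnn":
        model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
    else:  # vision transformer backbone
        model = models.vit_b_16(weights=models.ViT_B_16_Weights.DEFAULT)
        model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)
    return model

model = build_classifier("cnn")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# a standard fine-tuning loop over (cabin_image, count_label) batches follows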
In this paper, we address the task of detecting honey bees inside a beehive using computer vision, with the goal of monitoring their activity. Conventionally, beekeepers monitor honey bee activity by watching colony entrances or by opening colonies and examining bee movement and behavior during inspections. However, these methods either miss important information or alter honey bee behavior. We therefore installed simple cameras and IR lighting inside honey bee colonies for a proof-of-concept study of whether deep-learning techniques could assist in-hive observation. However, lighting conditions vary widely across beehives, leading to varied appearances of both the beehive backgrounds and the honey bees, which significantly degrades the performance of detection with deep neural networks. In this paper, we propose to apply motion-based domain randomization to train honey bee detectors for use inside the beehive. Our experiments were conducted on images captured from beehives both seen and unseen during training. The results show that our proposed method boosts honey bee detection performance, especially for small bees, which are more likely to be affected by lighting conditions.
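One plausible building block for such motion-driven domain randomization is sketched below, as an assumption rather than the paper's actual pipeline: moving bees are segmented by frame differencing, and the foreground can then be composited onto randomized backgrounds to decouple the detector from hive-specific lighting. All thresholds and kernel sizes are illustrative.

import cv2
import numpy as np

def moving_bee_mask(prev_frame, frame, thresh=25):
    """Binary mask of pixels that moved between consecutive hive frames."""
    g0 = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(g0, g1)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    # close small holes so each bee forms a single blob
    kernel = np.ones((5, 5), np.uint8)
    return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

def randomize_background(frame, mask, background):
    """Paste the moving (bee) pixels onto a randomized background image."""
    background = cv2.resize(background, (frame.shape[1], frame.shape[0]))
    return np.where(mask[..., None] > 0, frame, background)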
The Driver Monitoring System (DMS) presented in this work aims to enhance road safety by continuously monitoring a driver's behavior and emotional state during vehicle operation. The system uses computer vision and machine learning techniques to analyze the driver's face and actions, providing real-time alerts to mitigate potential hazards. The primary components of the DMS are gaze detection, emotion analysis, and phone usage detection. The system tracks the driver's eye movements to detect drowsiness and distraction through blink patterns and eye-closure durations. The DMS employs deep learning models to analyze the driver's facial expressions and extract dominant emotional states. When emotional distress is detected, the system offers calming verbal prompts to help maintain driver composure. Detected phone usage triggers visual and auditory alerts to discourage distracted driving. Together, these features create a comprehensive driver monitoring solution that helps prevent accidents caused by drowsiness, distraction, and emotional instability. The system's effectiveness is demonstrated through real-time test scenarios, and its potential impact on road safety is discussed.
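One common way to turn eye landmarks into the blink and eye-closure signals described above is the eye aspect ratio (EAR); the paper does not state its exact formulation, so the sketch below is an illustrative stand-in with assumed thresholds. Here `eye` is six (x, y) landmarks per eye in the standard 68-point ordering.

import numpy as np

EAR_THRESHOLD = 0.21      # assumed closed-eye threshold
CLOSED_FRAMES_ALARM = 48  # assumed frames (~1.6 s at 30 fps) before alerting

def eye_aspect_ratio(eye: np.ndarray) -> float:
    v1 = np.linalg.norm(eye[1] - eye[5])  # vertical eyelid distances
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])   # horizontal eye width
    return (v1 + v2) / (2.0 * h)

class ClosureMonitor:
    def __init__(self):
        self.closed_run = 0

    def update(self, ear: float) -> bool:
        """True once eye closure has lasted long enough to raise an alert."""
        self.closed_run = self.closed_run + 1 if ear < EAR_THRESHOLD else 0
        return self.closed_run >= CLOSED_FRAMES_ALARM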
Metrology plays a critical role in the rapid progress of Artificial Intelligence (AI), particularly in computer vision. This article explores the importance of metrology in image synthesis for computer vision tasks, with a particular focus on object detection for quality control. The aim is to improve the accuracy, reliability, and quality of AI models. Through the use of precise measurements, standards, and calibration techniques, a carefully constructed dataset has been generated and used to train AI models. By incorporating metrology into AI models, we aim to improve their overall performance and robustness.
We present a novel metric, the Spatial Recall Index, to assess the performance of machine-learning (ML) algorithms for automotive applications, focusing on where in the image a given performance level occurs. Typical metrics like intersection-over-union (IoU), precision-recall curves, or average precision (AP) quantify performance over a whole database of images, neglecting spatial performance variations. But as the optics of camera systems are spatially variable over the field of view, the performance of ML-based algorithms is also a function of space, which we show in simulation: we model a realistic objective lens based on a Cooke triplet that exhibits typical optical aberrations like astigmatism and chromatic aberration, all varying over the field. The model is then applied to a subset of the BDD100k dataset using spatially-varying kernels. We then quantify local changes in the performance of the pre-trained Mask R-CNN algorithm. Our examples demonstrate that the performance of ML-based algorithms depends spatially on the optical quality over the field of view, highlighting the need to take the spatial dimension into account when training ML-based algorithms, especially with a view toward autonomous driving applications.
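A minimal sketch in the spirit of such a spatially resolved metric is given below, assuming one simple realization: the image plane is split into a grid and recall is accumulated per cell from matched and missed ground-truth boxes. The published definition may differ; the grid size and matching criterion are assumptions.

import numpy as np

def spatial_recall_map(gt_boxes, matched_flags, img_w, img_h, grid=(8, 8)):
    """gt_boxes: (N, 4) [x1, y1, x2, y2]; matched_flags: (N,) bool,
    True if the ground-truth box was detected (e.g. IoU >= 0.5)."""
    tp = np.zeros(grid)
    total = np.zeros(grid)
    for (x1, y1, x2, y2), hit in zip(gt_boxes, matched_flags):
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2   # box center
        i = min(int(cy / img_h * grid[0]), grid[0] - 1)
        j = min(int(cx / img_w * grid[1]), grid[1] - 1)
        total[i, j] += 1
        tp[i, j] += float(hit)
    with np.errstate(invalid="ignore"):
        return tp / total  # NaN where a cell contains no ground truth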
Diet is an important aspect of our health. Good dietary habits can contribute to the prevention of many diseases and improve overall quality of life. To better understand the relationship between diet and health, image-based dietary assessment systems have been developed to collect dietary information. We introduce the Automatic Ingestion Monitor (AIM), a device that can be attached to one's eyeglasses. It provides an automated, hands-free approach to capturing eating-scene images. While AIM has several advantages, the images it captures are sometimes blurry, which can significantly degrade the performance of food image analysis tasks such as food detection. In this paper, we propose an approach to pre-process images collected by the AIM imaging sensor by rejecting extremely blurry images to improve the performance of food detection.
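A hedged sketch of such blur-based rejection follows; the variance of the Laplacian is a standard sharpness proxy, though the AIM pipeline's actual metric and threshold are not specified here and are assumed.

import cv2

BLUR_THRESHOLD = 100.0  # assumed cutoff, tuned per sensor in practice

def is_too_blurry(image_bgr) -> bool:
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    return sharpness < BLUR_THRESHOLD

# images failing the check are dropped before food detection runs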
With the growing demand for robust object detection algorithms in self-driving systems, it is important to consider the varying lighting and weather conditions in which cars operate all year round. The goal of our work is to gain a deeper understanding of meaningful strategies for selecting and merging training data from currently available databases and self-annotated videos in the context of automotive night scenes. We retrain an existing convolutional neural network (YOLOv3) to study the influence of different training dataset combinations on the final object detection results in nighttime and low-visibility traffic scenes. Our evaluation shows that a suitable selection of training data from the GTSRD, VIPER, and BDD databases, in conjunction with self-recorded night scenes, can achieve an mAP of 63.5% for ten object classes, an improvement of 16.7% over the performance of the original YOLOv3 network on the same test set.
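Merging heterogeneous datasets like these requires mapping each source's class names into one unified label space before retraining. The sketch below shows one plausible way to do this; the per-source class names and the unified ten-class list are illustrative assumptions, not the paper's actual mapping.

UNIFIED_CLASSES = ["car", "truck", "bus", "person", "bicycle",
                   "motorcycle", "traffic light", "traffic sign",
                   "rider", "train"]

CLASS_MAP = {          # per-source name -> unified name (assumed)
    "bdd":   {"car": "car", "pedestrian": "person", "traffic sign": "traffic sign"},
    "viper": {"automobile": "car", "person": "person"},
}

def remap(source: str, annotations):
    """annotations: list of (class_name, x, y, w, h) in YOLO format."""
    out = []
    for name, *box in annotations:
        unified = CLASS_MAP[source].get(name)
        if unified is not None:               # drop unmapped classes
            out.append((UNIFIED_CLASSES.index(unified), *box))
    return out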
Deep learning has significantly improved the accuracy and robustness of computer vision techniques but is fundamentally limited by access to training data. Pretrained networks and public datasets have enabled the building of many applications with minimal data collection. However, these datasets are often biased: they largely contain images with conventional poses of common objects (e.g., cars, furniture, dogs, cats, etc.). In specialized applications such as user assistance for servicing complex equipment, the objects in question are often not represented in popular datasets (e.g., the fuser roll assembly in a printer) and require a variety of unusual poses and lighting conditions, making the training of these applications expensive and slow. To overcome these limitations, we propose a fast labeling tool using an Augmented Reality (AR) platform that leverages the 3D geometry and tracking afforded by modern AR systems. Our technique, which we call WARHOL, allows a user to mark boundaries of an object once in world coordinates and then automatically project these to an enormous range of poses and conditions. Our experiments show that object labeling using WARHOL achieves 90% of the localization accuracy in object detection tasks with only 5% of the labeling effort compared to manual labeling. Crucially, WARHOL also allows the annotation of objects with parts that have multiple states (e.g., drawers open or closed, removable parts present or not) with minimal extra user effort. WARHOL also improves on typical object detection bounding boxes using a bounding box refinement network to create perspective-aligned bounding boxes that dramatically improve the localization accuracy and interpretability of detections.
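The core geometric step behind this kind of AR-assisted labeling is sketched below under stated assumptions: boundary points marked once in world coordinates are projected into any tracked camera view using the pose and intrinsics the AR system provides. This is an illustration, not the WARHOL implementation itself, and all variable names are assumptions.

import numpy as np

def project_points(points_world, T_world_to_cam, K):
    """points_world: (N, 3); T_world_to_cam: (4, 4) pose from AR tracking;
    K: (3, 3) camera intrinsics. Returns (N, 2) pixel coordinates."""
    n = points_world.shape[0]
    homog = np.hstack([points_world, np.ones((n, 1))])      # (N, 4)
    cam = (T_world_to_cam @ homog.T).T[:, :3]               # camera frame
    pix = (K @ cam.T).T
    return pix[:, :2] / pix[:, 2:3]                         # perspective divide

def bounding_box(pixels):
    """Axis-aligned 2D box around the projected boundary points."""
    x1, y1 = pixels.min(axis=0)
    x2, y2 = pixels.max(axis=0)
    return x1, y1, x2, y2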
CMOS image sensors play a vital role in the exponentially growing field of Artificial Intelligence (AI). Applications like image classification, object detection, and tracking are just some of the many problems now solved with the help of AI, and specifically deep learning. In this work, we target image classification to discern between six categories of fruits: fresh/rotten apples, fresh/rotten oranges, and fresh/rotten bananas. Using images captured with high-speed CMOS sensors along with lightweight CNN architectures, we show results on various edge platforms. Specifically, we show results using ON Semiconductor's global-shutter-based 12 MP, 90-frames-per-second XGS-12 image sensor and ON Semiconductor's 13 MP AR1335 image sensor feeding into MobileNetV2, implemented on NVIDIA Jetson platforms. In addition to the data captured with these sensors, we utilize an open-source fruits dataset to increase the number of training images. For image classification, we train our model on approximately 30,000 RGB images from the six categories of fruits. The model achieves an accuracy of 97% on edge platforms using ON Semiconductor's 13 MP camera with the AR1335 sensor. In addition to the image classification model, work is in progress to improve the accuracy of object detection using SSD and SSDLite with MobileNetV2 as the feature extractor. In this paper, we show preliminary results from the object detection model for the same six categories of fruits.
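A minimal sketch of the classification setup described above follows: a MobileNetV2 head replaced for the six fruit-freshness classes. The framework choice and all training details are assumptions, not the paper's reported settings.

import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6  # fresh/rotten x apples/oranges/bananas

model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
model.classifier[1] = nn.Linear(model.last_channel, NUM_CLASSES)
# fine-tune on the ~30,000 RGB fruit images, then export the trained
# model (e.g. via TensorRT) for deployment on an NVIDIA Jetson device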
To achieve one of the tasks required of disaster response robots, this paper proposes a method for locating the points on 3D-structured switches that the robot must press in disaster sites, using RGB-D images acquired by a Kinect sensor attached to our disaster response robot. Our method consists of the following five steps: 1) Obtain RGB and depth images using an RGB-D sensor. 2) Detect the bounding box of the switch area in the RGB image using YOLOv3. 3) Generate 3D point cloud data of the target switch by combining the bounding box and the depth image. 4) Detect the center position of the switch button within the bounding box in the RGB image using a convolutional neural network (CNN). 5) Estimate the center of the button's face in real space from the detection result of step 4) and the 3D point cloud data generated in step 3). In the experiment, the proposed method is applied to two types of 3D-structured switch boxes to evaluate its effectiveness. The results show that our proposed method can locate the switch button accurately enough for robot operation.
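As a hedged illustration of step 5), the sketch below recovers a 3D position in the camera frame from the detected button-center pixel and its aligned depth value using a pinhole model; the Kinect intrinsics shown are placeholders, not calibrated values from the paper.

def deproject(u, v, depth_m, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Pixel (u, v) with depth in meters -> (X, Y, Z) in the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return x, y, depth_m

# example: button center predicted by the CNN at pixel (412, 236)
# with a measured depth of 0.85 m
x, y, z = deproject(412, 236, 0.85)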