Object detection from aerial drone imagery has received a great deal of attention in recent years. While visible-light images are adequate for detecting objects in most scenarios, thermal cameras can extend object detection to night-time scenes and occluded objects. As such, RGB and infrared (IR) fusion methods for object detection are useful and important. One of the biggest challenges in applying deep learning methods to RGB/IR object detection is the lack of training data for drone IR imagery, especially at night. In this paper, we develop several strategies for creating synthetic IR images using the AirSim simulation engine and CycleGAN. Furthermore, we employ an illumination-aware fusion framework to fuse RGB and IR images for detecting objects on the ground. We characterize and test our methods on both simulated and real data. Our solution runs on an NVIDIA Jetson Xavier onboard an actual drone, requiring about 28 milliseconds of processing per RGB/IR image pair.
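As a concrete illustration of the fusion idea, the following is a minimal sketch of illumination-aware gated fusion: a small network predicts a day/night weight from the RGB frame and mixes the RGB and IR feature maps accordingly. The module names and layer sizes here are illustrative assumptions, not the system described above.

```python
# Minimal sketch of illumination-aware RGB/IR feature fusion (assumed form).
import torch
import torch.nn as nn

class IlluminationGate(nn.Module):
    """Predicts a scalar illumination weight w in [0, 1] from the RGB image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1), nn.Sigmoid(),
        )

    def forward(self, rgb):
        return self.net(rgb)  # shape (B, 1)

def fuse_features(f_rgb, f_ir, w):
    """Convex combination of RGB and IR feature maps, gated by illumination."""
    w = w.view(-1, 1, 1, 1)            # broadcast over channels and space
    return w * f_rgb + (1.0 - w) * f_ir

# Usage: at night w tends toward 0, so the IR branch dominates the fusion.
gate = IlluminationGate()
rgb = torch.rand(2, 3, 256, 256)
f_rgb, f_ir = torch.rand(2, 64, 32, 32), torch.rand(2, 64, 32, 32)
fused = fuse_features(f_rgb, f_ir, gate(rgb))
```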
Diagnosing ligament injuries from MRI scans is a labor-intensive task that requires an expert. In this paper, we propose a fully recurrent neural network (RNN) for detecting Anterior Cruciate Ligament (ACL) tears in MRI scans. The proposed network localizes the ACL and classifies it into several categories: ACL tear, normal tear, and healthy. Existing detection methods apply deep learning networks to single MRI sections and thus lose 3D spatial context. To address this, we propose a fully recurrent network that processes a sequence of sections through the 3D scan and so captures 3D spatial context. The proposed network is based on a YOLOv3 backbone and produces a sequence of decisions that are then combined by majority voting. Experimental results show improvement over state-of-the-art methods.
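The majority-voting step at the end of the pipeline can be illustrated with a short sketch. This is an assumed minimal form of the vote, not the authors' implementation, and the per-section labels below are placeholders.

```python
# Minimal sketch of combining per-section decisions by majority vote.
from collections import Counter

def majority_vote(section_labels):
    """Return the most frequent per-section label as the scan-level decision."""
    counts = Counter(section_labels)
    label, _ = counts.most_common(1)[0]
    return label

# Example: a scan whose sections mostly vote "tear" is classified as a tear.
print(majority_vote(["healthy", "tear", "tear", "tear", "healthy"]))  # -> "tear"
```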
The field of image and video quality assessment has enjoyed rapid development over the last two decades. Several datasets and algorithms have been designed to understand the effects of common distortions on the subjective experiences of human observers. The distortions present in these datasets may be synthetic (artificially applied blur, compression, noise, etc.) or authentic (in-capture lens flare, motion blur, under/overexposure, etc.). The goal of quality assessment is often to quantify the loss of visual "naturalness" caused by the distortion(s). We have recently created a new resource called LIVE-RoadImpairs, a novel image quality dataset consisting of authentically distorted images of roadways. We use the dataset to develop a no-reference quality assessment algorithm that predicts the failure rates of object-detection algorithms. This work was among the overall winners of the PSCR Enhancing Computer Vision for Safety Challenge.
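One plausible way to check that a no-reference quality score predicts detector failure rates is to correlate the two across images. The sketch below assumes per-image quality scores and observed miss rates are available; the arrays are purely illustrative and are not data from LIVE-RoadImpairs.

```python
# Minimal sketch: validating a quality score against detector failure rates.
import numpy as np
from scipy.stats import spearmanr

quality_scores = np.array([72.1, 55.4, 90.2, 40.8, 63.5])  # NR-IQA predictions (illustrative)
failure_rates  = np.array([0.18, 0.35, 0.05, 0.52, 0.27])  # observed detector misses (illustrative)

# A useful quality predictor should correlate (negatively) with failure rate.
rho, p = spearmanr(quality_scores, failure_rates)
print(f"Spearman rho = {rho:.3f} (p = {p:.3g})")
```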
In recent years, deep neural networks (DNNs) have achieved impressive success in various applications, including autonomous-driving perception tasks. However, current deep neural networks are easily fooled by adversarial attacks. This vulnerability raises significant concerns, particularly in safety-critical applications. As a result, research into attacking and defending DNNs has received considerable attention. In this work, detailed adversarial attacks are applied to a diverse multi-task visual perception deep network spanning distance estimation, semantic segmentation, motion detection, and object detection. The experiments consider both white-box and black-box attacks in targeted and untargeted settings, attacking one task and inspecting the effect on all the others, in addition to inspecting the effect of applying a simple defense method. We conclude the paper by comparing and discussing the experimental results, and by proposing insights and future work. Visualizations of the attacks are available at https://drive.google.com/file/d/1NKhCL2uC_SKam3H05SqjKNDE_zgvwQS-/view?usp=sharing
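As an example of the attack family discussed, the following is a minimal sketch of a one-step untargeted white-box attack (FGSM) against a single task head. The toy model and segmentation-style loss are assumptions for illustration and stand in for the multi-task network evaluated above.

```python
# Minimal sketch of an untargeted FGSM attack (assumed illustrative setup).
import torch
import torch.nn as nn

def fgsm_attack(model, x, target, loss_fn, epsilon=0.01):
    """Perturb x one step in the sign-gradient direction of the task loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), target)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()    # gradient ascent on the loss
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in the valid range

# Usage with a toy stand-in model and a segmentation-style loss:
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 2, 1))
x = torch.rand(1, 3, 64, 64)
target = torch.zeros(1, 64, 64, dtype=torch.long)
x_adv = fgsm_attack(model, x, target, nn.CrossEntropyLoss())
```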