The rise of cheaper and more accurate genotyping techniques has led to significant advances in understanding the genotype-phenotype map. However, progress is currently bottlenecked by manually intensive or slow phenotype data collection. We propose an algorithm to automatically estimate the canopy height of a row of plants in field conditions in a single pass of a moving robot. A downward-pointing stereo sensor collects a series of stereo image pairs. The depth images are converted to height-above-ground images, from which height contours are extracted. The separate height contours corresponding to each frame are then concatenated to construct a single height contour representing one row of plants in the plot. Since the process is automated, data can be collected throughout the growing season with very little manual labor, complementing the already abundant genotypic data. Using experimental data from seven plots, we show that our proposed approach achieves a height estimation error of approximately 3.3%.
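A minimal sketch of the height-estimation pipeline described above, in Python with NumPy. All names, the camera height, and the noise threshold are illustrative assumptions, and real frames would need to be aligned (e.g., using odometry) before concatenation:

```python
import numpy as np

def depth_to_height(depth, camera_height):
    """Convert a downward-looking depth image (meters to each pixel's
    surface) into a height-above-ground image."""
    return camera_height - depth

def canopy_contour(height_img, noise_floor=0.05):
    """Take the per-column maximum height as the canopy contour,
    ignoring near-ground noise below `noise_floor` meters."""
    h = np.where(height_img > noise_floor, height_img, 0.0)
    return h.max(axis=0)  # one height value per image column

# Hypothetical example: three consecutive frames from one pass over a row.
camera_height = 2.0  # meters above ground (assumed calibration value)
frames = [np.random.uniform(0.8, 2.0, size=(480, 640)) for _ in range(3)]
contours = [canopy_contour(depth_to_height(d, camera_height)) for d in frames]

# Concatenate the per-frame contours into a single row-level contour.
# A real system would align overlapping frames before concatenation.
row_contour = np.concatenate(contours)
print("estimated plot canopy height: %.2f m" % np.percentile(row_contour, 90))
```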
In recent years, deep learning methods have been shown to be effective for image classification, localization, and detection. Convolutional Neural Networks (CNNs) are used to extract information from images and are the main element of modern machine learning and computer vision methods. CNNs can be used for logo detection and recognition. Logo detection consists of locating and recognizing commercial brand logos within an image. These methods are useful in areas such as online brand management and ad placement. The performance of these methods depends closely on the quantity and quality of the data, typically image/label pairs, used to train the CNNs. Collecting these pairs of images and labels, commonly referred to as ground truth, can be expensive and time-consuming. Multiple techniques try to solve this problem by either transforming the available data using data augmentation methods or by creating new images from scratch or from other images using image synthesis methods. In this paper, we investigate the latter approach. We segment background images, extract depth information, and then blend logo images accordingly in order to create new realistic-looking images. This approach allows us to create an indefinite number of images with minimal manual labeling effort. The synthetic images can later be used to train CNNs for logo detection and recognition.
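A minimal sketch of the synthesis step, assuming OpenCV and NumPy. The `blend_logo` function, the fixed alpha, and the placement parameters are illustrative; the paper's actual method uses segmentation and depth information to place and blend logos:

```python
import cv2
import numpy as np

def blend_logo(background, logo, x, y, scale=1.0):
    """Paste a logo into a background with alpha blending and return
    the composite plus a ground-truth bounding box (x, y, w, h)."""
    h = int(logo.shape[0] * scale)
    w = int(logo.shape[1] * scale)
    logo = cv2.resize(logo, (w, h), interpolation=cv2.INTER_AREA)
    roi = background[y:y + h, x:x + w].astype(np.float32)
    alpha = 0.9  # simple global alpha; the paper blends using scene depth
    blended = alpha * logo.astype(np.float32) + (1 - alpha) * roi
    background[y:y + h, x:x + w] = blended.astype(np.uint8)
    return background, (x, y, w, h)

# Hypothetical usage with synthetic arrays standing in for real images.
bg = np.full((480, 640, 3), 128, np.uint8)
logo = np.zeros((60, 120, 3), np.uint8)
img, box = blend_logo(bg, logo, x=200, y=150, scale=1.0)
print("synthetic training sample with label box:", box)
```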
A flashover occurs when a fire spreads very rapidly through crevices due to intense heat. Flashovers present one of the most frightening and challenging fire phenomena to those who regularly encounter them: firefighters. Firefighters' safety and lives often depend on their ability to predict flashovers before they occur. Typical pre-flashover fire characteristics include dark smoke, high heat, and rollover ("angel fingers"), and can be quantified by color, size, and shape. Using a color video stream from a firefighter's body camera, we applied generative adversarial networks (GANs) for image enhancement. The networks were trained to enhance very dark fire and smoke patterns in videos and to monitor dynamic changes in smoke and fire areas. Preliminary tests with a limited set of flashover training videos showed that we could predict a flashover as early as 55 seconds before it occurred.
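A minimal sketch of the monitoring loop, assuming OpenCV. Gamma correction stands in for the trained GAN enhancement, and the HSV fire-color range, growth threshold, and file name are illustrative assumptions:

```python
import cv2
import numpy as np

def enhance(frame, gamma=0.5):
    """Stand-in for the learned enhancement step: brighten dark
    regions with gamma correction (the paper uses a trained GAN)."""
    lut = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    return cv2.LUT(frame, lut)

def fire_area(frame):
    """Rough fire-pixel count via an HSV color threshold (assumed range)."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 120, 150), (35, 255, 255))
    return int(mask.sum() // 255)

# Monitor frame-to-frame growth of the fire region in a video stream.
cap = cv2.VideoCapture("bodycam.mp4")  # hypothetical input file
prev = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    area = fire_area(enhance(frame))
    if prev is not None and prev > 0 and area / prev > 1.5:
        print("rapid fire-area growth: possible pre-flashover signature")
    prev = area
cap.release()
```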
Plant phenotyping, or the measurement of plant traits such as stem width and plant height, is a critical step in the development and evaluation of higher-yield biofuel crops. Phenotyping allows biologists to quantitatively estimate the biomass of plant varieties and therefore their potential for biofuel production. Manual phenotyping is costly, time-consuming, and error-prone, requiring a person to walk through the fields measuring individual plants with a tape measure and notebook. In this work, we describe an alternative system consisting of an autonomous robot equipped with two infrared cameras that travels through fields, collecting 2.5D image data of sorghum plants. We develop novel image-processing algorithms to estimate plant height and stem width from the image data. Our proposed method has the advantage of working in situ using images of plants from only one side. This allows phenotypic data to be collected nondestructively throughout the growing cycle, providing biologists with valuable information on crop growth patterns. Our approach first estimates plant heights and stem widths from individual frames. It then uses tracking algorithms to refine these estimates across frames and avoid double counting the same plant in multiple frames. The result is a histogram of stem widths and plant heights for each plot of a particular genetically engineered sorghum variety. In-field testing and comparison with human-collected ground-truth data demonstrates that our system achieves 13% average absolute error for stem width estimation and 15% average absolute error for plant height estimation.
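A minimal sketch of the per-frame estimation and cross-frame merging logic, in Python with NumPy. The base-row heuristic, thresholds, and merging rule are illustrative stand-ins for the paper's image-processing and tracking algorithms:

```python
import numpy as np

def frame_estimates(height_img, px_per_cm):
    """Per-frame estimates from a 2.5D height image (meters above ground,
    image bottom assumed to be the ground line): plant height is the
    tallest above-ground pixel; stem width is measured near the base."""
    plant_height_cm = height_img.max() * 100.0
    base_row = height_img[-10, :]               # a row ~10 px above ground
    stem_width_cm = np.count_nonzero(base_row > 0.02) / px_per_cm
    return plant_height_cm, stem_width_cm

def merge_tracks(per_frame, tol_cm=5.0):
    """Naive cross-frame merging: consecutive detections whose heights
    agree within tol_cm are treated as the same plant and averaged,
    standing in for the paper's tracking step."""
    merged = []
    for h, w in per_frame:
        if merged and abs(merged[-1][0] - h) < tol_cm:
            ph, pw = merged[-1]
            merged[-1] = ((ph + h) / 2, (pw + w) / 2)
        else:
            merged.append((h, w))
    return merged

# Hypothetical detections from three consecutive frames of one plot.
per_frame = [(182.0, 2.1), (183.5, 2.0), (150.0, 1.6)]
print(merge_tracks(per_frame))  # -> two plants, not three
```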
A playbook in American football can consist of hundreds of plays, and learning each play and the corresponding assignments and responsibilities is a major challenge for players. In this paper, we propose a teaching tool for American football coaches based on computer vision and visualization techniques that eases the learning process and helps players gain deeper knowledge of the underlying concepts. Coaches can create, manipulate, and animate plays with adjustable parameters that affect the player actions in the animation. The general player behaviors and interactions between players are modeled based on expert knowledge. The final goal of the framework is to compare the theoretical concepts with their practical implementation in training and games, using computer vision algorithms that extract spatio-temporal motion patterns from corresponding real video material. First results indicate that the software can be used effectively by coaches and that the animation system can increase players' understanding of critical moments of a play.
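A minimal sketch of a play representation such a tool might use, in Python. The class names, field layout, and waypoint format are illustrative assumptions, not the framework's actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class Assignment:
    """One player's assignment: a route as timed field waypoints."""
    position: str                               # e.g. "WR", "QB"
    route: list = field(default_factory=list)   # [(t_s, x_yd, y_yd), ...]

@dataclass
class Play:
    name: str
    assignments: list

# Hypothetical two-player snippet of a play; the animation system would
# interpolate each route and apply behavior models between waypoints.
slant = Play("Slant Right", [
    Assignment("QB", [(0.0, 0.0, 0.0), (1.5, -3.0, 0.0)]),
    Assignment("WR", [(0.0, 10.0, 0.0), (1.0, 10.0, 5.0), (2.5, 4.0, 12.0)]),
])
print(f"{slant.name}: {len(slant.assignments)} assignments")
```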
This paper shows that the implementation of vision systems benefits from the use of sensing front-end chips with embedded pre-processing capabilities, called CVIS. Such embedded pre-processors reduce the amount of data to be delivered for subsequent processing. This strategy, which is also adopted by natural vision systems, relaxes system-level requirements regarding data storage and communications and enables highly compact and fast vision systems. The paper presents several proof-of-concept CVIS chips with embedded pre-processing and illustrates their potential advantages.
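A minimal software sketch of the idea, in Python with NumPy: pre-processing at the sensor (here, block binning plus a contrast test) leaves far fewer values to transmit. The block size and threshold are illustrative assumptions, not parameters of the CVIS chips:

```python
import numpy as np

def onchip_preprocess(raw, block=4, thresh=16):
    """Simulate a CVIS-style focal-plane pre-processor: bin pixels into
    block averages and keep only blocks whose local contrast exceeds a
    threshold, so far fewer values leave the sensor."""
    h, w = raw.shape
    binned = raw[:h - h % block, :w - w % block].reshape(
        h // block, block, w // block, block).mean(axis=(1, 3))
    active = np.abs(binned - binned.mean()) > thresh
    events = np.argwhere(active)          # sparse (row, col) "events"
    return binned, events

raw = np.random.randint(0, 256, (480, 640)).astype(np.float32)
binned, events = onchip_preprocess(raw)
print("pixels in: %d, values out: %d" % (raw.size, len(events)))
```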
Recent progress in deep learning methods has shown that key steps in object detection and recognition, including feature extraction, region proposals, and classification, can be performed using Convolutional Neural Networks (CNNs) with high accuracy. However, the use of CNNs for object detection and recognition poses significant technical challenges that still need to be addressed. One of the most daunting problems is the very large number of training images required for each class/label. One way to address this problem is through the use of data augmentation methods, where linear and nonlinear transforms are applied to the training data to create "new" training images. Typical transformations include spatial flipping, warping, and other deformations. An important principle of data augmentation is that the deformations applied to the labeled training images must not change the semantic meaning of the classes/labels. In this paper, we investigate several approaches to data augmentation. First, several data augmentation techniques are used to increase the size of the training dataset. Then, a Faster R-CNN is trained with the augmented dataset to detect and recognize objects. Our work focuses on two different scenarios: detecting objects in the wild (i.e., commercial logos) and detecting objects captured using a camera mounted on a computer system (i.e., toy animals).
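A minimal sketch of label-preserving augmentation, in Python with NumPy. The specific transforms and parameters are illustrative; for a detection task like Faster R-CNN training, the bounding boxes would have to be transformed along with the images:

```python
import numpy as np

def augment(img):
    """Yield label-preserving variants of a training image: flips plus
    a mild brightness shift (typical spatial/photometric transforms).
    For detection, bounding boxes must be flipped with the image."""
    yield img
    yield np.fliplr(img)                      # horizontal flip
    yield np.flipud(img)                      # vertical flip
    shifted = img.astype(np.int16) + 20       # brightness jitter
    yield np.clip(shifted, 0, 255).astype(np.uint8)

# Each variant keeps the original class label, so the augmented set can
# be fed directly to a detector such as Faster R-CNN.
img = np.random.randint(0, 256, (128, 128, 3), np.uint8)
augmented = list(augment(img))
print("1 labeled image ->", len(augmented), "training samples")
```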
The goal of the TCD3 project is to identify anomalous and dangerous driving patterns from traffic camera feeds. Successful execution can improve road safety by helping law enforcement catch dangerous drivers who text while driving or drive drunk. TCD3, in real time, uses Computer Vision to detect cars on the road, applies Machine Learning algorithms to identify cars exhibiting dangerous behaviors, and then notifies law enforcement of suspicious vehicles. The project overcomes several technical challenges, such as detecting vehicles under different lighting conditions, tracking vehicles across frames, and distinguishing random variations in a vehicle's path due to normal driving from anomalous variations due to distracted driving. TCD3's C++ script runs on a server and receives a live streaming traffic camera feed. A heuristic Computer Vision algorithm combines optical flow analysis, background subtraction, and feature extraction to reliably determine vehicle positions. A proprietary recursive matrix density-based method was created to clean sensor feeds, sizably improving detection accuracy over current morphological methods. Image registration allows a vehicle's path to be analyzed through multiple frames. A test suite of traffic camera footage was used to evaluate vehicle detection. Frames were doctored and drunk drivers were simulated to test the Machine Learning system; the algorithm was found to have 83% accuracy. Machine Learning was used for historical and active comparative analyses of vehicle paths to identify anomalies. The system is contextually aware and is robust to normal irregularities in traffic patterns, such as those caused by red lights. Permission for large-scale testing of the prototype on actual high-fidelity traffic camera footage has been requested. Upon detection, the relevant video clip will be extracted and sent to law enforcement for further action. To increase affordability, processing speed, and scalability, a multi-node networked Spark-based supercomputing architecture is being investigated. TCD3 is multi-threaded for maximum resource utilization. The project website is at drunkdriverdetection.com.
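A minimal sketch of the detection front end, assuming OpenCV. It uses the standard MOG2 background subtractor and contour extraction in place of the project's proprietary cleaning method, and the feed name and area threshold are illustrative assumptions:

```python
import cv2
import numpy as np

# Minimal sketch: background subtraction followed by contour extraction
# to localize vehicles in each frame of a traffic camera feed.
cap = cv2.VideoCapture("traffic_feed.mp4")   # hypothetical camera feed
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        if cv2.contourArea(c) > 500:          # ignore small blobs
            x, y, w, h = cv2.boundingRect(c)
            centroids.append((x + w // 2, y + h // 2))
    # Per-frame centroids would then be associated across frames, and the
    # resulting paths scored for anomalous lateral deviation.
cap.release()
```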