Computer vision algorithms are often difficult to deploy on embedded hardware due to integration time and system complexity. Many commercial systems prevent low-level image processing customization and hardware optimization due to the largely proprietary nature of their algorithms and architectures, hindering research and development by the larger community. This work presents DevCAM, an open-source multi-camera environment targeted at hardware-software research for vision systems, specifically co-located multi-sensor processor systems. The objective is to facilitate the integration of multiple latest-generation sensors, abstract the difficulties of interfacing with high-bandwidth sensors, enable user-defined hybrid processing architectures spanning FPGA, CPU, and GPU, and unite multi-module systems with networking and high-speed storage. The system architecture can accommodate up to six 4-lane MIPI sensor modules that are electronically synchronized, alongside support for an RTK-GPS receiver and a 9-axis IMU. We demonstrate a number of configurations that can be achieved for stereo, quadnocular, 360-degree, and light-field image acquisition tasks. The development framework includes mechanical, PCB, FPGA, and software components for rapid integration into any system. System capabilities are demonstrated with a focus on opening new research frontiers such as distributed edge processing, inter-system synchronization, sensor synchronization, and hybrid hardware acceleration of image processing tasks.
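To illustrate the interfacing burden that such a framework abstracts, the short sketch below estimates the aggregate raw bandwidth of a fully populated six-sensor configuration; the per-lane MIPI D-PHY rate is an assumed figure for illustration, not a DevCAM specification.

```python
# Back-of-the-envelope bandwidth estimate for six 4-lane MIPI sensor modules.
# The 2.5 Gbps per-lane rate is an assumed D-PHY speed, not a DevCAM figure.
LANES_PER_SENSOR = 4
LANE_RATE_GBPS = 2.5
NUM_SENSORS = 6

per_sensor_gbps = LANES_PER_SENSOR * LANE_RATE_GBPS
total_gbps = NUM_SENSORS * per_sensor_gbps
print(f"per sensor: {per_sensor_gbps:.1f} Gbps, system total: {total_gbps:.1f} Gbps")
# per sensor: 10.0 Gbps, system total: 60.0 Gbps
```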
Autonomous driving plays a crucial role in preventing accidents, and modern vehicles are equipped with multimodal sensor systems and AI-driven perception and sensor fusion. These features are, however, not stable over a vehicle's lifetime due to various forms of degradation. This introduces an inherent, yet unaddressed risk: once vehicles are in the field, their individual exposure to environmental effects leads to unpredictable behavior. The goal of this paper is to raise awareness of automotive sensor degradation. Various effects exist, which in combination may have a severe impact on AI-based processing and ultimately on the customer domain. Failure mode and effects analysis (FMEA)-type approaches are used to structure complete coverage of relevant automotive degradation effects. The sensors considered include cameras, RADARs, LiDARs, and other modalities, both exterior and in-cabin. Sensor robustness alone is a well-known topic that is addressed by DV/PV. However, this is not sufficient, and various degradations are examined that go significantly beyond currently tested environmental stress scenarios. In addition, the combination of sensor degradation and its impact on AI processing is identified as a validation gap. An outlook on future analysis and ways to detect relevant sensor degradations is also presented.
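As an illustration of how an FMEA-style structure can prioritize degradation effects, the sketch below ranks a few hypothetical degradation modes by the standard risk priority number (severity times occurrence times detection); the listed modes and scores are invented for illustration and are not results from this paper.

```python
# Hypothetical FMEA-style ranking of sensor degradation modes by
# risk priority number (RPN = severity * occurrence * detection).
degradation_modes = [
    # (mode, severity 1-10, occurrence 1-10, detection 1-10) -- illustrative values
    ("camera lens hazing",        7, 5, 6),
    ("RADAR radome paint damage", 6, 3, 7),
    ("LiDAR window scratches",    8, 4, 5),
    ("in-cabin camera smudging",  4, 6, 3),
]

for mode, s, o, d in sorted(degradation_modes,
                            key=lambda m: m[1] * m[2] * m[3], reverse=True):
    print(f"{mode:28s} RPN = {s * o * d}")
```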
Proposed for the first time is a novel calibration-empowered, minimalistic multi-exposure image processing technique that uses measured sensor pixel voltage output and exposure time factor limits for robust camera linear dynamic range extension. The technique exploits the best linear response region of an overall nonlinear-response image sensor to robustly recover, via minimal-count multi-exposure image fusion, the true and precisely scaled High Dynamic Range (HDR) irradiance map. CMOS sensor-based experiments using a measured 44 dB Low Dynamic Range (LDR) linear region and a minimum of two multi-exposure images provide robust recovery of 78 dB HDR, low-contrast, highly calibrated test targets.
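A minimal sketch of the kind of linear-region, two-exposure fusion described above is shown below, assuming the calibrated linear response bounds and the exposure-time ratio are known; the variable names and thresholds are illustrative rather than taken from the paper.

```python
import numpy as np

def fuse_two_exposures(img_short, img_long, exposure_ratio,
                       v_min=0.05, v_max=0.95):
    """Recover a scaled irradiance map from two exposures.

    img_short, img_long: pixel values normalized to [0, 1].
    exposure_ratio: t_long / t_short (> 1).
    v_min, v_max: bounds of the calibrated linear response region (assumed).
    """
    # Irradiance is proportional to pixel value divided by exposure time,
    # so both images are expressed on the short-exposure irradiance scale.
    irr_long = img_long / exposure_ratio
    irr_short = img_short

    # Prefer the long exposure (better SNR) wherever it stays in the linear
    # region; fall back to the short exposure elsewhere, e.g. where the
    # long exposure has saturated.
    long_ok = (img_long >= v_min) & (img_long <= v_max)
    return np.where(long_ok, irr_long, irr_short)
```

Under the common 20·log10 dynamic range convention, an exposure ratio R extends the usable range by roughly 20·log10(R) dB beyond the sensor's linear LDR region.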
Experimentally demonstrated for the first time is Coded Access Optical Sensor (CAOS) camera-empowered, robust and true white-light High Dynamic Range (HDR) scene low-contrast target image recovery over the full linear dynamic range. The 90 dB linear HDR scene uses a 16-element custom-designed test target with low-contrast, 6 dB step-scaled irradiances. Such camera performance is highly sought after in mission-critical, catastrophic-failure-avoidance HDR scenarios with embedded low-contrast targets.
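The 90 dB span follows directly from the target design: 16 elements separated by 6 dB steps cover 15 × 6 dB. A small worked check is given below; the ratio conversion assumes the common 20·log10 camera dynamic range convention, which is an assumption rather than a statement from the paper.

```python
import numpy as np

n_elements, step_db = 16, 6.0
levels_db = step_db * np.arange(n_elements)   # 0, 6, ..., 90 dB
print(levels_db[-1] - levels_db[0])           # 90.0 dB span across the target

# Assuming the common 20*log10 dynamic range convention, the brightest
# element is ~31,623 times the irradiance of the dimmest.
print(10 ** ((levels_db[-1] - levels_db[0]) / 20))
```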
The United States of America has an estimated 84,000 dams, of which approximately 15,500 were rated as high-risk as of 2016. Recurrent geological and structural health changes require dam assets to undergo continuous structural monitoring, assessment, and restoration. The developed system is targeted at evaluating the feasibility of standardizing remote, digital inspections of the outflow works of such assets to replace human visual inspections. This work proposes both a mobile inspection platform and an image processing pipeline to reconstruct 3D models of the outflow tunnel and gates of dams for structural defect identification. We begin by presenting the imaging system with consideration of lighting conditions and acquisition strategies. We then propose and formulate global optimization constraints that optimize system poses and geometric estimates of the environment. Following that, we present a RANSAC framework that fits geometric cylinder primitives for texture projection and geometric deviation analysis, as well as an interactive annotation framework for 3D anomaly marking. Results of the system and processing are demonstrated at the Blue Mountain Dam, Arkansas, and the F.E. Walter Dam, Pennsylvania.
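A minimal sketch of RANSAC-based cylinder primitive fitting in the spirit of the framework described above is given below, under the simplifying assumption that the tunnel axis is roughly aligned with the z-axis so the fit reduces to a circle in the projected cross-section; the inlier threshold and iteration budget are illustrative.

```python
import numpy as np

def circle_from_3_points(p1, p2, p3):
    """Circumcircle (center, radius) of three 2D points, or None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = p1, p2, p3
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    center = np.array([ux, uy])
    return center, np.linalg.norm(center - np.asarray(p1))

def ransac_cylinder(points_xyz, threshold=0.02, iters=2000, rng=None):
    """Fit a z-aligned cylinder (cx, cy, r) to a 3D point cloud via RANSAC."""
    rng = np.random.default_rng(rng)
    pts2d = np.asarray(points_xyz)[:, :2]     # project along the assumed axis
    best = (None, -1)
    for _ in range(iters):
        sample = pts2d[rng.choice(len(pts2d), 3, replace=False)]
        model = circle_from_3_points(*sample)
        if model is None:
            continue
        center, radius = model
        residuals = np.abs(np.linalg.norm(pts2d - center, axis=1) - radius)
        n_inliers = int((residuals < threshold).sum())
        if n_inliers > best[1]:
            best = ((center[0], center[1], radius), n_inliers)
    return best  # ((cx, cy, r), inlier count)
```

Once a primitive is accepted, the residuals to the cylinder surface provide a geometric deviation map onto which textures and annotations can be projected.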
Semi-supervised learning exploits underlying relationships in data when ground-truth labels are scarce. In this paper, we introduce an uncertainty quantification (UQ) method for graph-based semi-supervised multi-class classification problems. We not only predict the class label for each data point, but also provide a confidence score for the prediction. We adopt a Bayesian approach and propose a graphical multi-class probit model together with an effective Gibbs sampling procedure. Furthermore, we propose a confidence measure for each data point that correlates with the classification performance. We use the empirical properties of the proposed confidence measure to guide the design of a human-in-the-loop system. The uncertainty quantification algorithm and the human-in-the-loop system are successfully applied to classification problems in image processing and ego-motion analysis of body-worn videos.
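Below is a minimal sketch in the spirit of the graph-based Bayesian probit model with Gibbs sampling described above, shown for the binary case with an Albert-Chib style auxiliary-variable update; the prior form, hyperparameters, and the simple sample-agreement confidence score are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np
from scipy.stats import truncnorm

def gibbs_probit_ssl(W, labeled_idx, labels, tau=1.0, gamma=0.1,
                     n_samples=500, burn_in=100, rng=None):
    """W: (N,N) symmetric affinity matrix; labels in {-1,+1} for labeled_idx."""
    rng = np.random.default_rng(rng)
    N = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian
    prior_prec = L + tau**2 * np.eye(N)             # assumed prior precision

    # Gaussian conditional of the latent field u given auxiliary variables v
    # attached to the labeled nodes (selected by H).
    H = np.zeros((len(labeled_idx), N))
    H[np.arange(len(labeled_idx)), labeled_idx] = 1.0
    post_cov = np.linalg.inv(prior_prec + (H.T @ H) / gamma**2)
    chol = np.linalg.cholesky(post_cov)

    u, samples = np.zeros(N), []
    for it in range(n_samples + burn_in):
        # 1) Sample v_j ~ N(u_j, gamma^2) truncated to the side matching y_j.
        mu = u[labeled_idx]
        a = np.where(labels > 0, (0.0 - mu) / gamma, -np.inf)
        b = np.where(labels > 0, np.inf, (0.0 - mu) / gamma)
        v = truncnorm.rvs(a, b, loc=mu, scale=gamma, random_state=rng)

        # 2) Sample u | v from its Gaussian conditional.
        mean = post_cov @ (H.T @ v) / gamma**2
        u = mean + chol @ rng.standard_normal(N)
        if it >= burn_in:
            samples.append(u.copy())

    samples = np.array(samples)
    mean_u = samples.mean(axis=0)
    # Node-wise confidence: fraction of posterior samples agreeing with the
    # sign of the posterior mean (a simple stand-in for the paper's score).
    confidence = (np.sign(samples) == np.sign(mean_u)).mean(axis=0)
    return np.sign(mean_u), confidence
```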
The most common sensor arrangement for 360-degree panoramic video cameras is a radial design in which a number of outward-looking sensors are arranged like spokes on a wheel. The cameras are typically spaced at approximately human interocular distance with high overlap. We present a novel method of leveraging small form-factor camera units arranged in stereo pairs and interleaved to achieve a fully panoramic view with fully parallel sensor pairs. This arrangement requires less keystone correction to obtain depth information, and the discontinuity between images that must be stitched together is smaller than in the radial design. The primary benefit of this arrangement is the small form factor of the system, with the large number of sensors enabling high resolving power. We highlight the mechanical considerations, system performance, and software capabilities of two manufactured and tested imaging units: one based on Raspberry Pi cameras and a second based on a 16-camera system leveraging 8 pairs of 13-megapixel AR1335 cell phone sensors. In addition, several variations on the conceptual design were simulated with synthetic projections to compare the stitching difficulty of the rendered scenes.
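To make the keystone-correction argument concrete, the sketch below compares the relative yaw between the two sensors of a depth pair in a radial design (adjacent spokes used as a pair) against the fully parallel pairs described above, together with the stitching overlap between views; the sensor counts and field of view are illustrative numbers, not measurements from the built systems.

```python
def radial_pair_convergence_deg(num_sensors: int) -> float:
    """Relative yaw between adjacent outward-looking sensors on a radial ring."""
    return 360.0 / num_sensors

def stitch_overlap_deg(num_views: int, hfov_deg: float) -> float:
    """Angular overlap between adjacent views covering a full 360-degree panorama."""
    return hfov_deg - 360.0 / num_views

# Illustrative geometry: 16 sensors on a radial ring vs. 8 interleaved parallel
# stereo pairs, each view with an assumed 70-degree horizontal field of view.
print(radial_pair_convergence_deg(16))  # 22.5 deg toe-out if adjacent radial
                                        # sensors form a depth pair; parallel
                                        # pairs have 0 deg relative yaw and need
                                        # far less keystone correction.
print(stitch_overlap_deg(8, 70.0))      # 25.0 deg overlap between the 8 pair views
```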