Image Processing: Algorithms and Systems continues the tradition of the past conference Nonlinear Image Processing and Pattern Analysis in exploring new image processing algorithms. It also echoes the growing call for integrating theoretical research on image processing algorithms with the more applied research on image processing systems. Specifically, the conference aims to highlight the importance of the interaction between transform-, model-, and learning-based approaches for creating effective algorithms and building modern imaging systems for new and emerging applications.
Estimating the pose from fiducial markers is a widely researched topic with practical importance for computer vision, robotics, and photogrammetry. In this paper, we aim to quantify the accuracy of pose estimation in real-world scenarios. More specifically, we investigate six factors that impact the accuracy of pose estimation: number of points, depth offset, planar offset, manufacturing error, detection error, and constellation size. Their influence is quantified for four non-iterative pose estimation algorithms, employing the direct linear transform, direct least squares, robust perspective-n-point, and infinitesimal planar pose estimation, respectively. We present empirical results that are instructive for selecting a well-performing pose estimation method and for mitigating the factors that cause errors and degrade the rotational and translational accuracy of pose estimation.
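A minimal sketch (not from the paper) of how such a comparison can be set up with OpenCV's non-iterative PnP solvers: a synthetic planar marker is projected with a known pose, perturbed by a simulated detection error, and solved with the IPPE and DLS flags. The marker size, intrinsics, and noise level below are illustrative assumptions.

```python
# Sketch: comparing non-iterative PnP solvers on a synthetic planar fiducial.
import cv2
import numpy as np

# Planar square marker, 40 mm side, four corner points (the "constellation")
obj_pts = np.array([[-0.02, -0.02, 0.0],
                    [ 0.02, -0.02, 0.0],
                    [ 0.02,  0.02, 0.0],
                    [-0.02,  0.02, 0.0]], dtype=np.float64)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])            # assumed pinhole intrinsics
dist = np.zeros(5)                          # no lens distortion

# Ground-truth pose used to project the marker into the image
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.0, 0.0, 0.5])         # 0.5 m depth offset
img_pts, _ = cv2.projectPoints(obj_pts, rvec_gt, tvec_gt, K, dist)
img_pts += np.random.normal(0, 0.5, img_pts.shape)   # simulated detection error (px)

for name, flag in [("IPPE", cv2.SOLVEPNP_IPPE), ("DLS", cv2.SOLVEPNP_DLS)]:
    ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, dist, flags=flag)
    t_err = np.linalg.norm(tvec.ravel() - tvec_gt)
    print(f"{name}: translation error = {t_err * 1000:.2f} mm")
```

Repeating such a trial over many noise draws and marker geometries is one way to quantify how each factor degrades translational and rotational accuracy.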
In this work, we study the effectiveness of prior re-sampling approaches for imbalanced image classification. We propose to investigate inter-class and within-class characteristics and to conduct class-specific extrapolation re-sampling for optimal imbalanced learning.
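As a rough illustration only, and under the assumption that extrapolation re-sampling means synthesizing minority-class samples by pushing existing samples away from the class centroid (the paper's exact formulation may differ), one plausible within-class scheme looks like this:

```python
# Sketch of one possible within-class extrapolation re-sampling scheme.
import numpy as np

def extrapolate_resample(X_min, n_new, alpha=0.3, seed=0):
    """Generate n_new synthetic minority samples by extrapolating from the centroid."""
    rng = np.random.default_rng(seed)
    centroid = X_min.mean(axis=0)
    idx = rng.integers(0, len(X_min), size=n_new)
    lam = rng.uniform(0.0, alpha, size=(n_new, 1))
    # Move each chosen sample further from the class centroid (extrapolation)
    return X_min[idx] + lam * (X_min[idx] - centroid)

X_minority = np.random.randn(50, 128)            # e.g. 128-D image features
X_synthetic = extrapolate_resample(X_minority, n_new=200)
X_balanced = np.vstack([X_minority, X_synthetic])
print(X_balanced.shape)                          # (250, 128)
```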
Changes in retinal structure have been documented in patients with chronic schizophrenia using optical coherence tomography (OCT) metrics, but these studies were limited by the measurements provided by OCT machines. In this paper, we leverage machine learning and deep learning techniques to analyze OCT images and train algorithms to differentiate between schizophrenia patients and healthy controls. To address data scarcity, we use intermediate representations extracted from ReLayNet, a pretrained convolutional neural network designed to segment macular layers from OCT images. Experimental results show that classifiers trained on deep features and OCT-machine-provided metrics can reliably distinguish between chronic schizophrenia patients and an age-matched control population. Further, we present what is, to our knowledge, the first reported empirical evidence that separation can be achieved between first-episode schizophrenia patients and their age-matched control group by leveraging deep image features extracted from OCT imagery.
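A minimal sketch of the general recipe, not the paper's exact pipeline: intermediate activations of a pretrained segmentation network are captured with a forward hook, pooled into fixed-length descriptors, and fed to a conventional classifier. The checkpoint path and layer name below are placeholders, and the feature matrix is a dummy stand-in.

```python
# Sketch: deep features from a pretrained network + a shallow classifier.
import torch
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

feats = {}

def hook(module, inp, out):
    # Global-average-pool the feature map into a fixed-length descriptor
    feats["z"] = out.mean(dim=(2, 3)).detach().cpu().numpy()

# relaynet = torch.load("relaynet_pretrained.pth")   # hypothetical checkpoint path
# relaynet.encoder3.register_forward_hook(hook)      # placeholder layer name
# with torch.no_grad():
#     relaynet(oct_bscan)                            # (1, 1, H, W) OCT B-scan

# With descriptors X (n_subjects, d) and labels y (0 = control, 1 = patient):
X, y = np.random.randn(60, 64), np.random.randint(0, 2, 60)   # dummy stand-ins
print(cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean())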
Synthetic aperture radar (SAR) images have found numerous applications. However, further analysis of SAR images, including interpretation, classification, and segmentation, is an extremely challenging task due to the presence of intense speckle noise. Therefore, image despeckling is one of the main stages of preliminary SAR data processing. Over the past decades, a large number of despeckling techniques have been proposed, ranging from local-statistics filters to deep-learning-based ones. In this study, we analyze one of the best-known and most widely used local-statistics filters, the Frost filter. The despeckling efficiency of the Frost filter depends significantly on the sliding window size and the tuning (also called damping) factor. Here, we present a method for selecting optimal Frost filter parameters for a given image based on predicting despeckling efficiency. The prediction is carried out using a set of statistical and spectral input parameters and a multilayer neural network. It is shown that such a prediction can be performed before applying despeckling, with high accuracy, and faster than the despeckling itself. Both simulated speckled images and real-life Sentinel-1 SAR images have been used for an extensive evaluation of the proposed method.
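For reference, a straightforward Frost filter exposing the two parameters the paper tunes (window size and damping factor) is sketched below; the neural-network efficiency predictor itself is not reproduced, and the toy speckled image is an assumption.

```python
# Sketch: a basic Frost despeckling filter with window-size and damping parameters.
import numpy as np

def frost_filter(img, win=5, damping=2.0):
    """Despeckle `img` with a Frost filter of window `win` and the given damping factor."""
    pad = win // 2
    padded = np.pad(img.astype(np.float64), pad, mode="reflect")
    yy, xx = np.mgrid[-pad:pad + 1, -pad:pad + 1]
    dist = np.sqrt(yy**2 + xx**2)                   # distance to window center
    out = np.empty_like(img, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            w = padded[i:i + win, j:j + win]
            m, v = w.mean(), w.var()
            cv_sq = v / (m**2 + 1e-12)              # squared local coefficient of variation
            k = np.exp(-damping * cv_sq * dist)     # exponentially decaying weights
            out[i, j] = (k * w).sum() / k.sum()
    return out

speckled = np.random.gamma(4.0, 25.0, (64, 64))     # toy multiplicative-noise image
print(frost_filter(speckled, win=7, damping=1.5).shape)
```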
Simulation-based training is used to improve learners' skills and enhance their knowledge. Recently, virtual reality technology has been exploited in simulation mainly for training purposes, to enable learning while performing activities that are dangerous or even impossible to reproduce in the real world. In this context, we present simulation-based firefighter training for an earthquake scenario, developed in collaboration with the national Italian firefighter rescue units (the Italian "Istituto Superiore Antincendi"). The proposed training model is based on a virtual reality solution and incorporates a novel interaction and game model developed specifically for training first responders. The simulated environment is delivered through a head-mounted display, in which the learner interacts with objects and performs specific tasks. The tests performed show that the use of virtual reality can improve the effectiveness of training. Indeed, trainees show a better perception of the scene, which is reflected in a faster response in the real situation. The proposed training system can help firefighters by providing adequate information on how to deal with risks.
Contrast is an important perceptual attribute of image quality. In medical images, poor quality, specifically low contrast, inhibits precise interpretation of the image. Contrast enhancement is therefore applied not merely to improve the visual quality of images but also to facilitate further processing tasks. In this paper, we propose a contrast enhancement approach based on cross-modal learning. A Cycle-GAN (Generative Adversarial Network) is used for this purpose, with a UNet augmented with global features acting as the generator. In addition, individual batch normalization is used so that each generator adapts specifically to its own input distribution. The proposed method accepts low-contrast T2-weighted (T2-w) magnetic resonance images (MRI) and uses the corresponding high-contrast T1-w MRI to learn global contrast characteristics. The experiments were conducted on the publicly available IXI dataset. Comparison with recent contrast enhancement methods and quantitative assessment using two prevalent metrics, FSIM and BRISQUE, validate the superior performance of the proposed method.
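One way to realize "individual batch normalization" is to keep separate normalization statistics per input domain inside a shared generator block; the sketch below illustrates this idea under that assumption and does not reproduce the paper's full Cycle-GAN/UNet configuration.

```python
# Sketch: domain-specific batch normalization inside a UNet-style block.
import torch
import torch.nn as nn

class DomainBatchNorm2d(nn.Module):
    """Keeps separate BN statistics and affine parameters for each input domain."""
    def __init__(self, channels, n_domains=2):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(n_domains))

    def forward(self, x, domain):
        return self.bns[domain](x)

class TinyBlock(nn.Module):
    """One conv block of a UNet-style generator using domain-specific BN."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)
        self.bn = DomainBatchNorm2d(c_out)

    def forward(self, x, domain):
        return torch.relu(self.bn(self.conv(x), domain))

block = TinyBlock(1, 16)
t2 = torch.randn(4, 1, 64, 64)        # low-contrast T2-w batch -> domain 0
print(block(t2, domain=0).shape)
```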
Circle detection in edge images can involve significant time and memory requirements, particularly if the circles have unknown radii over a large range. We describe an algorithm that processes an edge image in a single linear pass, compiling statistics of connected components that can be used by two distinct least-squares methods. Because the compiled statistics are all sums, these components can then be quickly merged without any further examination of image pixels. Fusing multiple circle detectors allows more powerful circle detection. The resulting algorithm is of linear complexity in the number of image pixels and of quadratic complexity in a much smaller number of cluster statistics.
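To make the "sums are mergeable" idea concrete, here is a minimal sketch (not the paper's full pipeline) of an algebraic, Kasa-style least-squares circle fit driven entirely by accumulated sums, so the statistics of two components can be combined by simple addition before solving.

```python
# Sketch: algebraic least-squares circle fit from mergeable sufficient statistics.
import numpy as np

def accumulate(xs, ys):
    """Return the sufficient statistics (all plain sums) for a point set."""
    z = xs**2 + ys**2
    return np.array([len(xs), xs.sum(), ys.sum(), (xs * xs).sum(), (xs * ys).sum(),
                     (ys * ys).sum(), z.sum(), (xs * z).sum(), (ys * z).sum()])

def fit_circle(s):
    """Solve the Kasa normal equations from the accumulated sums."""
    n, Sx, Sy, Sxx, Sxy, Syy, Sz, Sxz, Syz = s
    A = np.array([[Sxx, Sxy, Sx], [Sxy, Syy, Sy], [Sx, Sy, n]])
    D, E, F = np.linalg.solve(A, -np.array([Sxz, Syz, Sz]))
    cx, cy = -D / 2, -E / 2
    return cx, cy, np.sqrt(cx**2 + cy**2 - F)

theta = np.linspace(0, 2 * np.pi, 200)
xs, ys = 30 + 12 * np.cos(theta), 40 + 12 * np.sin(theta)   # circle r=12 at (30, 40)
s1 = accumulate(xs[:100], ys[:100])      # statistics of one "component"
s2 = accumulate(xs[100:], ys[100:])      # statistics of another
print(fit_circle(s1 + s2))               # merging components = adding their sums
```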
This research explores a fresh approach to the selection and weighting of classical image features for infrared object detection and rejection of target-like clutter. Traditional statistical techniques are used to calculate individual features, while modern supervised machine learning techniques are used to rank-order the predictive value of each feature. This paper describes the use of Decision Trees to determine which features have the highest value in predicting the correct binary target/non-target class. This work is unique in that it focuses on infrared imagery and exploits interpretable machine learning techniques for the selection of hand-crafted features integrated into a pre-screening algorithm.
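A minimal sketch of this style of feature ranking, using scikit-learn's decision-tree importances; the feature names and the random data below are illustrative stand-ins, not the paper's feature set.

```python
# Sketch: ranking hand-crafted features by decision-tree importance.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

feature_names = ["mean_intensity", "contrast", "edge_density", "aspect_ratio"]
X = np.random.randn(500, len(feature_names))        # stand-in chip features
y = np.random.randint(0, 2, 500)                    # 1 = target, 0 = clutter

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
ranking = sorted(zip(tree.feature_importances_, feature_names), reverse=True)
for importance, name in ranking:
    print(f"{name}: {importance:.3f}")
```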
Visual quality is important for remote sensing data presented as grayscale, color, or pseudo-color images. Although several visual quality metrics (VQMs) have been used to characterize such data, only a limited analysis of their applicability in remote sensing applications has been done so far. In this paper, we study correlations within a wide set of VQMs for color images with distortion types typical of remote sensing. It is demonstrated that many metrics have very high Spearman rank order correlation coefficients (SROCC) with each other, e.g., PSNR-based and SSIM-based metrics. Meanwhile, there are also metrics that are practically uncorrelated with the others. A detailed analysis of VQMs that have the largest SROCC values and belong to different groups is presented in this paper.
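A minimal sketch of the pairwise SROCC computation, assuming per-image metric scores have already been computed and stored column-wise; the metric subset and random scores below are placeholders.

```python
# Sketch: pairwise Spearman rank order correlation between metric score columns.
import numpy as np
from scipy.stats import spearmanr

metric_names = ["PSNR", "PSNR-HVS", "SSIM", "MS-SSIM"]   # illustrative subset
scores = np.random.rand(100, len(metric_names))          # stand-in: 100 distorted images

rho, _ = spearmanr(scores)               # full pairwise SROCC matrix (columns = metrics)
for i in range(len(metric_names)):
    for j in range(i + 1, len(metric_names)):
        print(f"SROCC({metric_names[i]}, {metric_names[j]}) = {rho[i, j]:.2f}")
```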