Pages A10-1 - A10-6,  © Society for Imaging Science and Technology 2021
Digital Library: EI
Published Online: January  2021
Pages 226-1 - 226-6,  © Society for Imaging Science and Technology 2021
Volume 33
Issue 10

Finding a point in the intersection of two closed convex sets is a common problem in image processing and other areas. Projections onto convex sets (POCS) is a standard algorithm for finding such a point. Dykstra's projection algorithm is a well-known alternative that finds the point in the intersection closest to a given point. A lesser-known alternative is the alternating direction method of multipliers (ADMM), which can be used for both purposes. In this paper we discuss the differences in the convergence of these algorithms in image processing problems. ADMM applied to finding an arbitrary point in the intersection converges much faster than POCS and than any algorithm for finding the nearest point in the intersection.
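The contrast between finding *some* feasible point (POCS) and the *nearest* feasible point (Dykstra) can be illustrated on a toy problem. The two convex sets below (the nonnegative orthant and a halfspace) are illustrative choices, not the constraints studied in the paper:

```python
import numpy as np

def proj_nonneg(x):
    # Projection onto the convex set A = {x : x >= 0}
    return np.maximum(x, 0.0)

def proj_halfspace(x, a, b):
    # Projection onto the halfspace B = {x : <a, x> <= b}
    viol = a @ x - b
    if viol <= 0:
        return x
    return x - (viol / (a @ a)) * a

def pocs(x0, pA, pB, iters=200):
    # Alternating projections: converges to *some* point of A ∩ B
    x = x0
    for _ in range(iters):
        x = pB(pA(x))
    return x

def dykstra(x0, pA, pB, iters=200):
    # Dykstra's algorithm: converges to the point of A ∩ B *nearest* x0
    x = x0
    p = np.zeros_like(x0)
    q = np.zeros_like(x0)
    for _ in range(iters):
        y = pA(x + p)
        p = x + p - y          # correction term for set A
        x = pB(y + q)
        q = y + q - x          # correction term for set B
    return x

a = np.ones(3)
x0 = np.array([2.0, -1.0, 0.5])
pA = proj_nonneg
pB = lambda x: proj_halfspace(x, a, 1.0)
x_pocs = pocs(x0, pA, pB)
x_dyk = dykstra(x0, pA, pB)
```

Both routines return a point satisfying both constraints; only Dykstra's is guaranteed closest to the starting point.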

Pages 234-1 - 234-8,  © Society for Imaging Science and Technology 2021

Vehicle re-identification (re-ID) is based on identity matching of vehicles across non-overlapping camera views. Recently, research on vehicle re-ID has attracted increasing attention, mainly due to its prominent industrial applications, such as post-crime analysis, traffic flow analysis, and wide-area vehicle tracking. Despite this interest, the problem remains challenging. One of the most significant difficulties of vehicle re-ID is the large viewpoint variation caused by non-standardized camera placements. In this study, to improve re-ID robustness against viewpoint variations while preserving algorithm efficiency, we exploit vehicle orientation information. First, we analyze and benchmark various deep learning architectures in terms of performance, memory use, and computational cost for orientation classification. Second, the extracted orientation information is used to improve the vehicle re-ID task: we propose a viewpoint-aware multi-branch network that improves vehicle re-ID performance without increasing the forward inference time. Third, we introduce a viewpoint-aware mini-batching approach that yields improved training and higher re-ID performance. The experiments show an increase of 4.0% mAP and 4.4% rank-1 score on the popular VeRi dataset with the proposed mini-batching strategy, and overall an increase of 2.2% mAP and 3.8% rank-1 score compared to the ResNet-50 baseline.
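One way to realize viewpoint-aware mini-batching is a PK-style sampler that, for each sampled identity, prefers instances from distinct orientation bins. The sketch below is a hypothetical strategy; the abstract does not spell out the exact scheme, and all names are illustrative:

```python
import random
from collections import defaultdict

def viewpoint_aware_batch(samples, ids_per_batch=4, instances_per_id=4, seed=0):
    """Sample a P x K mini-batch, preferring distinct orientation bins per identity.

    samples: list of (vehicle_id, orientation_bin, image_path) tuples.
    Returns a list of P*K sample tuples.
    """
    rng = random.Random(seed)
    by_id = defaultdict(lambda: defaultdict(list))
    for vid, ori, img in samples:
        by_id[vid][ori].append((vid, ori, img))
    batch = []
    for vid in rng.sample(sorted(by_id), ids_per_batch):
        bins = list(by_id[vid])
        rng.shuffle(bins)
        picked = []
        # First pass: one sample from each distinct orientation bin.
        for b in bins[:instances_per_id]:
            picked.append(rng.choice(by_id[vid][b]))
        # Top up from any bin if the identity has fewer bins than needed.
        pool = [s for b in bins for s in by_id[vid][b]]
        while len(picked) < instances_per_id:
            picked.append(rng.choice(pool))
        batch.extend(picked)
    return batch
```

Feeding a metric-learning loss batches that span viewpoints per identity is a plausible way to make the learned embedding orientation-robust.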

Pages 235-1 - 235-7,  © Society for Imaging Science and Technology 2021

In this paper, text recognition on variably curved cardboard pharmaceutical packages is studied from the photometric stereo imaging point of view, with a focus on developing a method for binarizing the expiration date and batch code texts. Adaptive filtering, specifically the Wiener filter, is used together with a haze-removal algorithm and fusion of LoG edge-detected sub-images, resulting in an Otsu-thresholded binary image of the expiration date and batch code texts for further analysis. The results presented appear promising for text binarization. Successful binarization is crucial for text character segmentation and subsequent automatic reading. Furthermore, some new ideas are presented that will be used in our future research work.
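Two classical ingredients of such a pipeline, a local adaptive (wiener2-style) filter and Otsu's threshold, can be sketched on a synthetic "text on cardboard" image. This is a sketch only: the haze-removal and LoG-fusion stages are omitted, and the synthetic image is an assumption:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def wiener2(img, ksize=3, noise_var=None):
    """Local adaptive Wiener filter (in the spirit of MATLAB's wiener2)."""
    pad = ksize // 2
    p = np.pad(img, pad, mode="reflect")
    win = sliding_window_view(p, (ksize, ksize))
    mu = win.mean(axis=(-1, -2))           # local mean
    var = win.var(axis=(-1, -2))           # local variance
    if noise_var is None:
        noise_var = var.mean()             # common heuristic noise estimate
    gain = np.maximum(var - noise_var, 0.0) / np.maximum(var, 1e-12)
    return mu + gain * (img - mu)

def otsu_threshold(img, bins=256):
    """Threshold maximizing the between-class variance (Otsu's criterion)."""
    hist, edges = np.histogram(img.ravel(), bins=bins)
    hist = hist.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w = np.cumsum(hist)                    # cumulative class-0 weight
    m = np.cumsum(hist * centers)          # cumulative class-0 mass
    total_w, total_m = w[-1], m[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (total_m * w - m * total_w) ** 2 / (w * (total_w - w))
    k = np.nanargmax(sigma_b)
    return (edges[k] + edges[k + 1]) / 2.0
```

Denoising before thresholding keeps noise from bleeding across Otsu's two-class split, which is what makes the subsequent character segmentation workable.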

Pages 237-1 - 237-7,  © Society for Imaging Science and Technology 2021

Image denoising is a classical preprocessing stage used to enhance images. However, it is well known that in many practical cases different image denoising methods produce images with inappropriate visual quality, which makes the application of image denoising useless. Because of this, it is desirable to detect such cases in advance and decide whether image denoising (filtering) is expedient. This problem is analyzed in this paper for the well-known BM3D denoiser. We propose a decision-making algorithm for image denoising expedience for images corrupted by additive white Gaussian noise (AWGN). An algorithm for predicting subjective visual quality scores of denoised images using a trained artificial neural network is proposed as well. It is shown that this prediction is fast and accurate.
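A much simpler stand-in for the trained predictor illustrates the decision itself: estimate the AWGN level blindly, then declare denoising expedient only when noise is non-negligible relative to image content. The difference-based estimator and the thresholds below are illustrative assumptions, not the paper's method:

```python
import numpy as np

def estimate_noise_sigma(img):
    # Robust AWGN std estimate from half-resolution diagonal differences
    # (a cheap stand-in for the wavelet-domain MAD estimator).
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    h = min(a.shape[0], b.shape[0], c.shape[0], d.shape[0])
    w = min(a.shape[1], b.shape[1], c.shape[1], d.shape[1])
    diff = (a[:h, :w] - b[:h, :w] - c[:h, :w] + d[:h, :w]) / 2.0
    return np.median(np.abs(diff)) / 0.6745

def denoising_expedient(img, sigma_min=2.0, snr_max=30.0):
    """Heuristic stand-in for the paper's trained predictor: denoise only
    if the estimated noise is non-negligible relative to image content."""
    sigma = estimate_noise_sigma(img)
    if sigma < sigma_min:          # noise too weak to matter
        return False
    signal_std = img.std()
    snr_db = 20 * np.log10(max(signal_std, 1e-9) / sigma)
    return snr_db < snr_max        # denoise unless SNR is already high
```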

Pages 238-1 - 238-7,  © Society for Imaging Science and Technology 2021

Similarity search in images has become a typical operation in many applications. The presence of noise in images greatly affects the correctness of detecting similar image blocks, reducing the efficiency of image processing methods such as non-local denoising. In this paper, we study the noise immunity of various distance measures (similarity metrics), taking into account the wide variety of information content in real-life images and the variations in noise type and intensity. We propose a set of test data and obtain preliminary results for several typical cases of image and noise properties. Recommendations for metric and threshold selection are given. A fast implementation of the proposed benchmark is realized using CUDA.
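A toy version of such a noise-immunity probe compares common metrics on a synthetic patch, a distractor, and AWGN of increasing strength. The setup is illustrative, not the paper's test data:

```python
import numpy as np

def l1(a, b):  return np.mean(np.abs(a - b))
def l2(a, b):  return np.sqrt(np.mean((a - b) ** 2))

def ncc(a, b):
    # Normalized cross-correlation, mapped to a distance in [0, 2]
    a0, b0 = a - a.mean(), b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
    return 1.0 - (a0 * b0).sum() / max(denom, 1e-12)

def match_rate(metric, patch, distractor, sigma, trials=200, seed=0):
    """Fraction of trials in which a noisy copy of `patch` is judged
    closer to `patch` than `distractor` is - a toy noise-immunity probe."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        noisy = patch + rng.normal(0, sigma, patch.shape)
        if metric(noisy, patch) < metric(distractor, patch):
            hits += 1
    return hits / trials
```

Sweeping `sigma` for each metric traces out how quickly its matching decisions degrade with noise, which is the quantity a benchmark of this kind reports.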

Pages 239-1 - 239-7,  © Society for Imaging Science and Technology 2021

Skeleton-based action recognition plays a critical role in computer vision research, and its applications have been widely deployed in many areas. Currently, benefiting from graph convolutional networks (GCNs), the performance of this task has improved dramatically due to the powerful ability of GCNs to model non-Euclidean data. However, most of these works are designed for clean skeleton data, while in reality such data are usually noisy, since they are mostly obtained with a depth camera or even estimated from an RGB camera rather than recorded by a high-quality but extremely costly motion capture (MoCap) [1] system. Under this circumstance, we propose a novel GCN framework with adversarial training to deal with noisy skeleton data. Guided by the clean data at the semantic level, a reliable graph embedding can be extracted for noisy skeleton data. In addition, a discriminator is introduced so that the feature representation is further improved through adversarial learning. We empirically evaluate the proposed framework on the two largest current skeleton-based action recognition datasets. Comparison results show the superiority of our method over state-of-the-art methods under noisy settings.
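The "semantic-level guidance by clean data" can be caricatured as learning a map from noisy embeddings to clean ones. The ridge-regression sketch below is a deliberately linear stand-in: the GCN and the adversarial discriminator are omitted, and all names are hypothetical:

```python
import numpy as np

def fit_denoising_projection(noisy_feats, clean_feats, lam=1e-3):
    """Ridge-regression map from noisy to clean graph embeddings - a linear
    stand-in for guidance by clean data (the adversarial branch is omitted).

    noisy_feats, clean_feats: (n_samples, dim) paired feature matrices.
    Returns W such that noisy_feats @ W approximates clean_feats.
    """
    X, Y = noisy_feats, clean_feats
    d = X.shape[1]
    # Closed-form ridge solution: (X^T X + lam I)^-1 X^T Y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
```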

Pages 240-1 - 240-7,  © Society for Imaging Science and Technology 2021

It has been rigorously demonstrated that an end-to-end (E2E) differentiable formulation of a deep neural network can turn a complex recognition problem into a unified optimization task solvable by a gradient descent method. Although E2E network optimization yields a powerful fitting ability, the joint optimization of layers is known to potentially bring situations where layers co-adapt to one another in a complex way that harms generalization ability. This work numerically evaluates the generalization ability of a particular non-E2E network optimization approach known as FOCA (Feature-extractor Optimization through Classifier Anonymization), which helps avoid such complex co-adaptation, with careful hyperparameter tuning. We present intriguing empirical results in which the non-E2E trained models consistently outperform the corresponding E2E trained models on three image-classification datasets. We further show that E2E network fine-tuning, applied after the feature-extractor optimization by FOCA and the subsequent classifier optimization with the fixed feature extractor, gives no improvement in test accuracy. The source code is available at https://github.com/DensoITLab/FOCA-v1.

Pages 241-1 - 241-5,  © Society for Imaging Science and Technology 2021

This paper proposes a novel method to correct saturated pixels in images. The method is based on the YCbCr color space and separately corrects the chrominance and the luminance of saturated pixels. In this algorithm, the saturated image is processed along the scan line, which benefits the hardware implementation while maintaining a good correction effect. Joint simulation with MATLAB and ModelSim shows that the hardware algorithm can use fewer resources to achieve fast correction. The Altera DE4 development platform is used for the hardware implementation. The results show that high-speed image and video processing on an FPGA is feasible and efficient, and can be performed frame by frame for high-definition video, giving the method broad practical application prospects.
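A software caricature of scan-line processing in YCbCr illustrates why this structure maps well to hardware: each pixel needs only state carried from its left neighbor. The correction rule below (saturated pixels inherit chroma from the nearest unsaturated pixel to their left; measured luma is kept) is a hypothetical variant, not the paper's exact algorithm:

```python
import numpy as np

# BT.601 full-range RGB -> YCbCr conversion matrix (rows: Y, Cb, Cr)
RGB2YCC = np.array([[ 0.299,   0.587,   0.114 ],
                    [-0.1687, -0.3313,  0.5   ],
                    [ 0.5,    -0.4187, -0.0813]])

def correct_saturation(rgb, thresh=250):
    """Scan-line saturation correction sketch (hypothetical variant):
    saturated pixels inherit chroma from the nearest unsaturated pixel
    to their left; luma is kept as measured."""
    ycc = rgb.astype(float) @ RGB2YCC.T
    sat = (rgb >= thresh).any(axis=2)          # any clipped channel
    out = ycc.copy()
    for row in range(rgb.shape[0]):
        last_cb, last_cr = 0.0, 0.0            # neutral gray if row starts saturated
        for col in range(rgb.shape[1]):
            if sat[row, col]:
                out[row, col, 1] = last_cb     # propagate chroma along scan line
                out[row, col, 2] = last_cr
            else:
                last_cb, last_cr = out[row, col, 1], out[row, col, 2]
    return out
```

Because the inner loop depends only on the previous pixel's chroma registers, an FPGA can evaluate it in a single pass per line at pixel-clock rate.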

Pages 246-1 - 246-6,  © Society for Imaging Science and Technology 2021

Accurate diagnosis of microcalcification (MC) lesions in mammograms as benign or malignant is a challenging clinical task. In this study, we investigate the potential discriminative power of deep learning features in MC lesion diagnosis. We consider two types of deep learning networks: a convolutional neural network developed for MC detection and a denoising autoencoder network. In the experiments, we evaluated both the separability between malignant and benign lesions and the classification performance of image features from these two networks using Fisher's linear discriminant analysis on a set of mammographic images. The results demonstrate that the deep learning features from the MC detection network are the most discriminative for classification of MC lesions, compared with both the features from the autoencoder network and traditional handcrafted texture features.
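The separability comparison rests on Fisher's criterion, which can be scored in closed form for two feature sets. A minimal sketch (with synthetic Gaussian "features" in the usage; the real inputs would be the network activations):

```python
import numpy as np

def fisher_score(feats_a, feats_b):
    """Fisher criterion J(w) at the closed-form optimum w = Sw^-1 (mu_a - mu_b).

    feats_a, feats_b: (n_samples, dim) feature matrices for the two classes.
    Larger J means more linearly separable features.
    """
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    # Within-class scatter (sum of the two class covariances)
    Sw = np.cov(feats_a, rowvar=False) + np.cov(feats_b, rowvar=False)
    Sw += 1e-6 * np.eye(Sw.shape[0])      # regularize for numerical stability
    d = mu_a - mu_b
    w = np.linalg.solve(Sw, d)
    return float(d @ w)                   # J(w*) = d^T Sw^-1 d
```

Computing this score for each candidate feature set (detection-network features, autoencoder features, handcrafted textures) gives the kind of separability ranking the abstract describes.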

