In this paper, we present a novel technique for customizing Gabor texture features by leveraging deep neural networks. Our method uses a convolutional neural network (CNN) to refactor traditional, hand-designed filters on specific datasets. The refactored filters can be used in an off-the-shelf manner at the same computational cost but with significantly improved accuracy for material recognition. We demonstrate the effectiveness of our approach by reporting a gain in discrimination accuracy on several material datasets. Our technique is particularly appealing in situations where using the entire CNN would be inadequate, such as analyzing non-square images or performing segmentation tasks. Overall, our approach provides a powerful tool for improving the accuracy of material recognition while retaining the advantages of handcrafted filters.
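As an illustration of the general recipe (not the authors' exact pipeline), the sketch below builds a classic hand-designed Gabor bank with OpenCV, loads it into a convolutional layer, and exposes the kernels for fine-tuning on a target dataset; all filter parameters and sizes here are illustrative assumptions.

```python
# Minimal sketch: initialize a conv layer with a hand-designed Gabor bank,
# then fine-tune ("refactor") the kernels on a specific dataset.
import cv2
import numpy as np
import torch
import torch.nn as nn

def gabor_bank(ksize=11, n_orientations=4):
    """Classic hand-designed Gabor filters at several orientations."""
    kernels = []
    for i in range(n_orientations):
        theta = np.pi * i / n_orientations
        k = cv2.getGaborKernel((ksize, ksize), sigma=3.0, theta=theta,
                               lambd=6.0, gamma=0.5, psi=0.0)
        kernels.append(k.astype(np.float32))
    return np.stack(kernels)  # shape: (n_orientations, ksize, ksize)

bank = gabor_bank()
conv = nn.Conv2d(1, bank.shape[0], kernel_size=11, padding=5, bias=False)
with torch.no_grad():
    conv.weight.copy_(torch.from_numpy(bank).unsqueeze(1))

# Fine-tuning on a material dataset adapts the kernels; afterwards
# conv.weight can be exported and applied like any fixed Gabor bank.
optimizer = torch.optim.Adam(conv.parameters(), lr=1e-3)
```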
Scientific user facilities present a unique set of challenges for image processing due to the large volume of data generated by experiments and simulations. Furthermore, developing and implementing algorithms for real-time processing and analysis, while correcting for artifacts or distortions in images, remains a complex task given the computational requirements of the processing algorithms. In a collaborative effort across multiple Department of Energy national laboratories, the MLExchange project is focused on addressing these challenges. MLExchange is a machine learning framework that deploys interactive web interfaces to enhance and accelerate data analysis. The platform allows users to easily upload, visualize, and label data and to train networks; the resulting models can be deployed on real data, and both results and models can be shared among scientists. The MLExchange web-based application for image segmentation supports training, testing, and evaluating multiple machine learning models on hand-labeled tomography data. This environment provides users with an intuitive interface for segmenting images using a variety of machine learning algorithms and deep neural networks. Additionally, these tools have the potential to overcome limitations of traditional image segmentation techniques, particularly for complex, low-contrast images.
Scale variation and high miss rates for small objects are among the challenging issues in object detection and often lead to inaccurate results. This research aims to provide an accurate detection model for crowd counting by focusing on human head detection in natural scenes drawn from the publicly available Casablanca, Hollywood-Heads, and SCUT-HEAD datasets. In this study, we fine-tuned YOLOv5, a deep convolutional neural network (CNN)-based object detection architecture, and evaluated the model using the mean average precision (mAP) score, precision, and recall. A transfer learning approach is used for fine-tuning the architecture (see the sketch below). Training on one dataset and testing on another leads to inaccurate results because the head types differ across datasets. Another main contribution of our research is therefore combining the three datasets into a single dataset that includes heads of every size: small, medium, and large. The experimental results show that this YOLOv5 architecture yields significant improvements in small-head detection in crowded scenes compared to baseline approaches such as Faster R-CNN and the VGG-16-based SSD MultiBox detector.
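A minimal sketch of the transfer learning starting point, assuming the public ultralytics/yolov5 torch.hub entry point; the image path and the dataset config named in the comments are hypothetical stand-ins for the merged Casablanca + Hollywood-Heads + SCUT-HEAD collection.

```python
import torch

# Load a pretrained YOLOv5s checkpoint as the transfer learning starting point.
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Fine-tuning on the merged head dataset would use the repository's training
# script, e.g.: python train.py --data heads.yaml --weights yolov5s.pt
# (heads.yaml is a hypothetical dataset config for the merged collection).

results = model('crowd_scene.jpg')  # hypothetical crowded-scene test image
results.print()                     # prints detections with confidences
```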
Image classification is extensively used in applications such as satellite imagery, autonomous driving, smartphones, and healthcare. Most of the images used to train classification models can be considered ideal, i.e., without any degradation due to pixel corruption in the camera sensor, blur from sudden shake, or compression of the image into a specific format. In practice, however, deployed models frequently encounter such degraded inputs. In this paper, we propose a novel CNN-based architecture for classifying degraded images based on intermediate-layer knowledge distillation and the Cutout data augmentation approach, named ILIAC. Our approach achieves 1.1% and 0.4% mean accuracy improvements across all degradation levels of JPEG compression and AWGN, respectively, compared to the current state-of-the-art approach. Furthermore, the ILIAC method is computationally efficient: it is about half the size of the previous state-of-the-art approach in terms of model parameters and GFLOPs. Additionally, we demonstrate that a larger teacher network is not necessarily needed in knowledge distillation to improve the performance and generalization of a smaller student network on degraded-image classification.
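The two ingredients named above can be illustrated with a short sketch (not the ILIAC code itself): a standard Hinton-style distillation loss that combines softened teacher targets with hard labels, and a Cutout-style augmentation that masks a random square from each input; the temperature, mixing weight, and patch size are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft targets at temperature T (KL term) plus hard cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

def cutout(images, size=8):
    """Zero out one random size x size patch per image in a (N, C, H, W) batch."""
    n, _, h, w = images.shape
    for i in range(n):
        y = torch.randint(0, h - size + 1, (1,)).item()
        x = torch.randint(0, w - size + 1, (1,)).item()
        images[i, :, y:y + size, x:x + size] = 0.0
    return images
```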
Computer vision systems are deployed in diverse real-time applications, so robustness is a major concern. The vast majority of AI-enabled systems are based on convolutional neural network models that take 3-channel RGB images as input. It has been shown that the performance of AI systems, such as those used for classification, is impacted by distortions in the input images. To date, most work has focused on distortions such as noise, blur, and compression. However, color-related changes to images can also affect performance. Therefore, the goal of this paper is to study the robustness of these models under different hue shifts.
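A minimal sketch of the kind of hue perturbation studied here, using torchvision's adjust_hue (whose hue_factor spans [-0.5, 0.5], i.e., a full rotation of the hue circle); the image path is hypothetical, and the classifier under test is whatever model is being evaluated.

```python
from PIL import Image
import torchvision.transforms.functional as TF

img = Image.open('sample.jpg')                # hypothetical test image
for hue_factor in (-0.4, -0.2, 0.0, 0.2, 0.4):
    shifted = TF.adjust_hue(img, hue_factor)  # feed to the model under test
```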
Transfer Learning is an important strategy in Computer Vision for tackling problems in the face of limited training data. However, this strategy still heavily depends on the amount of available data, which is a challenge for small heritage institutions. This paper investigates various ways of enriching smaller digital heritage collections to boost the performance of deep learning models, using the identification of musical instruments as a case study. We apply traditional data augmentation techniques as well as an external, photorealistic collection distorted by Style Transfer. Style Transfer techniques can artistically stylize images, reusing the style of any other given image, so collections can easily be augmented with artificially generated images. We introduce the distinction between inner and outer style transfer and show that artificially augmented images in both scenarios consistently improve classification results, on top of traditional data augmentation techniques. However, and counter-intuitively, such artificially generated artistic depictions of works are surprisingly hard to classify. In addition, we discuss an example of negative transfer within the non-photorealistic domain.
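For readers unfamiliar with the mechanism, the sketch below is a compact Gatys-style neural style transfer (assuming torchvision >= 0.13 for the weights API); the paper may well use a different style transfer method, so treat this only as an illustration of how a collection image's style can be imposed on an external photorealistic photo. The two image paths are hypothetical.

```python
import torch
from torchvision.models import vgg19
from torchvision import transforms
from PIL import Image

device = 'cuda' if torch.cuda.is_available() else 'cpu'
vgg = vgg19(weights='IMAGENET1K_V1').features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

prep = transforms.Compose([transforms.Resize(256),
                           transforms.CenterCrop(256),
                           transforms.ToTensor()])

def load(path):
    return prep(Image.open(path).convert('RGB')).unsqueeze(0).to(device)

def feats(x, layers=(1, 6, 11, 20, 29)):  # relu1_1 .. relu5_1 of VGG-19
    out = []
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in layers:
            out.append(x)
    return out

def gram(f):
    n, c, h, w = f.shape
    f = f.view(c, h * w)
    return f @ f.t() / (c * h * w)

content = load('photo_of_instrument.jpg')     # hypothetical external photo
style = load('painting_from_collection.jpg')  # hypothetical collection image
target = content.clone().requires_grad_(True)
opt = torch.optim.Adam([target], lr=0.02)

style_grams = [gram(f) for f in feats(style)]
content_feats = feats(content)

for step in range(200):  # pixel clamping omitted for brevity
    opt.zero_grad()
    tf = feats(target)
    style_loss = sum(torch.mean((gram(a) - b) ** 2)
                     for a, b in zip(tf, style_grams))
    content_loss = torch.mean((tf[-2] - content_feats[-2]) ** 2)
    (1e4 * style_loss + content_loss).backward()
    opt.step()
```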
Object sizes in images are diverse; therefore, capturing multi-scale context information is essential for semantic segmentation. Existing context aggregation methods such as the pyramid pooling module (PPM) and atrous spatial pyramid pooling (ASPP) employ different pooling sizes or atrous rates so that multi-scale information is captured. However, these pooling sizes and atrous rates are chosen empirically. Rethinking ASPP leads to our observation that learnable sampling locations in the convolution operation can endow the network with a learnable field-of-view, and thus the ability to capture object context information adaptively. Following this observation, we propose an adaptive context encoding (ACE) module based on the deformable convolution operation, in which the sampling locations of the convolution are learnable. Our ACE module can easily be embedded into other Convolutional Neural Networks (CNNs) for context aggregation. The effectiveness of the proposed module is demonstrated on the Pascal-Context and ADE20K datasets. Although our proposed ACE consists of only three deformable convolution blocks, it outperforms PPM and ASPP in terms of mean Intersection over Union (mIoU) on both datasets. All of the experimental studies confirm that our proposed module is effective compared to state-of-the-art methods.
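A hedged sketch of an ACE-like block: a regular convolution predicts per-location sampling offsets, and torchvision's DeformConv2d samples with the resulting learnable field-of-view. The channel count and layer arrangement are illustrative, not the paper's exact configuration; only the "three deformable convolution blocks" structure is taken from the abstract.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class AdaptiveContextBlock(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        # Two offsets (dy, dx) per kernel sampling location.
        self.offset = nn.Conv2d(channels, 2 * k * k, kernel_size=k, padding=k // 2)
        self.deform = DeformConv2d(channels, channels, kernel_size=k, padding=k // 2)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.deform(x, self.offset(x))))

# The abstract describes ACE as three deformable convolution blocks:
ace = nn.Sequential(*[AdaptiveContextBlock(256) for _ in range(3)])
out = ace(torch.randn(1, 256, 64, 64))  # same spatial size in and out
```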
Image steganography can have legitimate uses, for example, augmenting an image with a watermark for copyright reasons, but it can also be utilized for malicious purposes. We investigate the detection of malicious steganography using neural network-based classification when images are transmitted through a noisy channel. Noise makes detection harder because the classifier must not only detect perturbations in the image but also decide whether they are due to malicious steganographic modifications or to natural noise. Our results show that reliable detection is possible even for state-of-the-art steganographic algorithms that insert stego bits without affecting an image's visual quality. The detection accuracy is high (above 85%) if the payload, i.e., the amount of steganographic content in an image, exceeds a certain threshold. At the same time, noise critically affects the steganographic information being transmitted, both through desynchronization (destroying the information about which bits of the image carry the steganographic payload) and by flipping those bits themselves. This forces the adversary to use a redundant encoding with a substantial number of error-correction bits for reliable transmission, making detection feasible even for small payloads.
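A minimal sketch of a steganalysis-style classifier of the kind described: a fixed high-pass residual filter (a common steganalysis preprocessing step) followed by a small CNN that outputs a cover-vs-stego decision. The architecture, kernel, and grayscale-input assumption are all illustrative, not the paper's network.

```python
import torch
import torch.nn as nn

class StegoDetector(nn.Module):
    def __init__(self):
        super().__init__()
        # Fixed high-pass kernel: suppresses image content, exposes embedding noise.
        hp = torch.tensor([[-1.,  2., -1.],
                           [ 2., -4.,  2.],
                           [-1.,  2., -1.]]) / 4.0
        self.residual = nn.Conv2d(1, 1, 3, padding=1, bias=False)
        self.residual.weight = nn.Parameter(hp.view(1, 1, 3, 3),
                                            requires_grad=False)
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(),
            nn.AvgPool2d(2),
            nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2))  # logits: cover vs. stego

    def forward(self, x):  # x: (N, 1, H, W) grayscale batch
        return self.net(self.residual(x))
```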
The objective of this paper is to investigate dynamic computation of the Zero-Parallax-Setting (ZPS) for multi-view autostereoscopic displays in order to effectively alleviate blurry 3D vision for images with large disparity. Saliency detection techniques yield a saliency map, a topographic representation of the visually dominant locations in an image. Using a saliency map, we can predict what attracts viewers' attention, i.e., the region of interest. Recently, deep learning techniques have been applied to saliency detection, and deep learning-based salient object detection methods have the advantage of highlighting most of the salient objects. With the help of a depth map, the spatial distribution of salient objects can be computed. In this paper, we compare two dynamic ZPS techniques based on visual attention: 1) maximum saliency computed by the Graph-Based Visual Saliency (GBVS) algorithm, and 2) the spatial distribution of salient objects computed by a convolutional neural network (CNN)-based model. Experiments show that both methods help improve the 3D effect of autostereoscopic displays. Moreover, the dynamic ZPS technique based on the spatial distribution of salient objects achieves better 3D performance than the maximum-saliency method.
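A hedged sketch of the two saliency-driven ZPS choices being compared, assuming precomputed saliency and depth maps of the same resolution: place the zero-parallax plane either at the depth of the single most salient pixel (the maximum-saliency approach) or at the saliency-weighted mean depth of the detected salient objects (the spatial-distribution approach). The function and its return convention are illustrative.

```python
import numpy as np

def zps_from_saliency(depth, saliency, mode='max'):
    """Pick the depth at which to set the zero-parallax plane."""
    if mode == 'max':
        # Maximum-saliency approach (e.g. a GBVS map): depth of the
        # single most salient pixel.
        y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
        return float(depth[y, x])
    # Spatial-distribution approach (e.g. a CNN salient-object map):
    # saliency-weighted mean depth over the salient regions.
    w = saliency / (saliency.sum() + 1e-8)
    return float((depth * w).sum())
```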
In this work, we present a computer vision- and machine learning-backed autonomous drone surveillance system for protecting critical locations. The system is composed of a wide-angle, high-resolution daylight camera and a relatively narrow-angle thermal camera mounted on a rotating turret. The wide-angle daylight camera allows the detection of flying intruders as small as 20 pixels with a very low false alarm rate. The primary detection is based on the YOLO convolutional neural network (CNN) rather than conventional background subtraction algorithms, due to its lower false alarm rate. The detected flying objects are then tracked by the rotating turret and classified by the narrow-angle, zoomed thermal camera, whose classification algorithm is also CNN-based. The algorithms are trained on artificial and augmented datasets due to the scarcity of infrared videos of drones.
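For contrast with the CNN-based primary detector described above, a conventional background subtraction pipeline, the alternative the authors rejected for its false alarm rate on small flying targets, would look roughly like this in OpenCV; the video path and parameter values are hypothetical.

```python
import cv2

cap = cv2.VideoCapture('sky_camera.mp4')  # hypothetical wide-angle feed
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Small blobs in `mask` would be candidate intruders (~20 px and up);
    # clutter such as birds and clouds is what inflates the false alarm rate.
```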