IS&T | Library

FlexEye - Application Specific Quality-scalable ISP Tuning

173 78

Image-Signal-Processor (ISP)
Quality of Service (QoS)
Computer Vision
Evolutionary Algorithms
ISP Tuning

Sumbal Akram, Muhammad Abdullah, Khuzaeymah Nasir, Shaharyar Yaqub, Rehan Ahmed, Rehan Hafiz

DOI

10.2352/EI.2025.37.15.AVM-109

Volume 37

Issue 15

Abstract

As AI becomes more prevalent, edge devices face challenges due to limited resources and the high demands of deep learning (DL) applications. In such cases, quality scalability can offer significant benefits by adjusting computational load based on available resources. Traditional Image-Signal-Processor (ISP) tuning methods prioritize maximizing intelligence performance, such as classification accuracy, while neglecting critical system constraints like latency and power dissipation. To address this gap, we introduce FlexEye, an application-specific, quality-scalable ISP tuning framework that leverages ISP parameters as a control knob for quality of service (QoS), enabling trade-off between quality and performance. Experimental results demonstrate up to 6% improvement in Object Detection accuracy and a 22.5% reduction in ISP latency compared to state of the art. In addition, we also evaluate Instance Segmentation task, where 1.2% accuracy improvement is attained with a 73% latency reduction.

Digital Library: EI

Published Online: February 2025

Art history, computer vision…and the face of Abraham Lincoln

175 30

Computer Vision
Art History
Face Recognition
Image Analysis
Machine Learning
Keypoint Extraction
Artistic License
Brushstroke Analysis

Betsy Mathisen

DOI

10.2352/EI.2025.37.11.HVEI-215

Volume 37

Issue 11

Abstract

Are we there yet? All the puzzle pieces are here: a 2” miniature portrait on ivory dated circa 1840-1842, discovered alongside a letter detailing the owner’s familial ties to Mary Todd Lincoln. This portrait’s distinctive features echo President Lincoln’s unique facial asymmetry. However, despite intensive investigation, no historical document has been found to definitively link this miniature to Lincoln. This research aims to bridge art and science to determine whether this painting represents the earliest image of Abraham Lincoln, potentially opening avenues for future collaborations in identifying historical faces from the past. A key contributor to this effort is Dr. David Stork, an Adjunct Professor at Stanford University and a leading expert in computer-based image analysis. Dr. Stork holds 64 U.S. patents and has authored over 220 peer-reviewed publications in fields such as machine learning, pattern recognition, computational optics, and the image analysis of art. His recent book, Pixels and Paintings: Foundations of Computer-Assisted Connoisseurship1, fosters a dialogue between art scholars and the computer vision community.

Digital Library: EI

Published Online: February 2025

Computational Tools for Analyses of Color of Costumes in Large Corpora of Fine Art Paintings

14 3

Computer Vision
Fine-Art Analysis
Segmentation

Christine Li, David G. Stork

DOI

10.2352/EI.2025.37.11.HVEI-216

Volume 37

Issue 11

Abstract

Clothing is a lens through which a society expresses its culture and history. Its stylized portrayal in painting adds an immensely rich layer of cultural self introspection—how artists see themselves and their contemporaries, expressed through art. Particularly of interest in this study is color: how has color in costumes in portraiture painting changed over time, across art styles, and for different genders? In this study, we apply computational methods drawn from computer vision, machine learning, economics, and statistics to a large corpora of over 12k portrait paintings to analyze trends in color in Western art over the past 600 years. For each painting, we obtained clothing segmentation masks using a fine-tuned SegFormer model, performed gender classification using CLIP (Contrastive Language-Image Pre-Training), extracted dominant colors via clustering analysis, and computed Color Contrast Index (CI) and Diversity Index (DI). This study is, to our knowledge, the most comprehensive, large-scale analysis of colors of clothing in paintings. We share our methodology to make more widely accessible state-of-the-art computational tools for scholars studying the history and development of style in fine art paintings. Our tools empower analyses of major trends in costume colors as well as specialized domain-specific searches throughout databases of tens of thousands of paintings—far larger than can be efficiently analyzed without computer methods. These tools can reveal comparisons between different painters and trends within particular artists’ careers. Our tools could be enhanced to enable refined analyses, for instance on the social status of the portrait subject, and other visual criteria.

Digital Library: EI

Published Online: February 2025

Image Quality Assessment for Natural Scene Portraits: An Industrial Application

16 3

Image quality evaluation
Texture detail preservation
Data set fitting
Machine Learning
Computer Vision
DXOMARK
Camera quality assessment

Daniela Carfora Ventura, Gabriel Pacianotto Gouveia, Hoang-Son Nguyen, Jianqiang Sky Zhou, Nicolas Chahine, Sira Ferradans

DOI

10.2352/EI.2025.37.9.IQSP-247

Volume 37

Issue 9

Abstract

Portraits are one of the most common use cases in photography, especially in smartphone photography. However, evaluating portrait quality in real portraits is costly and difficult to reproduce. We propose a new method to evaluate a large range of detail preservation rendering on real portrait images. Our approach is based on 1) annotating a set of portrait images grouped by semantic content using pairwise comparison 2) taking advantage of the fact that we are focusing on portraits, using cross-content annotations to align the quality scales 3) training a machine learning model on the global quality scale. On top of providing a fine-grained wide range detail preservation quality output, numerical experiments show that the proposed method correlates highly with the perceptual evaluation of image quality experts.

Digital Library: EI

Published Online: February 2025

Scalable and Efficient Orchestration of Machine Learning Workloads on DSPs with Multi-level Memory Architecture

60 18

ADAS
Computer Vision
Deep Learning
Embedded Systems
Image Processing

Aaron Sequeira, Febin Sam, Anshu Jain, Pramod Swami

DOI

10.2352/EI.2024.36.10.IPAS-255

Volume 36

Issue 10

Abstract

Deep learning has enabled rapid advancements in the field of image processing. Learning based approaches have achieved stunning success over their traditional signal processing-based counterparts for a variety of applications such as object detection, semantic segmentation etc. This has resulted in the parallel development of hardware architectures capable of optimizing the inferencing of deep learning algorithms in real time. Embedded devices tend to have hard constraints on internal memory space and must rely on larger (but relatively very slow) DDR memory to store vast data generated while processing the deep learning algorithms. Thus, associated systems have to be evolved to make use of the optimized hardware balancing compute times with data operations. We propose such a generalized framework that can, given a set of compute elements and memory arrangement, devise an efficient method for processing of multidimensional data to optimize inference time of deep learning algorithms for vision applications.

Digital Library: EI

Published Online: January 2024

ISP Tuning for Improved Image Quality in Machine Vision

197 87

Automotive
Computer Vision
Contrast Transfer Accuracy
Image Quality
Image signal Processor
Modulation Transfer Function

Diarmaid Geever, Tim Brophy, Dara Molloy, Martin Glavin, Edward Jones, Brian Deegan

DOI

10.2352/EI.2024.36.9.IQSP-257

Volume 36

Issue 9

Abstract

This paper investigates the relationship between image quality and computer vision performance. Two image quality metrics, as defined in the IEEE P2020 draft Standard for Image quality in automotive systems, are used to determine the impact of image quality on object detection. The IQ metrics used are (i) Modulation Transfer function (MTF), the most commonly utilized metric for measuring the sharpness of a camera; and (ii) Modulation and Contrast Transfer Accuracy (CTA), a newly defined, state-of-the-art metric for measuring image contrast. The results show that the MTF and CTA of an optical system are impacted by ISP tuning. Some correlation is shown to exist between MTF and object detection (OD) performance. A trend of improved AP5095 as MTF50 increases is observed in some models. Scenes with similar CTA scores can have widely varying object detection performance. For this reason, CTA is shown to be limited in its ability to predict object detection performance. Gaussian noise and edge enhancement produce similar CTA scores but different AP5095 scores. The results suggest MTF is a better predictor of ML performance than CTA.

Digital Library: EI

Published Online: January 2024

From Video Conferences to DSLRs: An In-depth Texture Evaluation with Realistic Mannequins

120 28

Image quality evaluation
Texture detail preservation
Data set fitting
DXOMARK
Machine Learning
Computer Vision
Realistic image quality charts
Camera quality assessment

Daniela Carfora Ventura, Gabriel Pacianotto Gouveia, Ana Calarasanu, Valentine Tosel, Nicolas Chahine, Sira Ferradans

DOI

10.2352/EI.2024.36.9.IQSP-261

Volume 36

Issue 9

Abstract

Portraits are one of the most common use cases in photography, especially in smartphone photography. However, evaluating portrait quality in real portraits is costly, inconvenient, and difficult to reproduce. We propose a new method to evaluate a large range of detail preservation renditions on realistic mannequins. This laboratory setup can cover all commercial cameras from videoconference to high-end DSLRs. Our method is based on 1) the training of a machine learning method on a perceptual scale target 2) the usage of two different regions of interest per mannequin depending on the quality of the input portrait image 3) the merge of the two quality scales to produce the final wide range scale. On top of providing a fine-grained wide range detail preservation quality output, numerical experiments show that the proposed method is robust to noise and sharpening, unlike other commonly used methods such as the texture acutance on the Dead Leaves chart.

Digital Library: EI

Published Online: January 2024

Generalizing Handwriting and Scene-Text Detection in Images

175 50

Computer Vision
Generalization
Handwritings
Optical Character Recognition (OCR)
Text Detection
Text Recognition

Taewook Kim, Gaurav Patel, Qian Lin, Jan P. Allebach, Qiang Qiu

DOI

10.2352/EI.2024.36.8.IMAGE-242

Volume 36

Issue 8

Abstract

In this paper, we present a deep-learning approach that unifies handwriting and scene-text detection in images. Specifically, we adopt adversarial domain generalization to improve text detection across different domains and extend the conventional dice loss to provide extra training guidance. Furthermore, we build a new benchmark dataset that comprehensively captures various handwritten and scene text scenarios in images. Our extensive experimental results demonstrate the effectiveness of our approach in generalizing detection across both handwriting and scene text.

Digital Library: EI

Published Online: January 2024

Driver Monitoring System Using Deep Learning Techniques

138 53

Deep Learning
Computer Vision
Drowsiness Detection
Distraction Detection
Facial Expression Analysis
Object Detection
YOLOv8
Camera-based Monitoring

Moustafa Ibrahim, Gerrit Tamm, Reiner Creutzburg

DOI

10.2352/EI.2024.36.3.MOBMU-314

Volume 36

Issue 3

Simulating motion blur and exposure time and evaluating its effect on image quality

Abstract

The Driver Monitoring System (DMS) presented in this work aims to enhance road safety by continuously monitoring a drivers behavior and emotional state during vehicle operation. The system utilizes computer vision and machine learning techniques to analyze the drivers face and actions, providing real-time alerts to mitigate potential hazards. The primary components of the DMS include gaze detection, emotion analysis, and phone usage detection. The system tracks the drivers eye movements to detect drowsiness and distraction through blink patterns and eye-closure durations. The DMS employs deep learning models to analyze the drivers facial expressions and extract dominant emotional states. In case of detected emotional distress, the system offers calming verbal prompts to maintain driver composure. Detected phone usage triggers visual and auditory alerts to discourage distracted driving. Integrating these features creates a comprehensive driver monitoring solution that assists in preventing accidents caused by drowsiness, distraction, and emotional instability. The systems effectiveness is demonstrated through real-time test scenarios, and its potential impact on road safety is discussed.

Digital Library: EI

Published Online: January 2024

Article

454 144

Automotive
Simulation
Computer Vision
Image Signal Processor
Image Quality Metrics
Motion Blur

Hao Lin, Brian Deegan, Jonathan Horgan, Enda Ward, Patrick Denny, Ciarán Eising, Martin Glavin, Edward Jones

DOI

10.2352/EI.2023.35.16.AVM-117

Volume 35

Issue 16