Prognosis for melanoma patients is traditionally determined with a tumor depth measurement called Breslow thickness. However, Breslow thickness fails to account for cross-sectional area, which is more useful for prognosis. We propose to use segmentation methods to estimate the cross-sectional area of invasive melanoma in whole-slide images. First, we design a custom segmentation model from a transformer pretrained on breast cancer images and adapt it for melanoma segmentation. Second, we finetune a segmentation backbone pretrained on natural images. Our proposed models produce quantitatively superior results compared to previous approaches and qualitatively better results, as verified by a dermatologist.
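As a rough illustration of the second approach, the sketch below fine-tunes a segmentation backbone pretrained on natural images for binary melanoma segmentation; the DeepLabV3 model, loss, and learning rate are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal sketch: fine-tune a natural-image-pretrained segmentation
# backbone for binary (melanoma vs. background) segmentation.
# Model choice and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights="DEFAULT")
# Replace the 21-class head with a single-logit melanoma head.
model.classifier[4] = nn.Conv2d(256, 1, kernel_size=1)
model.train()

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(patches, masks):
    """patches: (B, 3, H, W) slide tiles; masks: (B, 1, H, W) float in {0, 1}."""
    logits = model(patches)["out"]
    loss = criterion(logits, masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```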
Deep neural networks for semantic segmentation have recently outperformed other methods on natural images, partly due to the abundance of training data in this domain. However, applying these networks to images from a different domain often leads to a significant drop in accuracy. Fine art paintings from highly stylized movements, such as Cubism or Expressionism, are particularly challenging due to large deviations in the shape and texture of objects compared to natural images. In this paper, we demonstrate that style transfer can be used as a form of data augmentation during the training of CNN-based semantic segmentation models to improve their accuracy on art pieces by a specific artist. For this, we pick a selection of paintings in a specific style by the painters Egon Schiele, Vincent Van Gogh, Pablo Picasso and Willem de Kooning, create a stylized training dataset by transferring the artist-specific style to natural photographs, and show that training the same segmentation network on these surrogate artworks improves accuracy on fine art paintings. We also release a public dataset with pixel-level annotations of 60 fine art paintings for the evaluation of our method.
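To make the augmentation pipeline concrete, here is a minimal sketch of stylizing part of a training batch while keeping the original segmentation labels; `stylize` is a hypothetical placeholder for any off-the-shelf style-transfer method (e.g., AdaIN), and the mixing probability `p` is an assumed value.

```python
# Sketch of style transfer as data augmentation for segmentation.
# `stylize` stands in for any off-the-shelf style-transfer method;
# its interface here is a hypothetical placeholder.
import random
import torch

def make_stylized_batch(photos, masks, style_images, stylize, p=0.5):
    """Replace a fraction of natural photos with artist-stylized versions.

    Segmentation labels are unchanged: style transfer alters texture
    and color but preserves scene layout, so the masks remain valid.
    """
    out = []
    for img in photos:
        if random.random() < p:
            style = random.choice(style_images)
            img = stylize(content=img, style=style)
        out.append(img)
    return torch.stack(out), masks
```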
Sun glare is a commonly encountered problem in both manual and automated driving. Sun glare causes over-exposure in the image and significantly impacts visual perception algorithms. For higher levels of automated driving, it is essential for the system to recognize that sun glare is present, as it can cause system degradation. There is very limited literature on detecting sun glare for automated driving; it is primarily based on finding saturated brightness areas and extracting regions via image processing heuristics. From the perspective of a safety system, a highly robust algorithm is necessary. Thus we designed two complementary algorithms: one using classical image processing techniques and one using a CNN, which can learn global context. We also discuss how a sun glare detection algorithm fits efficiently into a typical automated driving system. As there is no public dataset, we created our own and will release it publicly via the WoodScape project [1] to encourage further research in this area.
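A minimal sketch of the classical branch, assuming glare appears as near-saturated brightness with washed-out color; the HSV thresholds and minimum blob area below are illustrative values, not the paper's tuned parameters.

```python
# Classical glare detection sketch: threshold bright, low-saturation
# pixels in HSV space, then keep only sizeable connected blobs.
import cv2
import numpy as np

def detect_glare(bgr, v_thresh=240, s_thresh=30, min_area=500):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    _, s, v = cv2.split(hsv)
    # Candidate glare pixels: near-saturated brightness, washed-out color.
    mask = ((v >= v_thresh) & (s <= s_thresh)).astype(np.uint8) * 255
    # Remove speckle, then keep only blobs above a minimum area.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    out = np.zeros_like(mask)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            out[labels == i] = 255
    return out
```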
Object sizes in images are diverse; therefore, capturing multi-scale context information is essential for semantic segmentation. Existing context aggregation methods, such as the pyramid pooling module (PPM) and atrous spatial pyramid pooling (ASPP), employ different pooling sizes or atrous rates so that multi-scale information is captured. However, the pooling sizes and atrous rates are chosen empirically. Rethinking ASPP leads to our observation that learnable sampling locations of the convolution operation can endow the network with a learnable field-of-view, and thus the ability to capture object context information adaptively. Following this observation, in this paper we propose an adaptive context encoding (ACE) module based on the deformable convolution operation, in which the sampling locations of the convolution operation are learnable. Our ACE module can easily be embedded into other Convolutional Neural Networks (CNNs) for context aggregation. The effectiveness of the proposed module is demonstrated on the Pascal-Context and ADE20K datasets. Although our proposed ACE consists of only three deformable convolution blocks, it outperforms PPM and ASPP in terms of mean Intersection over Union (mIoU) on both datasets. All the experimental studies confirm that our proposed module is effective compared to state-of-the-art methods.
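The following sketch shows what an ACE-style stack of three deformable convolution blocks could look like in PyTorch, using `torchvision.ops.DeformConv2d`; the block widths, normalization, and offset predictor are our assumptions, not the paper's exact design.

```python
# ACE-style module sketch: three deformable convolution blocks whose
# learned sampling offsets give the network an adaptive field-of-view.
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        # A plain conv predicts 2 (x, y) offsets per kernel location.
        self.offset = nn.Conv2d(channels, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(channels, channels, k, padding=k // 2)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.deform(x, self.offset(x))))

class ACE(nn.Module):
    """Adaptive context encoding: a stack of three deformable blocks."""
    def __init__(self, channels=512):
        super().__init__()
        self.blocks = nn.Sequential(*[DeformBlock(channels) for _ in range(3)])

    def forward(self, x):  # x: backbone feature map (B, C, H, W)
        return self.blocks(x)
```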
Change detection in image pairs has traditionally been a binary process, reporting either “Change” or “No Change.” In this paper, we present LambdaNet, a novel deep architecture for performing pixel-level directional change detection based on a four class classification scheme. LambdaNet successfully incorporates the notion of “directional change” and identifies differences between two images as “Additive Change” when a new object appears, “Subtractive Change” when an object is removed, “Exchange” when different objects are present in the same location, and “No Change.” To obtain pixel annotated change maps for training, we generated directional change class labels for the Change Detection 2014 dataset. Our tests illustrate that LambdaNet would be suitable for situations where the type of change is unstructured, such as change detection scenarios in satellite imagery.
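Conceptually, directional change detection reduces to per-pixel four-class classification over an image pair; the toy network below illustrates only the input/output contract and does not reproduce LambdaNet's actual architecture.

```python
# Four-class directional change detection sketch: the image pair is
# concatenated channel-wise and a network predicts one class per pixel.
import torch
import torch.nn as nn

CLASSES = ["No Change", "Additive Change", "Subtractive Change", "Exchange"]

class ChangeNet(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, n_classes, 1),  # per-pixel 4-class logits
        )

    def forward(self, before, after):  # each (B, 3, H, W)
        return self.net(torch.cat([before, after], dim=1))

model = ChangeNet()
loss_fn = nn.CrossEntropyLoss()  # targets: (B, H, W) with values 0..3
```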
Semantic segmentation is a complex problem in the field of computer vision and is essential for image analysis tasks. Currently, most state-of-the-art algorithms rely on deep convolutional neural networks (DCNNs) to perform this task. DCNNs downsample the spatial resolution of the input image into low-resolution feature maps, which are then up-sampled to produce the segmented images. However, this reduction of spatial information weakens the high-frequency details of the image, resulting in blurry and inaccurate object boundaries. To address this limitation, we propose combining a DCNN used for semantic segmentation with semantic boundary information. This is done using a multi-task approach: a boundary detection network is incorporated into the encoder-decoder architecture SegNet by adding an edge class to the SegNet architecture. In doing so, the multi-task learning network is provided more information, thus improving segmentation accuracy, specifically boundary delineation. This approach was tested on the RGB-NIR Scene dataset. Compared to using SegNet alone, we observe increased boundary segmentation accuracy with this approach. We show that the addition of boundary detection information significantly improves the semantic segmentation results of a DCNN.
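A minimal sketch of the edge-class idea: boundary pixels in the ground-truth mask are relabeled as an extra class so the network is trained to delineate boundaries explicitly; the 3x3 neighborhood boundary test is our assumption. The segmentation loss would then be computed over `n_classes + 1` classes.

```python
# Relabel inter-class boundary pixels in a ground-truth mask as a new
# "edge" class, so a SegNet-style network learns boundaries explicitly.
import torch.nn.functional as F

def add_edge_class(mask, n_classes):
    """mask: (B, H, W) integer labels. Returns labels where boundary
    pixels carry the extra class index `n_classes`."""
    m = mask.float().unsqueeze(1)
    # A pixel lies on a boundary if any 3x3 neighbor has another label.
    local_max = F.max_pool2d(m, 3, stride=1, padding=1)
    local_min = -F.max_pool2d(-m, 3, stride=1, padding=1)
    edges = (local_max != local_min).squeeze(1)
    out = mask.clone()
    out[edges] = n_classes  # new edge class
    return out
```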