IS&T | Library

Abstract

Scientific and technological advances during the last decade in the fields of image acquisition, data processing, telecommunications, and computer graphics have contributed to the emergence of new multimedia, especially 3D digital data. Modern 3D imaging technologies allow for the acquisition of 3D and 4D (3D video) data at higher speeds, resolutions, and accuracies. With the ability to capture increasingly complex 3D/4D information, advancements have also been made in the areas of 3D data processing (e.g., filtering, reconstruction, compression). As such, 3D/4D technologies are now being used in a large variety of applications, such as medicine, forensic science, cultural heritage, manufacturing, autonomous vehicles, security, and bioinformatics. Further, with mixed reality (AR, VR, XR), 3D/4D technologies may also change the ways we work, play, and communicate with each other every day.

Digital Library: EI

Published Online: January 2023

Few-shot learning on point clouds for railroad segmentation

Few Shot Learning
Point Cloud
Segmentation
Railroad

Abdur Razzaq Fayjie, Patrick Vandewalle

DOI

10.2352/EI.2023.35.17.3DIA-100

Volume 35

Issue 17

Abstract

Infrastructure maintenance of complex environments like railroads is a very expensive operation. Recent advances in mobile mapping systems that collect 3D point cloud data, and in deep learning for detection and segmentation, can help automate this maintenance and enable preventive maintenance at specific locations before major failures occur. Some fully supervised methods have been developed for understanding dynamic railroad environments. These methods often fail to generalize to infrastructure changes or to new classes when labeled data are scarce. To address this issue, we propose a railroad segmentation method that leverages few-shot learning by generating class prototypes for the most relevant infrastructure classes. This method takes advantage of existing embedding networks for point clouds, taking the geometrical and spatial context into account for feature representation of complex connected classes. We evaluate our method on real-world data measured on Belgian railway tracks. Our model achieves promising results on connected classes when exposed to only a few annotated samples at test time.
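The prototype idea mentioned in this abstract can be illustrated with a minimal sketch in the style of prototypical networks. The abstract does not state the distance metric or the embedding network, so the Euclidean distance, the function names, the class meanings, and the toy 2-D "embeddings" below are all assumptions made for illustration.

```python
import numpy as np

def class_prototypes(support_emb, support_labels):
    """Average the embeddings of the labeled support points of each class."""
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos

def segment_by_prototype(query_emb, classes, protos):
    """Assign each query point the class of its nearest prototype."""
    # Squared Euclidean distance between every query point and every prototype
    # (the choice of metric is an assumption, not taken from the paper).
    d2 = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]

# Toy 2-D vectors standing in for the output of a point cloud embedding network.
support_emb = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
support_labels = np.array([0, 0, 1, 1])  # hypothetical classes, e.g. 0 = rail, 1 = pole
query_emb = np.array([[0.1, 0.1], [4.9, 5.1]])

classes, protos = class_prototypes(support_emb, support_labels)
pred = segment_by_prototype(query_emb, classes, protos)
print(pred)
```

With only a handful of annotated support points per class, new classes can be added by computing one more prototype rather than retraining a classifier head.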

Digital Library: EI

Published Online: January 2023

Appearance segmentation and documentation applied to cultural heritage surfaces

RTI - Reflectance Transformation Imaging
Hemispherical harmonics
Normalization
Linear discriminant model
Appearance segmentation
Visualization
Conservation documentation

Sunita Saha, Amalia Siatou, Christian Degrigny, Alamin Mansouri, Robert Sitnik

Pages 101-1 - 101-6, January 2023. This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

DOI

10.2352/EI.2023.35.17.3DIA-101

Volume 35

Issue 17

Abstract

This paper describes the development and application of a novel supervised segmentation technique for conservation documentation, based on visible appearance changes of Cultural Heritage (CH) metal surfaces. The technique employs a linear discriminant analysis model to classify Reflectance Transformation Imaging (RTI) reconstruction coefficients. The Hemispherical Harmonics (HSH) reconstruction coefficients for each pixel are first calculated and then normalized. This normalization increases the robustness and invariance of the technique, making it possible to document different surfaces and the same surface at different time intervals. In this paper, we present three case studies related to corrosion assessment of CH objects through detection of corrosion and monitoring of the degree of silver tarnishing. For each case study, a supervised data set is constructed, teaching the algorithm to recognize a specified appearance characteristic (such as corrosion or metal) as distinct by comparing it to the reconstruction coefficients of each pixel. The segmented information is visualized with a simplified colormap. The calculated results are afterwards verified by visual inspection by conservation-restoration experts. The method can segment surfaces with changes in micro-geometry, but it reaches its limits on surfaces with minimal topography and high specularity.
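The pipeline described here (normalize per-pixel coefficients, then classify them with a linear discriminant) can be sketched as follows. This is a minimal two-class Fisher discriminant written with NumPy, not the authors' implementation. The unit-norm normalization, the three-element toy coefficient vectors, and the labels (0 = metal, 1 = corrosion) are assumptions; the abstract does not specify them.

```python
import numpy as np

def normalize_coeffs(coeffs):
    """Scale each pixel's coefficient vector to unit L2 norm (assumed normalization)."""
    norms = np.linalg.norm(coeffs, axis=1, keepdims=True)
    return coeffs / np.maximum(norms, 1e-12)

def fit_lda(x, y):
    """Two-class Fisher linear discriminant: returns weights w and bias b."""
    x0, x1 = x[y == 0], x[y == 1]
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # Within-class scatter, lightly regularized so it is always invertible.
    sw = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    sw += 1e-6 * np.eye(x.shape[1])
    w = np.linalg.solve(sw, m1 - m0)
    b = -0.5 * w @ (m0 + m1)  # threshold halfway between the projected class means
    return w, b

def predict(x, w, b):
    """Label 1 where the discriminant score is positive, else 0."""
    return (x @ w + b > 0).astype(int)

# Toy training pixels: rows are coefficient vectors at very different scales.
train = np.array([[2.0, 0.2, 0.0], [4.0, 0.5, 0.1], [1.0, 0.1, 0.1],
                  [0.2, 2.0, 0.0], [0.5, 4.0, 0.1], [0.1, 1.0, 0.1]])
labels = np.array([0, 0, 0, 1, 1, 1])  # hypothetical: 0 = metal, 1 = corrosion

w, b = fit_lda(normalize_coeffs(train), labels)
test_pixels = np.array([[10.0, 1.0, 0.0], [0.3, 3.0, 0.0]])
pred = predict(normalize_coeffs(test_pixels), w, b)
print(pred)
```

Because every vector is rescaled before fitting and before prediction, pixels that differ only in overall magnitude receive the same label, which is one plausible reading of the invariance the abstract attributes to normalization.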

Digital Library: EI

Published Online: January 2023

Learned visual localization with camera pose refinement and verification based on differentiable renderer

Visual localization
Place recognition
Deep learning
Differentiable renderer
Camera pose estimation
Pose verification

Chanchang Tsai, Hajime Taira, Masatoshi Okutomi

DOI

10.2352/EI.2023.35.17.3DIA-102

Volume 35

Issue 17

Abstract

This manuscript presents a new CNN-based visual localization method that estimates the camera location of an input RGB image with respect to a pre-collected database of RGB-D images. To determine an accurate camera pose, we employ a coarse-to-fine localization scheme that first finds coarse location candidates via image retrieval, then refines them using the local 3D structure represented by each retrieved RGB-D image. For coarse prediction, we use a CNN feature extractor and a relative pose estimator that do not strictly require scene-specific training. Furthermore, we propose a new pose refinement-verification module that simultaneously evaluates and refines camera poses using a differentiable renderer. Experimental results on public datasets show that our proposed pipeline achieves accurate localization on both trained and unknown scenes.
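The coarse stage of such a pipeline, image retrieval, can be illustrated with a small sketch. It ranks database images by cosine similarity between global descriptors and keeps the top candidates. The descriptor values, the similarity measure, and the function name are assumptions; the abstract does not describe the retrieval step in this detail, and the refinement with a differentiable renderer is not shown.

```python
import numpy as np

def top_k_candidates(query_desc, db_descs, k=2):
    """Return indices and similarities of the k database images closest to the query."""
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = db @ q                    # cosine similarity to every database descriptor
    order = np.argsort(-sims)[:k]    # highest similarity first
    return order, sims[order]

# Toy global descriptors; in practice these would come from a CNN feature extractor.
db_descs = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.7, 0.7, 0.0],
                     [0.0, 0.0, 1.0]])
query_desc = np.array([0.9, 0.1, 0.0])

idx, sims = top_k_candidates(query_desc, db_descs, k=2)
print(idx, sims)
```

Each returned index would identify an RGB-D database image whose local 3D structure serves as the starting point for pose refinement.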

Digital Library: EI

Published Online: January 2023

3D mesh saliency from local spiral hop descriptors

Saliency
3D Meshes
Spiral descriptor
Structure tensor

Olivier Lézoray, Anass Nouri

DOI

10.2352/EI.2023.35.17.3DIA-103

Volume 35

Issue 17

Abstract

Mesh saliency, the process of detecting visually important regions in 3D meshes, is a significant component of computer graphics that can be used in applications such as denoising and simplification. In this paper, we propose a new 3D mesh saliency measure that can identify sharp geometric features in meshes. A local normal-based descriptor is built for each vertex using a spiral path within a 2-hop neighborhood. First, a geometric saliency is computed as the mean local alignment between the spiral descriptors within a 1-hop neighborhood, weighted by a vertex roughness measure. Second, a spectral saliency is computed from the spectral energy of each vertex structure tensor, with the gradient defined from the spiral descriptor alignments. The final saliency is then defined as a weighted sum of both. This single-scale saliency can be extended to a multi-scale saliency by decimating the mesh at several scales and averaging the resulting saliencies after mapping them between decimated meshes. The approach achieves results competitive with the state of the art.

Digital Library: EI

Published Online: January 2023

Layered view synthesis for general images

View Synthesis
3D
Inpainting
Depth Estimation

Loïc Dehan, Wiebe Van Ranst, Patrick Vandewalle

Pages 104-1 - 104-6, January 2023. This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

DOI

10.2352/EI.2023.35.17.3DIA-104

Volume 35

Issue 17

Abstract

We describe a novel method for monocular view synthesis. The goal of our work is to create a visually pleasing set of horizontally spaced views based on a single image. This can be applied in view synthesis for virtual reality and glasses-free 3D displays. Previous methods produce realistic results on images that show a clear distinction between a foreground object and the background. We aim to create novel views in more general, crowded scenes in which there is no clear distinction. Our main contribution is a computationally efficient method for realistic occlusion inpainting and blending, especially in complex scenes. Our method can be effectively applied to any image, which is shown both qualitatively and quantitatively on a large dataset of stereo images. Our method performs natural disocclusion inpainting and maintains the shape and edge quality of foreground objects.
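To make the term "disocclusion" concrete, the sketch below shows generic depth-based forward warping, which is a common baseline and not the layered method of this paper. Each pixel is shifted horizontally in proportion to its disparity, nearer surfaces win when two pixels land on the same target, and the pixels left empty form the hole mask that an inpainting step would have to fill. The function name, the `shift` parameter, and the toy data are assumptions.

```python
import numpy as np

def forward_warp(image, disparity, shift):
    """Shift each pixel horizontally by shift * disparity; return view and hole mask."""
    h, w = disparity.shape
    out = np.zeros_like(image)
    # Largest disparity (nearest surface) written so far at each target pixel.
    best = np.full((h, w), -np.inf)
    for y in range(h):
        for x in range(w):
            xt = x + int(round(shift * disparity[y, x]))
            if 0 <= xt < w and disparity[y, x] > best[y, xt]:
                best[y, xt] = disparity[y, x]
                out[y, xt] = image[y, x]
    holes = np.isinf(best)  # disoccluded pixels that received no source pixel
    return out, holes

# One-row toy image: the pixel with value 30 is a near object (disparity 2).
image = np.array([[10, 20, 30, 40, 50]])
disparity = np.array([[0.0, 0.0, 2.0, 0.0, 0.0]])

view, holes = forward_warp(image, disparity, shift=1.0)
print(view)   # the near pixel moves two columns right and covers the background there
print(holes)  # the column it left behind is marked as a hole
```

In crowded scenes many such holes appear at once, which is why the quality of the inpainting and blending step dominates the visual result.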

Digital Library: EI

Published Online: January 2023

DL-based floorplan generation from noisy point clouds

Floorplan generation
Indoor Stereo SLAM
Noisy point cloud filtering
Real-time 3D reconstruction

Xin Liu, Egor Bondarev, Peter H.N. de With

DOI

10.2352/EI.2023.35.17.3DIA-105

Volume 35

Issue 17

Abstract

Remote inspections of unknown and hostile environments can be performed by military or police personnel by deploying sensors and SLAM-based 3D reconstruction techniques. However, the generated point clouds (PCs) cannot be transmitted to coordinators because of their large data volume. A common data-reduction solution is to convert the PC-based 3D models into 2D floorplans. In this paper, we propose a system with an end-to-end network for automated floorplan generation from noisy PCs that estimates the main building structures (doors, windows, and walls). First, the noisy 3D PC is column filtered to remove irrelevant or noisy points. Second, we project the remaining points onto a grid map. Finally, an end-to-end neural network is trained to extract an accurate line-based floorplan from the grid map. Experimental results reveal that the system generates floorplans that accurately represent the main structures of a building. On average, the estimated floorplans reach an F1 score of 0.73 in the building-layout evaluation, which outperforms state-of-the-art methods. Furthermore, the model size is reduced by a factor of several thousand on average.
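The first two steps (column filtering and projection onto a grid map) can be sketched as below. The abstract does not define the column filter, so this example assumes a simple rule: a grid cell is kept only if the vertical column above it contains at least `min_points` points. The cell size, threshold, and function name are likewise assumptions, and the neural network stage is not shown.

```python
import numpy as np

def project_to_grid(points, cell=0.5, min_points=3):
    """Project 3D points onto an XY occupancy grid, dropping sparse columns."""
    xy_min = points[:, :2].min(axis=0)
    ij = np.floor((points[:, :2] - xy_min) / cell).astype(int)
    counts = np.zeros(ij.max(axis=0) + 1, dtype=int)
    np.add.at(counts, (ij[:, 0], ij[:, 1]), 1)  # number of points per vertical column
    # Assumed column filter: keep a cell only if its column holds enough points.
    return (counts >= min_points).astype(np.uint8)

# Toy scene: a wall along x (three columns, three heights each) plus one noise point.
wall = np.array([[x, 0.25, z] for x in (0.25, 0.75, 1.25) for z in (0.0, 1.0, 2.0)])
noise = np.array([[1.25, 1.25, 0.5]])
points = np.vstack([wall, noise])

grid = project_to_grid(points)
print(grid)
```

Tall structures such as walls produce dense columns and survive the filter, while isolated noise points do not; the resulting binary map is the kind of 2D input a line-extraction network could consume.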

Digital Library: EI

Published Online: January 2023

A comparative evaluation of 3D geometries of scenes estimated using factor graph based disparity estimation algorithms

Passive Stereo Vision
3D Reconstruction
Factor Graph-based Stereo Matching Algorithm
Disparity Map

Hanieh Shabanian, Madhusudhanan Balasubramanian

DOI

10.2352/EI.2023.35.17.3DIA-107

Volume 35

Issue 17

Abstract

Passive stereo vision systems are useful for estimating 3D geometries from digital images, similar to the human biological system. In general, two cameras are situated at a known distance from the object and simultaneously capture images of the same scene from different views. This paper presents a comparative evaluation of 3D geometries of scenes estimated by three disparity estimation algorithms, namely the hybrid stereo matching algorithm (HCS), the factor graph-based stereo matching algorithm (FGS), and a multi-resolution FGS algorithm (MR-FGS). Comparative studies were conducted using our stereo imaging system as well as hand-held, consumer-market digital cameras and camera phones of a variety of makes and models. Based on our experimental results, the factor graph algorithm (FGS) and the multi-resolution factor graph algorithm (MR-FGS) achieve higher 3D reconstruction accuracy than the HCS algorithm. When compared with the FGS algorithm, MR-FGS provides a significant improvement in the disparity contrast along the depth boundaries and minimal depth discontinuities.
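The link between a disparity map and 3D geometry is the standard triangulation relation for a rectified stereo pair, Z = f * B / d, where f is the focal length in pixels, B the baseline, and d the disparity in pixels. The sketch below applies it to a small array. The camera parameters are made-up values, and the rectified-pair assumption is not stated in the abstract.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert a disparity map (pixels) to depth (meters) for a rectified stereo pair."""
    depth = np.full(disparity_px.shape, np.inf)
    valid = disparity_px > 0  # zero disparity corresponds to a point at infinity
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth

# Hypothetical camera: 1000 px focal length, 10 cm baseline.
disparity = np.array([[50.0, 25.0],
                      [10.0, 0.0]])
depth = depth_from_disparity(disparity, focal_px=1000.0, baseline_m=0.1)
print(depth)
```

Because depth is inversely proportional to disparity, small disparity errors on distant surfaces translate into large depth errors, which is one reason disparity quality along depth boundaries matters for reconstruction accuracy.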

Digital Library: EI

Published Online: January 2023

Assistive mobile application for real-time 3D spatial audio soundscapes toward improving safe and independent navigation

3D range scanning
3D reconstruction
3D localization and mapping
spatial audio
assistive devices
assistive technologies
multimedia on mobile devices
3D scene classification

Broderick S. Schwartz, Tyler Bell

DOI

10.2352/EI.2023.35.17.3DIA-108

Volume 35

Issue 17

Abstract

Assistive technologies are used in a variety of contexts to improve the quality of life for individuals who may have one or more vision impairments. This paper describes a novel assistive technology platform that uses real-time 3D spatial audio to aid its users in safe and efficient navigation. This platform leverages modern 3D scanning technology on a mobile device to digitally construct a live 3D map of a user's surroundings as they move about their space. Within the digital 3D scan of the world, spatialized virtual audio sources (i.e., speakers) provide the navigator with a real-time 3D stereo audio "soundscape." As the user moves about the world, the digital 3D map and its resultant soundscape are continuously updated and played back through headphones connected to the navigator's device. This paper details (1) the underlying technical components and how they were integrated to produce the mobile application that generates a dynamic soundscape on a consumer mobile device and (2) a methodology for analyzing the usage of the application. The application aims to help individuals with vision impairments navigate and understand spaces safely, efficiently, and independently.
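As a rough illustration of how a virtual source's position can be turned into a stereo signal, the sketch below computes left and right gains with constant-power panning and inverse-distance attenuation. This is a deliberately simple 2D stand-in, not the platform's audio engine, which the abstract does not describe. The function name, coordinate convention, and attenuation rule are assumptions.

```python
import math

def stereo_gains(listener_xy, heading_rad, source_xy):
    """Return (left, right) gains for a virtual source in a 2D plane.

    heading_rad is the listener's facing direction, measured counterclockwise
    from the +x axis (an assumed convention).
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    dist = math.hypot(dx, dy)
    # Angle of the source relative to the facing direction; positive = to the left.
    azimuth = math.atan2(dy, dx) - heading_rad
    pan = -math.sin(azimuth)             # -1 = fully left, +1 = fully right
    theta = (pan + 1.0) * math.pi / 4.0  # constant-power pan law
    atten = 1.0 / max(dist, 1.0)         # assumed inverse-distance falloff, capped at 1
    return math.cos(theta) * atten, math.sin(theta) * atten

# Listener at the origin facing +y.
facing = math.pi / 2
right_l, right_r = stereo_gains((0.0, 0.0), facing, (1.0, 0.0))  # source to the right
ahead_l, ahead_r = stereo_gains((0.0, 0.0), facing, (0.0, 2.0))  # source straight ahead
print(right_l, right_r)
print(ahead_l, ahead_r)
```

A pan-only model like this cannot distinguish sources in front from sources behind, so a real spatial audio system would typically add head-related filtering; the sketch only shows how gains would be recomputed as the user and the map move.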

Digital Library: EI

Published Online: January 2023

3D nuclei segmentation for multi-cellular quantification of zebrafish embryos using NISNet3D

Nuclei Segmentation
Zebra Embryo
NISNet3D

Linlin Li, Liming Wu, Alain Chen, Edward J. Delp, David M. Umulis

DOI

10.2352/EI.2023.35.17.3DIA-109

Volume 35

Issue 17