To address the low completeness of scene reconstructions produced by existing methods in challenging areas such as weakly textured, textureless, and non-diffuse-reflective regions, this paper proposes a high-completeness multiview stereo network that combines a lightweight multiscale feature adaptive aggregation module (LightMFA2), SoftPool, and a sensitive global depth-consistency checking method. LightMFA2 is designed to adaptively learn critical information from the generated multiscale feature maps, addressing the difficulty of feature extraction in challenging areas. SoftPool is added to the regularization stage to downsample the 2D cost-matching map, which reduces information redundancy, prevents the loss of useful information, and accelerates network computation. The sensitive global depth-consistency checking method filters depth outliers: pixels with confidence below 0.35 are discarded, and pixel reprojection and depth reprojection errors are computed for the remaining pixels. Experimental results on the Technical University of Denmark (DTU) dataset show that the proposed network significantly improves completeness and overall quality, with a completeness error of 0.2836 mm and an overall error of 0.3665 mm.
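The confidence-plus-reprojection filter described above follows a pattern common to learned MVS pipelines. The Python sketch below illustrates one plausible form of it, keeping the paper's 0.35 confidence cutoff; the round-trip reprojection logic, the relative camera convention, and the pixel/depth error thresholds are assumptions for illustration, not the authors' code.

```python
import numpy as np

CONF_THRESH = 0.35          # confidence cutoff stated in the paper
PIX_ERR_THRESH = 1.0        # assumed pixel reprojection threshold (pixels)
DEPTH_ERR_THRESH = 0.01     # assumed relative depth reprojection threshold

def project(K, R, t, X):
    """Project 3D point X into a camera with intrinsics K and relative
    extrinsics (R, t); returns the pixel and the camera-frame depth."""
    Xc = R @ X + t
    uvw = K @ Xc
    return uvw[:2] / uvw[2], Xc[2]

def backproject(K, u, v, depth):
    """Lift pixel (u, v) at the given depth into camera coordinates."""
    return depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))

def passes_consistency(u, v, depth_ref, conf_ref, depth_src, K_ref, K_src, R, t):
    """Round trip: reference pixel -> source view -> back to reference.
    (R, t) map reference-camera coordinates to source-camera coordinates.
    The pixel error compares the returned pixel with the original one; the
    depth error compares the returned depth with the stored reference depth."""
    if conf_ref[v, u] < CONF_THRESH:
        return False                                   # low-confidence pixel discarded
    d_ref = depth_ref[v, u]
    X_ref = backproject(K_ref, u, v, d_ref)            # reference camera frame
    (us, vs), _ = project(K_src, R, t, X_ref)          # into the source view
    ui, vi = int(round(us)), int(round(vs))
    h, w = depth_src.shape
    if not (0 <= ui < w and 0 <= vi < h):
        return False
    X_src = backproject(K_src, us, vs, depth_src[vi, ui])   # source camera frame
    X_back = R.T @ (X_src - t)                              # back to the reference frame
    (ub, vb), d_back = project(K_ref, np.eye(3), np.zeros(3), X_back)
    pix_err = np.hypot(ub - u, vb - v)
    depth_err = abs(d_back - d_ref) / d_ref
    return pix_err < PIX_ERR_THRESH and depth_err < DEPTH_ERR_THRESH
```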
Scientific and technological advances during the last decade in the fields of image acquisition, data processing, telecommunications, and computer graphics have contributed to the emergence of new multimedia, especially 3D digital data. Modern 3D imaging technologies allow for the acquisition of 3D and 4D (3D video) data at higher speeds, resolutions, and accuracies. With the ability to capture increasingly complex 3D/4D information, advancements have also been made in the areas of 3D data processing (e.g., filtering, reconstruction, compression). As such, 3D/4D technologies are now being used in a large variety of applications, such as medicine, forensic science, cultural heritage, manufacturing, autonomous vehicles, security, and bioinformatics. Further, with mixed reality (AR, VR, XR), 3D/4D technologies may also change the ways we work, play, and communicate with each other every day.
Assistive technologies are used in a variety of contexts to improve the quality of life of individuals who have one or more vision impairments. This paper describes a novel assistive technology platform that uses real-time 3D spatial audio to aid its users in safe and efficient navigation. The platform leverages modern 3D scanning technology on a mobile device to digitally construct a live 3D map of a user's surroundings as they move about their space. Within the digital 3D scan of the world, spatialized virtual audio sources (i.e., speakers) provide the navigator with a real-time 3D stereo audio "soundscape." As the user moves about the world, the digital 3D map and its resultant soundscape are continuously updated and played back through headphones connected to the navigator's device. This paper details (1) the underlying technical components and how they were integrated to produce a mobile application that generates a dynamic soundscape on a consumer mobile device and (2) a methodology for analyzing usage of the application. The aim of this application is to help individuals with vision impairments navigate and understand spaces safely, efficiently, and independently.
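The abstract does not specify how the spatialization is computed. As a minimal illustration of placing one virtual speaker in a listener's soundscape, the sketch below derives per-ear gains and an interaural time difference from a spherical-head model; a production system like the one described would typically rely on platform HRTF rendering instead, and all constants here are assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
HEAD_RADIUS = 0.0875     # m, average adult head radius (assumption)

def spatialize(source_pos, listener_pos, listener_yaw):
    """Return (left_gain, right_gain, itd_seconds) for one virtual speaker,
    using 1/distance attenuation, a constant-power pan law, and Woodworth's
    spherical-head formula for the interaural time difference."""
    rel = np.asarray(source_pos) - np.asarray(listener_pos)
    dist = max(np.linalg.norm(rel), 1e-3)
    # Planar azimuth of the source relative to where the listener is facing.
    azimuth = np.arctan2(rel[1], rel[0]) - listener_yaw
    # Woodworth's formula: ITD = (a / c) * (theta + sin(theta)).
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth + np.sin(azimuth))
    base = 1.0 / dist                            # simple distance attenuation
    pan = 0.5 * (1.0 + np.sin(azimuth))          # 0 = hard left, 1 = hard right
    left = base * np.cos(pan * np.pi / 2)
    right = base * np.sin(pan * np.pi / 2)
    return left, right, itd
```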
This paper describes a low-cost, single-camera, double-mirror system that can be built into a desktop nail printer. The system captures an image of a fingernail and generates the 3D shape of the nail, from which the nail's depth map is estimated. The paper describes the camera calibration process, explains the calibration theory for the proposed system, and then introduces a 3D reconstruction method. Experimental results illustrating the accuracy with which the system handles the rendering task are presented.
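As a rough illustration of the kind of checkerboard-based calibration the paper discusses, the sketch below recovers camera intrinsics with OpenCV's standard pipeline. It is not the paper's procedure: the mirror views in the proposed system introduce reflections that a catadioptric calibration must handle explicitly, which this sketch omits, and the board geometry and file names are assumptions.

```python
import cv2
import numpy as np

BOARD = (9, 6)        # inner corners per row/column (assumed)
SQUARE = 0.01         # checkerboard square size in metres (assumed)

# 3D coordinates of the board corners in the board's own plane (z = 0).
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
for path in ["view_00.png", "view_01.png", "view_02.png"]:   # hypothetical files
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# Returns the RMS reprojection error, camera matrix K, distortion
# coefficients, and per-view extrinsics (rotation/translation vectors).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
```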
Applications ranging from simple visualization to complex design require 3D models of indoor environments. This has given rise to advancements in the field of automated reconstruction of such models. In this paper, we review several state-of-the-art metrics proposed for the geometric comparison of 3D models of building interiors. We evaluate their performance on a real-world dataset and propose one tailored metric that can be used to assess the quality of a reconstructed model. The proposed metric can also be easily visualized to highlight the regions or structures where the reconstruction failed. To demonstrate the versatility of the proposed metric, we conducted experiments on various interior models by comparing them with ground-truth data created by expert Blender artists. The results of the experiments were then used to improve the reconstruction pipeline.
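The abstract does not state the proposed metric's formula. As one concrete example of a geometric comparison metric that also supports per-point visualization, the sketch below computes a symmetric chamfer distance between point samples of the reconstructed and ground-truth models; the per-point nearest-neighbour distances it returns can be mapped to vertex colours to highlight failed regions. This is an illustrative stand-in, not necessarily the paper's metric.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(P, Q):
    """Symmetric chamfer distance between two (N, 3) point sets.

    P: points sampled from the reconstructed model.
    Q: points sampled from the ground-truth model.
    Returns the scalar metric plus both per-point distance arrays,
    which can be visualized directly on the models."""
    d_pq = cKDTree(Q).query(P)[0]   # each reconstructed point to ground truth
    d_qp = cKDTree(P).query(Q)[0]   # each ground-truth point to reconstruction
    return d_pq.mean() + d_qp.mean(), d_pq, d_qp
```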
This paper presents an algorithm for indoor layout estimation and reconstruction through the fusion of a sequence of captured images and LiDAR data sets. In the proposed system, a movable platform collects both intensity images and 2D LiDAR information. Pose estimation and semantic segmentation are computed jointly by aligning the LiDAR points to line segments from the images. For indoor scenes with walls orthogonal to the floor, the alignment problem is decoupled into a top-down view projection and a 2D similarity transformation estimation, which is solved by the recursive random sample consensus (R-RANSAC) algorithm. Hypotheses are generated, evaluated, and optimized by integrating new scans as the platform moves through the environment. The proposed method avoids the need for extensive prior training and does not rely on a cuboid layout assumption, making it more effective and practical than most previous indoor layout estimation methods. Multi-sensor fusion provides accurate depth estimation together with high-resolution visual information.
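As an illustration of the 2D similarity estimation step, the sketch below fits a similarity transform to putative point correspondences with plain RANSAC over two-point minimal samples (a 2D similarity has four degrees of freedom). The paper's R-RANSAC variant adds a randomized pre-test and recursive refinement over incoming scans, which this sketch omits; the inlier threshold and iteration count are assumptions.

```python
import numpy as np

def fit_similarity(src, dst):
    """Closed-form 2D similarity dst ~ s * R @ src + t (Umeyama's method)
    from paired (n, 2) point arrays."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    cov = (dst - mu_d).T @ (src - mu_s) / len(src)
    U, D, Vt = np.linalg.svd(cov)
    sign = np.sign(np.linalg.det(U) * np.linalg.det(Vt))  # avoid reflections
    R = U @ np.diag([1.0, sign]) @ Vt
    var_src = ((src - mu_s) ** 2).sum() / len(src)
    s = (D[0] + sign * D[1]) / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t

def ransac_similarity(src, dst, iters=500, thresh=0.05, rng=None):
    """Plain RANSAC: sample minimal 2-point sets, fit, count inliers,
    and keep the model with the largest consensus set."""
    rng = rng or np.random.default_rng(0)
    best, best_inliers = None, 0
    for _ in range(iters):
        idx = rng.choice(len(src), 2, replace=False)
        s, R, t = fit_similarity(src[idx], dst[idx])
        err = np.linalg.norm(dst - (s * (src @ R.T) + t), axis=1)
        n_in = int((err < thresh).sum())
        if n_in > best_inliers:
            best_inliers, best = n_in, (s, R, t)
    return best, best_inliers
```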
We introduce the Morpholo library, which converts a stereoscopic snapshot into a native multi-view image through morphing while taking into account display calibration data for specific slanted lenticular monitors. Holograms are generated quickly by precomputing lookup tables that replace runtime computation. The implementation of Morpholo for glasses-free streaming of live 3D video is also considered.
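As a minimal sketch of the lookup-table idea, the following code precomputes a per-subpixel view assignment for a slanted lenticular panel and then interlaces rendered views with a single gather. The assignment follows the standard slanted-lenticular formula; Morpholo's actual calibration data (lens pitch, slant, offset) would replace the assumed parameters, and this is not the library's API.

```python
import numpy as np

def build_view_lut(width, height, n_views, lens_pitch_px, slant):
    """Precompute, per RGB subpixel, which of n_views rendered views it
    samples from, so the runtime interlacer is a single table lookup.
    lens_pitch_px (lenticule width in pixels) and slant (horizontal shift
    per row) stand in for real display calibration data."""
    x = np.arange(width)[None, :, None]          # pixel column
    y = np.arange(height)[:, None, None]         # pixel row
    c = np.arange(3)[None, None, :]              # subpixel (R, G, B)
    phase = (x + c / 3.0 - y * slant) / lens_pitch_px
    return np.floor((phase % 1.0) * n_views).astype(np.int32)   # (H, W, 3)

def interlace(views, lut):
    """views: (n_views, H, W, 3) stack of rendered views; returns the native
    multi-view frame by gathering each subpixel from its assigned view."""
    h, w, _ = lut.shape
    yy, xx, cc = np.meshgrid(np.arange(h), np.arange(w), np.arange(3),
                             indexing="ij")
    return views[lut, yy, xx, cc]
```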