In virtual-reality fusion interaction, accurately estimating the pose of real-world objects and mapping them to their virtual counterparts is crucial to the overall interaction experience. This paper studies pose estimation of real-world targets in this fusion context. To achieve precise pose estimation from single-view RGB images captured by commodity devices, a high-resolution heatmap regression method is proposed that balances accuracy and complexity. To address the inadequate use of semantic information in feature maps during heatmap regression, a lightweight content-aware upsampling method is introduced. Additionally, to mitigate the resolution and accuracy loss caused by quantization errors when computing pose from keypoints predicted on the heatmap, a keypoint optimization module incorporating Gaussian dimensionality reduction and a pose estimation strategy based on high-confidence keypoints are presented. Quantitative experiments show that this method outperforms comparison algorithms on the LINEMOD dataset, achieving 85.7% accuracy under the average distance (ADD) metric. Qualitative experiments further demonstrate precise real-to-virtual pose estimation and mapping in interactive scene applications.
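The quantization-error problem described above arises because a keypoint's true location rarely coincides with a discrete heatmap cell. A minimal sketch of one common mitigation, not the paper's exact module, is to refine the integer argmax with a confidence-weighted local centroid; the window size and return convention here are illustrative assumptions:

```python
import numpy as np

def refine_keypoint(heatmap):
    """Estimate a sub-pixel keypoint location from a 2D heatmap.

    Illustrative sketch: take the integer argmax, then compute a
    confidence-weighted centroid over a 3x3 window around the peak
    to reduce the quantization error of the discrete grid.
    Returns (x, y, peak_confidence).
    """
    h, w = heatmap.shape
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # 3x3 window around the peak, clipped to the map borders
    y0, y1 = max(y - 1, 0), min(y + 2, h)
    x0, x1 = max(x - 1, 0), min(x + 2, w)
    win = heatmap[y0:y1, x0:x1]
    ys, xs = np.mgrid[y0:y1, x0:x1]
    total = win.sum()
    if total <= 0:
        return float(x), float(y), 0.0
    cx = (xs * win).sum() / total  # confidence-weighted column coordinate
    cy = (ys * win).sum() / total  # confidence-weighted row coordinate
    return float(cx), float(cy), float(heatmap[y, x])
```

The returned peak confidence can then feed a strategy like the one the paper describes, in which only high-confidence keypoints participate in the PnP-style pose computation.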
The utility and ubiquity of virtual reality make it an extremely popular tool across scientific research areas. Owing to its ability to present naturalistic scenes in a controlled manner, virtual reality may be an effective option for conducting color science experiments and studying different aspects of color perception. However, head-mounted displays have their limitations, and investigators should choose a display device that meets the colorimetric requirements of their color science experiments. This paper presents a structured method to characterize the colorimetric profile of a head-mounted display with the aid of color characterization models. By way of example, two commercially available head-mounted displays (Meta Quest 2 and Meta Quest Pro) are characterized using four models (look-up table, polynomial regression, artificial neural network, and gain-gamma-offset), and the appropriateness of each model is investigated.
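The gain-gamma-offset family of models mentioned above treats each display channel as a nonlinear tone curve followed by a linear mix of the primaries. A minimal sketch of that structure follows; all parameter values are illustrative placeholders, not measurements from either headset:

```python
import numpy as np

def gog_channel(d, gain, offset, gamma):
    """Gain-offset-gamma response of one display channel.

    d is the normalized digital drive in [0, 1]; the output is the
    channel's relative luminance. Where gain*d + offset would be
    negative, the response is clipped to zero.
    """
    base = np.clip(gain * d + offset, 0.0, None)
    return base ** gamma

def rgb_to_xyz(rgb, primaries_xyz, gains, offsets, gammas):
    """Device RGB -> CIE XYZ: per-channel tone curves followed by a
    3x3 matrix whose columns are the primaries' XYZ tristimulus
    values (here a placeholder matrix, not a measured one)."""
    scalars = np.array([gog_channel(c, g, o, gm)
                        for c, g, o, gm in zip(rgb, gains, offsets, gammas)])
    return primaries_xyz @ scalars
```

Fitting such a model to a characterization dataset amounts to estimating the per-channel gain, offset, and gamma plus the primaries matrix from colorimeter readings of known drive values.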
In this paper, I present a proposal for a virtual reality subjective experiment to be performed at Texas State University, which is part of the VQEG-IMG test plan for the definition of a new recommendation for subjective assessment of eXtended Reality (XR) communications (work item ITU-T P.IXC). More specifically, I discuss the challenges of estimating user quality of experience (QoE) for immersive applications and detail the VQEG-IMG test plan tasks for XR subjective QoE assessment. I also describe the experimental choices of the audio-visual experiment to be performed at Texas State University, whose goal is to compare two possible scenarios for teleconference meetings: a virtual reality representation and a realistic representation.
Concerns about head-mounted displays have led to numerous studies of their potential impact on the visual system. Yet none have investigated whether the use of virtual reality (VR) head-mounted displays, with their reduced field of view and visually demanding environments, could reduce the spatial spread of the attentional window. To address this question, we measured the useful field of view in 16 participants immediately before and after playing a VR game for 30 minutes. The test computes the presentation-time threshold necessary for efficient perception of a target presented at the centre of the visual field and a target presented in the periphery, and consists of three subtests of increasing difficulty. Comparison of the data showed no significant difference between the pre-VR and post-VR sessions (subtest 2: F(1,11) = 0.7, p = .44; subtest 3: F(1,11) = 0.9, p = .38). However, participants' performance on central-target perception decreased in the most demanding subtest (F(1,11) = 8.1, p = .02). This result suggests that changes in spatial attention may occur after prolonged VR exposure.
At present, research on emotion in virtual environments relies largely on subjective materials, and there are very few studies based on objective physiological signals. In this article, the authors conducted a user experiment to study the emotional experience of virtual reality (VR) by comparing subjective feelings and physiological data in VR and two-dimensional (2D) display environments. First, they analyzed self-report questionnaires, including the Self-Assessment Manikin (SAM), the Positive and Negative Affect Schedule (PANAS), and the Simulator Sickness Questionnaire (SSQ). The results indicated that VR causes a higher level of arousal than 2D and more easily evokes positive emotions. Both 2D and VR environments are prone to causing eye fatigue, but VR is more likely to cause symptoms of dizziness and vertigo. Second, they compared electrocardiogram (ECG), skin temperature (SKT), and electrodermal activity (EDA) signals between the two conditions; statistical analysis showed significant differences in all three signals. Participants in the VR environment were more excited, with more frequent and more intense mood fluctuations. In addition, the authors trained different machine learning models for emotion detection and compared their accuracies on the VR and 2D datasets. All algorithms achieved higher accuracy in the VR environment than in 2D, corroborating that volunteers in VR produced more pronounced electrodermal signals and experienced a stronger sense of immersion. This article addresses gaps in existing work: the authors used objective physiological signals for experience evaluation and contrasted them with several types of subjective materials. They hope the study can provide helpful guidance for practical applications of virtual reality.
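The accuracy comparison described above can be illustrated with any classifier trained on physiological feature vectors. The sketch below uses a nearest-centroid classifier as a deliberately simple stand-in for the models compared in the study (the paper does not specify this classifier); the feature layout is an assumption for illustration:

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier on labeled feature
    vectors (e.g., rows of ECG/SKT/EDA-derived features).

    Each class is summarized by the mean of its training vectors;
    test samples are assigned to the closest centroid. Running this
    separately on a VR-derived and a 2D-derived dataset yields the
    kind of per-condition accuracy comparison the study reports.
    """
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Euclidean distance from every test sample to every centroid
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    preds = classes[np.argmin(dists, axis=1)]
    return float((preds == y_test).mean())
```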
Modern virtual reality (VR) headsets use lenses that distort the visual field, typically with distortion increasing with eccentricity. While content is pre-warped to counter this radial distortion, residual image distortions remain. Here we examine the extent to which such residual distortion impacts the perception of surface slant. In Experiment 1, we presented slanted surfaces in a head-mounted display and observers estimated the local surface slant at different locations. In Experiments 2 (slant estimation) and 3 (slant discrimination), we presented stimuli on a mirror stereoscope, which allowed us to more precisely control viewing and distortion parameters. Taken together, our results show that radial distortion has a significant impact on perceived surface attitude, even following correction. Of the distortion levels we tested, 5% distortion results in significantly underestimated and less precise slant estimates relative to distortion-free surfaces. In contrast, Experiment 3 reveals that a level of 1% distortion is insufficient to produce significant changes in slant perception. Our results highlight the importance of adequately modeling and correcting lens distortion to improve VR user experience.
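Percentage distortion levels like the 1% and 5% conditions above are commonly expressed through a one-parameter radial model. A minimal sketch, assuming the standard r' = r(1 + k1·r²) form in normalized image coordinates (the paper's exact distortion model is not specified here):

```python
def radial_distort(x, y, k1):
    """Apply a one-parameter radial distortion, r' = r (1 + k1 r^2),
    with (x, y) in normalized coordinates centered on the optical
    axis. Positive k1 gives pincushion, negative gives barrel."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2
    return x * scale, y * scale

def percent_distortion_at(r, k1):
    """Percent radial displacement of a point at radius r: the
    quantity usually quoted when a condition is described as
    '1%' or '5%' distortion at that eccentricity."""
    return 100.0 * k1 * r * r
```

Under this convention, k1 = 0.05 produces 5% displacement at unit radius, matching how such levels are typically reported.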
We analyzed the impact of common stereoscopic three-dimensional (S3D) depth distortion on S3D optic flow in virtual reality environments. The depth distortion is introduced by mismatches between the image acquisition and display parameters. The results show that such S3D distortions induce large S3D optic flow distortions and may even induce partial or full optic flow reversal within a certain depth range, depending on the viewer's moving speed and the magnitude of the S3D distortion. We hypothesize that S3D optic flow distortion may be a source of intra-sensory conflict that contributes to visually induced motion sickness.
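The depth distortion from acquisition/display mismatches acts through the standard geometry relating on-screen disparity to perceived depth. A minimal sketch of that relation, with an illustrative 63 mm interocular distance (not a value from the paper):

```python
def perceived_depth(disparity, viewing_distance, ipd=0.063):
    """Perceived depth of a point with on-screen disparity d (meters,
    uncrossed positive) viewed at distance V with interocular
    distance ipd: Z = ipd * V / (ipd - d).

    Zero disparity places the point at the screen; positive
    (uncrossed) disparity pushes it behind the screen. Mismatched
    camera/display parameters rescale disparities nonlinearly, which
    is what warps the optic flow field as the viewer moves.
    """
    return ipd * viewing_distance / (ipd - disparity)
```

Because Z depends hyperbolically on disparity, even a modest disparity rescaling compresses or expands depth unevenly across the scene, distorting the optic flow produced by self-motion.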
This paper describes a comparison of user experience across virtual reality (VR) image formats. The authors prepared the following four conditions and evaluated the user experience while viewing VR images with a headset, measuring subjective and objective indices: Condition 1, monoscopic 180-degree image; Condition 2, stereoscopic 180-degree image; Condition 3, monoscopic 360-degree image; Condition 4, stereoscopic 360-degree image. From the subjective indices (reality, presence, and depth sensation), Condition 4 was rated highest, and Conditions 2 and 3 were rated similarly. In addition, the objective indices (eye and head tracking) revealed a tendency to suppress head movement with 180-degree images.
The growing replication of human behavior by virtual agents and proxies in multi-user, collaborative virtual environments (CVEs) has spurred a surge of research and training applications. The user experience of training for emergency and catastrophic situations can be strongly shaped by the use of computer bots, avatars, and virtual agents. Our proposed collaborative virtual reality nightclub environment accordingly provides the flexibility to run multiple scenarios and evacuation drills for emergency and disaster preparedness. Modeling such an environment is essential because it emulates the emergencies we may experience in routine life and provides a learning platform for preparing for extreme events. The results of a user study measuring presence in the VE with the Presence Questionnaire (PQ) are discussed in detail; a consistent positive relation was found between presence and task performance in VEs. The results further suggest that most users feel this application could be a good tool for education and training purposes.
Developing an augmented reality (AR) system involves a multitude of interconnected algorithms, such as image fusion, camera synchronization and calibration, and brightness control, each with diverse parameters. This abundance of features, while beneficial for applicability to different tasks, burdens developers as they navigate different combinations and pick the most suitable configuration for their application. Additionally, the temporal inconsistency of the real world hinders the development of reproducible and reliable testing methods for AR systems. To help address these issues, we develop and test a virtual reality (VR) environment [1] that allows the simulation of variable AR configurations for image fusion. In this work, we improve our system with a more realistic AR glass model adhering to physical light and glass properties. Our implementation combines the incoming real-world background light and the AR projector light at the level of the AR glass.
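The combination of background and projector light at an optical-see-through combiner can be sketched as a simple radiometric sum; the transmittance and reflectance coefficients below are illustrative assumptions, not values from the described system:

```python
import numpy as np

def combine_at_glass(background, projector, transmittance=0.7, reflectance=0.2):
    """Radiometric combination at an optical-see-through AR glass.

    The eye receives the real-world background attenuated by the
    glass transmittance plus the projector image scaled by the
    combiner reflectance. Inputs are per-pixel (or per-channel)
    radiance values in linear light; coefficients are illustrative.
    """
    bg = np.asarray(background, dtype=float)
    pr = np.asarray(projector, dtype=float)
    return transmittance * bg + reflectance * pr
```

A consequence visible in this model is that projected content can only add light: dark virtual pixels cannot occlude a bright background, which is one reason brightness control matters in AR image fusion.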