In virtual and reality fusion interaction, accurately estimating the poses of real-world objects and mapping them to their virtual counterparts is crucial for the overall interaction experience. This paper studies the pose estimation of real-world targets in this fusion context. To achieve precise pose estimation from single-view RGB images captured by basic devices, a high-resolution heatmap regression method is proposed that balances accuracy against complexity. To address the under-utilization of semantic information in feature maps during heatmap regression, a lightweight content-aware upsampling method is introduced. In addition, to mitigate the loss of resolution and accuracy caused by quantization errors when keypoints predicted on the heatmap are used for pose calculation, a keypoint optimization module based on Gaussian dimensionality reduction and a pose estimation strategy based on high-confidence keypoints are presented. Quantitative experiments show that the method outperforms comparative algorithms on the LINEMOD dataset, reaching an accuracy of 85.7% under the average distance metric. Qualitative experiments further show that precise real-to-virtual pose estimation and mapping are achieved in interactive scene applications.
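The abstract describes a keypoint-to-pose pipeline in which per-keypoint heatmaps are refined to sub-pixel locations and only high-confidence keypoints are used to recover the 6D pose. The sketch below illustrates that general flow under stated assumptions: the refinement here is a simple confidence-weighted local average standing in for the paper's Gaussian dimensionality-reduction module, and the names (`refine_keypoint`, `pose_from_heatmaps`, `CONF_THRESH`, `HEATMAP_STRIDE`) as well as the use of OpenCV's EPnP solver are illustrative choices, not details taken from the paper.

```python
# Hypothetical sketch of heatmap keypoint refinement + high-confidence PnP.
# The constants and helper names below are assumptions for illustration only.
import numpy as np
import cv2

CONF_THRESH = 0.5   # assumed cutoff defining a "high-confidence" keypoint
HEATMAP_STRIDE = 4  # assumed ratio between input image and heatmap resolution


def refine_keypoint(heatmap):
    """Sub-pixel peak as a confidence-weighted average in a 3x3 window around
    the argmax; a simple stand-in for the paper's keypoint optimization."""
    h, w = heatmap.shape
    y0, x0 = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    ys = slice(max(y0 - 1, 0), min(y0 + 2, h))
    xs = slice(max(x0 - 1, 0), min(x0 + 2, w))
    win = heatmap[ys, xs]
    gy, gx = np.mgrid[ys, xs]
    wsum = win.sum() + 1e-8
    xy = np.array([(gx * win).sum() / wsum, (gy * win).sum() / wsum])
    return xy, float(heatmap[y0, x0])


def pose_from_heatmaps(heatmaps, model_points_3d, K):
    """heatmaps: (N, H, W) per-keypoint heatmaps; model_points_3d: (N, 3)
    object-frame keypoints; K: 3x3 camera intrinsics. Returns (R, t) or None."""
    pts_2d, pts_3d = [], []
    for hm, p3d in zip(heatmaps, model_points_3d):
        uv, conf = refine_keypoint(hm)
        if conf >= CONF_THRESH:                  # keep only confident keypoints
            pts_2d.append(uv * HEATMAP_STRIDE)   # map back to image resolution
            pts_3d.append(p3d)
    if len(pts_2d) < 4:
        return None                              # PnP needs >= 4 correspondences
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(pts_3d, dtype=np.float64),
        np.asarray(pts_2d, dtype=np.float64),
        K, None, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)                   # rotation vector -> matrix
    return R, tvec
```

This is only a minimal reading of the strategy the abstract names; the published method's upsampling, Gaussian reduction, and solver choices may differ.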
Xinyan Yang, Ying Ma, Feiran Sun, Yinghao Yang, and Long Ye, "6D Pose Estimation based on High-Resolution Heatmap Regression for Virtual and Reality Fusion Interaction," Journal of Imaging Science and Technology, 2025, pp. 1–10, https://doi.org/10.2352/J.ImagingSci.Technol.2025.69.1.010414