
Human gestures in the real world are complex, ranging from sign language to full-body motion to extremely dynamic poses such as crawling and dancing. This study examines a set of multimodal sensor fusion methods that support real-time operation and training without the need for wearable equipment. We characterize the gesture tracking sensor modalities LiDAR, webcam, and inertial measurement unit (IMU) by gesture tracking type, accuracy, detection latency, distance, and key point requirements for complex gesture recognition. We applied the methodology to gait detection and tracking at high altitude, sign language detection, and background noise removal in a crewed space. Our experiments show that the usability of multimodal interfaces can be tested in a simulated environment and measured objectively with instruments.
Hsu-Wei Chen, Yang Cai, "Tracking Complex Gestures with Multimodal Sensors" in Electronic Imaging, 2026, pp 280-1 - 280-1, https://doi.org/10.2352/EI.2026.38.6.ISS-280