
By enabling precise perception of dynamic and complex environments, point cloud semantic segmentation has become a critical task for autonomous vehicles in recent years. However, in complex, dynamic scenes, cumulative errors pose significant challenges for existing semantic segmentation methods, limiting their accuracy and efficiency, particularly in safety-critical applications. To address these issues, this paper introduces a novel framework that balances accuracy and computational efficiency by leveraging temporal alignment. The framework captures inter-frame correlations, enhances local detail, reduces error accumulation, and preserves fine-grained scene features. Furthermore, by integrating LiDAR and camera data through multi-modal fusion, the framework provides complementary perspectives, significantly improving segmentation performance and robustness in dynamic environments. The method achieves competitive performance on the SemanticKITTI and nuScenes benchmarks, demonstrating its ability to detect occluded objects and ensure reliable perception in safety-critical scenarios. The proposed framework thus offers a promising path toward more robust and reliable autonomous driving systems in complex environments.
Shuyi Tan, Chao Huang, Yi Zhang, and Yang Wang, "Efficient and Robust Semantic Segmentation Method based on Multi-Scale Feature Fusion and Temporal Alignment," Journal of Imaging Science and Technology, 2026, pp. 1–11, https://doi.org/10.2352/J.ImagingSci.Technol.2026.70.4.040501