Proceedings Paper
Volume: 37 | Article ID: AVM-105
Enhancing Robotic Navigation with Large Language Models
DOI: 10.2352/EI.2025.37.15.AVM-105 | Published Online: February 2025
Abstract

Robotics has traditionally relied on a multitude of sensors and extensive programming to interpret and navigate environments. However, these systems often struggle in dynamic and unpredictable settings. In this work, we explore the integration of large language models (LLMs) such as GPT-4 into robotic navigation systems to enhance decision-making and adaptability in complex environments. Unlike many existing robotics frameworks, our approach uniquely leverages the advanced natural language and image processing capabilities of LLMs to enable robust navigation using only a single camera and an ultrasonic sensor, eliminating the need for multiple specialized sensors and extensive pre-programmed responses. By bridging the gap between perception and planning, this framework introduces a novel approach to robotic navigation. It aims to create more intelligent and flexible robotic systems capable of handling a broader range of tasks and environments, representing a major leap in autonomy and versatility for robotics. Experimental evaluations demonstrate promising improvements in the robot’s effectiveness and efficiency across object recognition, motion planning, obstacle manipulation, and environmental adaptability, highlighting its potential for more advanced applications. Future developments will focus on enabling LLMs to autonomously generate motion profiles and executable code for tasks based on verbal instructions, allowing these actions to be carried out without human intervention. This advancement will further enhance the robot’s ability to perform specific actions independently, improving both its autonomy and operational efficiency.

  Cite this article 

Xunyu Pan, Jeremy Perando, "Enhancing Robotic Navigation with Large Language Models," in Electronic Imaging, 2025, pp. 105-1 - 105-7, https://doi.org/10.2352/EI.2025.37.15.AVM-105

  Copyright statement 
Copyright © 2025, Society for Imaging Science and Technology
Electronic Imaging
2470-1173
Society for Imaging Science and Technology
IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA