Microsoft trains ChatGPT to control robots

The language model could command robot arms, drones, and home assistant robots.
Jijo Malayil
AI, Machine learning


Imagine talking directly to a robot and having it carry out tasks for you. To that end, Microsoft has outlined plans to partner with OpenAI to extend ChatGPT's capabilities to robot control. The software giant used the chatbot and "controlled multiple platforms such as robot arms, drones, and home assistant robots intuitively with language," the company wrote in a blog post.

Robots still rely heavily on hand-written code to perform their tasks, while humans find spoken language the most intuitive way to communicate. Microsoft has worked to close this gap and "make natural human-robot interactions possible using OpenAI's new AI language model, ChatGPT."

How can ChatGPT help in this regard? 

The team plans to build on the platform's ability to generate coherent, grammatically correct responses to a wide range of prompts and questions, and to test whether ChatGPT can think beyond text and reason about the physical world to help with robotics tasks. "We want to help people interact with robots more easily, without needing to learn complex programming languages or details about robotic systems," the company said.

The key obstacle for an AI language model is solving problems while accounting for the laws of physics, the context of the operating environment, and how the robot's physical actions can change the state of the world. Even though ChatGPT can do a lot on its own, it still needs some help. Microsoft has released a series of design principles, including special prompting structures, high-level APIs, and human feedback via text. These principles can be used to guide language models toward solving robotics tasks.
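To make the idea of a high-level API plus a structured prompt more concrete, here is a minimal Python sketch. It is not Microsoft's code: the `MockDrone` class, the function names `takeoff`, `fly_to`, and `land`, and the prompt wording are all assumptions made for illustration, and no real language model is called.

```python
from dataclasses import dataclass, field


@dataclass
class MockDrone:
    """Hypothetical stand-in for a robot; it only records the commands it gets."""
    position: tuple = (0.0, 0.0, 0.0)
    log: list = field(default_factory=list)

    def takeoff(self, altitude: float) -> None:
        x, y, _ = self.position
        self.position = (x, y, altitude)
        self.log.append(f"takeoff({altitude})")

    def fly_to(self, x: float, y: float, z: float) -> None:
        self.position = (x, y, z)
        self.log.append(f"fly_to({x}, {y}, {z})")

    def land(self) -> None:
        x, y, _ = self.position
        self.position = (x, y, 0.0)
        self.log.append("land()")


# Assumed function library that the prompt describes to the language model.
API_DOCS = {
    "takeoff(altitude)": "climb to the given altitude in meters",
    "fly_to(x, y, z)": "fly to a point given in meters",
    "land()": "descend to the ground",
}


def build_prompt(task: str) -> str:
    """Assemble a prompt that lists the allowed functions and states the task."""
    lines = ["You control a drone. Use only these functions:"]
    lines += [f"- {signature}: {doc}" for signature, doc in API_DOCS.items()]
    lines.append("Ask a clarifying question if the task is unclear.")
    lines.append(f"Task: {task}")
    return "\n".join(lines)


def run_reviewed_code(code: str, drone: MockDrone) -> None:
    """Run model-written code only after a human has read and approved it.

    Only the three API functions are exposed by name. This is not a
    security sandbox; it just keeps the sketch focused on the API.
    """
    allowed = {"takeoff": drone.takeoff, "fly_to": drone.fly_to, "land": drone.land}
    exec(code, {"__builtins__": {}}, allowed)
```

In this sketch, a person would send the output of `build_prompt` to the model, read the code that comes back, and pass it to `run_reviewed_code` only if it looks safe, which mirrors the human-feedback step in the design principles.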

The firm is also introducing PromptCraft, an open-source platform where anyone can "share examples of prompting strategies for different robotics categories."

Using these design principles, researchers could fine-tune and utilize ChatGPT's knowledge to control different robot form factors for various tasks. The team could use the language model to solve "robotics puzzles, along with complex robot deployments in the manipulation, aerial, and navigation domains."

Various instances where the model worked

The team was able to use the system to allow ChatGPT to control a drone. According to Microsoft, ChatGPT asked follow-up questions when the commands were unclear and "wrote complex code structures for the drone such as a zig-zag pattern to inspect shelves visually. It even figured out how to take a selfie."
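As a rough illustration of what a zig-zag inspection pattern involves, the Python sketch below generates waypoints that sweep across a shelf row by row, reversing direction on each row. The function name `zigzag_waypoints` and its parameters are hypothetical, and this is not the code ChatGPT produced in Microsoft's demonstration.

```python
def zigzag_waypoints(width, bottom, top, rows):
    """Return (x, z) waypoints that sweep a shelf face in a zig-zag.

    width:  horizontal extent of the shelf in meters
    bottom: height of the lowest sweep in meters
    top:    height of the highest sweep in meters
    rows:   number of horizontal sweeps (at least 2)
    """
    if rows < 2:
        raise ValueError("rows must be at least 2")
    step = (top - bottom) / (rows - 1)
    waypoints = []
    for i in range(rows):
        z = bottom + i * step
        # Even rows go left to right, odd rows go right to left.
        xs = (0.0, width) if i % 2 == 0 else (width, 0.0)
        waypoints.extend((x, z) for x in xs)
    return waypoints
```

Each pair of waypoints marks the start and end of one horizontal sweep, so a drone that visits them in order traces the zig-zag.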

The model also performed a simulated industrial inspection exercise with the Microsoft AirSim simulator. "The model was able to effectively parse the user’s high-level intent and geometrical cues to control the drone accurately."

The model showed the ability to bridge textual and physical domains when tasked with building the Microsoft logo out of wooden blocks.

ChatGPT could also write an algorithm for a drone to reach a goal in space while not crashing into obstacles.
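The article does not say which algorithm ChatGPT wrote for this task. As one simple stand-in, the Python sketch below uses breadth-first search on a 2D occupancy grid to find a shortest obstacle-free route to a goal. The grid representation and the name `plan_path` are assumptions for illustration only.

```python
from collections import deque


def plan_path(grid, start, goal):
    """Breadth-first search over a grid where 0 is free and 1 is an obstacle.

    Returns a list of (row, col) cells from start to goal,
    or None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}  # maps each visited cell to its predecessor
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None
```

A real drone moves in continuous 3D space and has to account for its own dynamics, so a grid search like this only conveys the basic idea of routing around obstacles.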

Microsoft has, however, cautioned users that such methods need thorough analysis before they are used in everyday settings. "We encourage users to harness the power of simulations to evaluate these algorithms before potential real-life deployments and to always take the necessary safety precautions."

Add Interesting Engineering to your Google News feed.