OpenAI, a non-profit artificial intelligence research company, has built a robotics system that learns a new task after seeing it done just once. The system was trained entirely in simulation and then deployed on a physical robot. The company's mission is to build safe artificial general intelligence and ensure that its benefits are widely distributed.
[Image Source: OpenAI]
The robotics system
A vision network and an imitation network power the system, which lets the robot learn a new behavior from a demonstration performed in a simulator and then replicate that behavior in different setups in the real world. The vision network ingests an image from the robot's camera and outputs a state representing the positions of the objects. Following the research group's previous work, the vision network is trained on hundreds of thousands of simulated images with varied lighting, textures, and objects. It is never trained on a real image.
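The idea of training only on randomized simulated images can be sketched roughly as follows. This is a minimal illustration, not OpenAI's pipeline: the names `sample_scene` and `make_dataset`, the texture list, and every numeric range are assumptions made up for the example, and a real setup would pass each scene description to a renderer to produce the image.

```python
import random

# Hypothetical texture names; the article does not list the real ones.
TEXTURES = ["wood", "checker", "noise", "flat"]


def sample_scene(rng, n_blocks=3):
    """Draw one randomized scene description for a synthetic training image."""
    return {
        # Lighting is perturbed so the network cannot rely on one appearance.
        "light_intensity": rng.uniform(0.2, 1.5),
        "light_direction": [rng.uniform(-1.0, 1.0) for _ in range(3)],
        "table_texture": rng.choice(TEXTURES),
        # The simulator knows the true positions, so they serve as labels.
        "block_positions": [
            (rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0))
            for _ in range(n_blocks)
        ],
    }


def make_dataset(n_images, seed=0):
    """Build a list of randomized scenes; a renderer would turn each into an image."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n_images)]
```

Because the simulator supplies exact object positions for every scene, no hand labeling is needed, which is what makes hundreds of thousands of training images practical.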
The imitation network works exactly as its name suggests. It observes a demonstration, processes it to infer the intent of the task, and then accomplishes that intent from a different starting configuration. In other words, the network must generalize the demonstration to a new setting.
[Image Source: OpenAI]
For each task, thousands of demonstrations are used to train the imitation network. Each training example is a pair of demonstrations that perform the same task. The network is given the entire first demonstration and a single observation from the second. Supervised learning is then used to predict the action the demonstrator took at that observation. To predict that action well, the robot must learn to infer the relevant part of the task from the first demonstration.
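The way a training example is assembled from a pair of demonstrations might look like the sketch below. It assumes each demonstration is a list of (observation, action) tuples; the function name `make_training_example` and the dictionary keys are hypothetical, chosen only for illustration.

```python
import random


def make_training_example(demo_a, demo_b, rng):
    """Build one supervised example from two demonstrations of the same task.

    demo_a, demo_b: lists of (observation, action) tuples (assumed format).
    """
    # Pick a single time step from the second demonstration.
    t = rng.randrange(len(demo_b))
    observation, action = demo_b[t]
    return {
        "context_demo": demo_a,            # the whole first demonstration
        "query_observation": observation,  # one observation from the second
        "target_action": action,           # label for supervised learning
    }
```

A network trained on such examples receives `context_demo` and `query_observation` as input and is penalized for deviating from `target_action`, so the only way to do well is to extract the task from the first demonstration.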
Building on this work, the team developed a new algorithm called one-shot imitation learning. It lets a human communicate a new task to the robot by performing it in virtual reality. From a single demonstration, the robotics system can then solve the same task starting from an arbitrary configuration.
Block stacking task
Equipped with the vision and imitation networks, the robot was put to the test on a block stacking routine that a human demonstrated in virtual reality. The team supplied training data consisting of pairs of trajectories that stack blocks into the same set of towers in the same order but begin from different start states.
For the imitation network to learn a robust policy, the team had to inject a modest amount of noise into the outputs of the scripted policy that generated the training demonstrations. The noise forced the scripted policy to demonstrate how to recover when something goes wrong. The result was an imitation network that can cope with the anomalies and disturbances an imperfect scripted policy produces.
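Noise injection into a scripted policy can be illustrated with a toy one-dimensional controller. Everything here is an assumption for the sake of the example: the controller, the step cap of 0.1, and the Gaussian noise model are invented, and the article does not say whether the recorded label was the clean or the noisy action (this sketch records the clean one).

```python
import random


def scripted_policy(position, target):
    """Hypothetical scripted controller: step toward the target, capped at 0.1."""
    error = target - position
    return max(-0.1, min(0.1, error))


def collect_demo(start, target, noise_std, steps, rng):
    """Roll out the scripted policy while perturbing the executed action."""
    position = start
    trajectory = []
    for _ in range(steps):
        clean = scripted_policy(position, target)
        # Execute a noisy version of the action so the rollout drifts off course.
        noisy = clean + rng.gauss(0.0, noise_std)
        # Record the state actually visited together with the clean action,
        # so off-course states are paired with corrective actions.
        trajectory.append((position, clean))
        position += noisy
    return trajectory
```

With `noise_std=0.0` the demonstration only ever visits states on the ideal path. With a small positive value it also visits nearby off-path states, each labeled with the action that steers back toward the target, which is the kind of recovery behavior the learner needs to see.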
If you're interested in being part of this robotics project, you can join OpenAI at its headquarters in San Francisco.