DeepMind’s AI Can Create 3D Scenes From Flat 2D Images

The newly developed computer vision algorithm replicates the way the human brain learns from its surroundings and generates 3D models of a scene from 2D snapshots.

DeepMind, a UK-based sister company of Google, recently created an AI that can reconstruct full 3D scenes after observing them only in 2D images.

The primary goal of DeepMind is to teach machines to learn the way humans do. Humans observe their environment, categorize what they see, and make assumptions about the parts they cannot see.

Nobody sees the world in pixels. For example, when we look at someone’s chest, we naturally assume they have a back, even though we cannot see it from our perspective.

Another example is playing peek-a-boo with a baby. The baby still knows that your face exists even when you cover it completely with your hands.

This is precisely the foundation on which the DeepMind team builds its machines. Its latest AI was trained to guess what things look like from angles that it has not yet seen.

For this research, DeepMind’s scientists designed a Generative Query Network (GQN), a neural network that learns to imagine how a scene of diverse objects would appear from another perspective.

Here is how it works: the AI observes several 2D pictures of a scene and then attempts to recreate that scene.

Notably, the AI does not use any prior knowledge or human-labeled input. It observes just three flat 2D images and then predicts what the 3D version of the same scene would look like.

To picture this, imagine taking a photo of a cube and then asking DeepMind’s AI to show the same cube from different perspectives and angles. Doing so changes things like shadows and lighting, as well as the direction of every line that makes up the cube.
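
To make the cube example concrete, the short sketch below shows how the 2D positions of a cube's corners shift when the viewing angle changes. It is ordinary hand-written perspective geometry, not DeepMind's method: the GQN has to learn this kind of behaviour from images alone. The function name `project_cube` and its parameters are assumptions made for illustration, and shadows and lighting are left out.

```python
import math

def project_cube(yaw_degrees, camera_distance=4.0, focal_length=1.0):
    """Project the 8 corners of a unit cube onto a 2D image plane.

    The cube is rotated by `yaw_degrees` about the vertical axis, which is
    equivalent to moving the camera around it. All names and default values
    here are illustrative assumptions.
    """
    yaw = math.radians(yaw_degrees)
    corners = [(x, y, z)
               for x in (-0.5, 0.5)
               for y in (-0.5, 0.5)
               for z in (-0.5, 0.5)]
    points = []
    for x, y, z in corners:
        # Rotate the corner about the vertical (y) axis.
        x_rot = x * math.cos(yaw) + z * math.sin(yaw)
        z_rot = -x * math.sin(yaw) + z * math.cos(yaw)
        # Perspective divide: corners farther from the camera land closer
        # to the centre of the image.
        depth = z_rot + camera_distance
        points.append((focal_length * x_rot / depth,
                       focal_length * y / depth))
    return points

front = project_cube(0)    # cube seen head-on
turned = project_cube(30)  # same cube after a 30-degree turn
```

Comparing `front` and `turned` corner by corner shows that every edge of the cube is drawn in a different place after the turn, which is the change the AI has to predict without being given any geometry rules.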

To do this, the AI uses the Generative Query Network to imagine the angles of the cube that it has not actually observed, so that the requested image can be rendered.
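
The observe-then-imagine flow described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not DeepMind's implementation: the layer sizes, the function names (`make_layer`, `apply_layer`, `represent`, `generate`), the random untrained weights, and the choice to combine observations by summing their codes are all placeholders chosen to keep the example small.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def apply_layer(weights, vector):
    """A single dense layer with a tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, vector)))
            for row in weights]

def represent(observations, encoder):
    """Encode each (image, viewpoint) pair and sum the codes.

    Summing makes the scene vector independent of the order in which
    the pictures were observed.
    """
    scene = [0.0] * len(encoder)
    for image, viewpoint in observations:
        code = apply_layer(encoder, image + viewpoint)
        scene = [s + c for s, c in zip(scene, code)]
    return scene

def generate(scene, query_viewpoint, decoder):
    """Predict the pixels for a viewpoint that was never observed."""
    return apply_layer(decoder, scene + query_viewpoint)

rng = random.Random(0)
IMAGE_SIZE, VIEW_SIZE, SCENE_SIZE = 16, 3, 8  # toy sizes (assumption)
encoder = make_layer(IMAGE_SIZE + VIEW_SIZE, SCENE_SIZE, rng)
decoder = make_layer(SCENE_SIZE + VIEW_SIZE, IMAGE_SIZE, rng)

# Three flattened "images", each paired with the viewpoint it was taken from.
observations = [([rng.random() for _ in range(IMAGE_SIZE)],
                 [rng.random() for _ in range(VIEW_SIZE)])
                for _ in range(3)]

scene = represent(observations, encoder)
prediction = generate(scene, [0.5, 0.5, 0.0], decoder)
```

With untrained weights the `prediction` is meaningless noise. The point is only the shape of the computation: several pictures go in, one compact scene description comes out, and that description plus a new viewpoint is enough to request an image.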

The impact of this innovation could be revolutionary. The researchers are now working towards “fully unsupervised scene understanding.”

However, the AI has not yet been trained to perceive images of the real world. Therefore, the next step in the journey would have to be rendering realistic scenes from 2D images.


The GQN-based AI from Google’s sister company could eventually generate on-demand 3D scenes using just photographs, and these scenes are expected to closely resemble the real world.

As mentioned by the researchers of DeepMind, “Much like infants and animals, the GQN learns by trying to make sense of its observations of the world around it. In doing so, the GQN learns about plausible scenes and their geometrical properties, without any human labeling of the contents of scenes.”

Via: DeepMind