
Here are five unique ways of using ChatGPT vision

It is basically GPT-4 but with vision.
Sejal Sharma | Oct 05, 2023 09:36 AM EST
Representational image (Laurence Dutton/iStock)

On September 25, OpenAI gave ChatGPT the ability to see, hear, and speak, making it a truly multimodal system. The vision capability comes from GPT-4V, which is essentially GPT-4 with the ability to interpret images.

The feature lets users upload an image of almost anything under the sun and ask GPT-4V to analyze it. It is currently available only to ChatGPT Plus subscribers, who are posting on social media about how they are using the upgrade.

We have rounded up five of the most interesting ways people are using GPT-4V.

1. Making education accessible

Touting it as ‘the future of education,’ an X user posted a video of ChatGPT deciphering a diagram of a human cell from a ninth-grade textbook. The model picked up the textual cues in the labeled diagram and explained the function of each component of the cell, point by point.

As a visual tutor, ChatGPT can also help teachers plan lessons, activities, and homework. It can help students do that homework too, although the current education system may not welcome that.

2. Breaking down complex workflow processes

Another user uploaded a complex chart laying out the steps involved in executing a completion with OpenAI. In its response, ChatGPT breaks down the complicated-looking diagram and explains exactly what is happening at each stage. It could likely do the same for dense material such as Pentagon slides and PowerPoint templates, breaking them down and summarizing the information they contain.

Another X user posted an image showing ChatGPT arriving at the correct answer for an extremely hard-to-parse picture.

3. Identifying animals, birds, and other objects

The chatbot can also recognize birds and animals. A user on X took a picture of a bird and asked ChatGPT to identify it and write a JSON file for it. The chatbot identified the bird as an ibis and produced JSON describing it.

However, the chatbot could not identify the geolocation of a person.
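
To picture what "writing a JSON file" for an identified bird might involve, here is a minimal Python sketch. The field names and the helper `make_bird_record` are hypothetical and chosen only for illustration; the original post's actual JSON is not reproduced here, and nothing below calls ChatGPT itself.

```python
import json


def make_bird_record(common_name, note):
    """Build a small, hypothetical record for an identified bird.

    The keys below are illustrative assumptions, not the structure
    ChatGPT actually returned in the post described above.
    """
    return {
        "common_name": common_name,
        "category": "bird",
        "note": note,
    }


# "ibis" is the identification reported in the article; the note is made up.
record = make_bird_record("ibis", "identified from a single user photo")

# Serialize the record to a JSON string, as one might before saving it to a file.
bird_json = json.dumps(record, indent=2)
print(bird_json)
```

The point is only that a free-form identification ("this is an ibis") can be turned into structured data that other programs can read.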

An X user took a photo of a cloth hanging on their wall and asked ChatGPT to describe it.

People are using the Vision feature to identify objects, some of them obscure, and to get detailed explanations of them.

4. Writing code

ChatGPT could already write code. With Vision, it can write code from nothing more than a picture, lowering the barrier between idea and execution.

5. Identifying memes and popular culture

For many users, ChatGPT could not identify a movie from a still image, suggesting that it has not been trained on movies and TV series. That may be just as well, since it could otherwise have meant another copyright infringement case for the chatbot. It could, however, identify movie scenes that have become part of popular culture.

GPT-4V refuses to answer questions about hate symbols and extremist content in some instances, but not all. Its behavior can be inconsistent and, at times, contextually inappropriate. For instance, it knows the historic meaning of the Templar Cross but misses its modern meaning in the US, where it has been appropriated by hate groups.

To mitigate hate, the model refuses requests to identify a person's race, age, or identity, as well as requests that call for ungrounded inferences.

