The new capabilities of GPT-4 provide a “virtual volunteer” for the visually impaired

OpenAI has introduced its latest powerful AI model, GPT-4, to the world, and refreshingly, one of the first things it has put the new capabilities to work on is helping the visually impaired. Be My Eyes, which lets blind and partially sighted people ask sighted volunteers to describe what their phone's camera sees, is getting a “virtual volunteer” that offers AI-powered help at any time.
We have written about Be My Eyes many times since its launch in 2015, and of course the rise of computer vision and other tools that help visually impaired people navigate everyday life has figured prominently in its history. But the app itself can only do so much, and its core function has always been to connect users with a volunteer who can look through the phone’s camera and provide detailed descriptions or instructions.
The new version of the app is the first to integrate GPT-4’s multimodal capability, meaning it not only chats intelligently but also examines and understands the images it receives.
Users can send images through the app to an AI-powered virtual volunteer who will answer any question about the image and provide instant visual assistance for a variety of tasks.
For example, if a user sends a picture of the inside of their refrigerator, the Virtual Volunteer can not only correctly identify what’s inside, but also work out what can be made with those ingredients. It can then suggest a number of recipes using them and send step-by-step instructions on how to prepare them.
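To give a sense of how such an image-plus-question exchange might be wired up, here is a minimal sketch using OpenAI’s chat completions API with image input. The model name, prompt wording, and the helper function are illustrative assumptions; Be My Eyes has not published the details of its actual integration.

```python
import base64
from openai import OpenAI

# Hypothetical sketch: send a photo (e.g. the inside of a fridge) plus a
# question to a GPT-4-class multimodal model and read back its answer.
# The model name and prompt are assumptions, not Be My Eyes' real setup.
client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_about_image(image_path: str, question: str) -> str:
    # Encode the local photo as a base64 data URL so it can be sent inline.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",  # assumed multimodal model name
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content


# Example: the refrigerator scenario described above.
print(ask_about_image("fridge.jpg", "What ingredients do you see, and what could I cook with them?"))
```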
But the video accompanying the announcement is more illustrative. In it, Be My Eyes user Lucy demonstrates the app live on a bunch of tasks. If you’re not used to the rapid-fire pace of a screen reader, you might miss some of the dialogue, but she has the app describe what a dress looks like, identify a plant, read a map, translate a label, and tell her which buttons to press on a treadmill at the gym. (You can watch the video below.)
[Video: Be My Eyes Virtual Volunteer demo]
It shows very succinctly how unfriendly much of our urban and commercial infrastructure is to people with visual impairments. And it also shows how useful GPT-4’s multimodal chat can be under the right circumstances.
There’s no doubt that human volunteers will continue to be useful to Be My Eyes users – there’s no replacing them, only calling on them when they’re needed (and indeed they can be summoned right away if the AI’s response isn’t good enough).
As an example, at the gym the AI helpfully suggests that the machines available are “the ones without humans on them.” Zing! As OpenAI co-founder Sam Altman noted today, the capabilities tend to be more impressive on first use than after you spend more time with them, but perhaps we shouldn’t look this gift horse too closely in the mouth.
The Be My Eyes team is working closely with OpenAI and with its own community to define and manage the tool’s capabilities as development continues.
Currently, the feature is in closed beta for a “small portion” of Be My Eyes users, which will expand in the coming weeks. “We hope to make Virtual Volunteer widely available in the coming months,” the team writes. “Like our existing volunteering service, this tool is free for all members of the blind and partially sighted community using the Be My Eyes app.”
Considering how quickly ChatGPT was put to work powering enterprise SaaS platforms and other rather prosaic applications, it’s nice to see the new technology immediately being used to help people. You can read more about GPT-4 here.