If you listened to a video without visuals, you wouldn’t get the complete experience. But that is the exact experience many blind and visually impaired people have when they follow a video tutorial or scroll through social media. Professor Sri Kurniawan and her colleagues at UC Santa Cruz hope to solve this problem by working with the local blind and visually impaired community to identify useful features for new programs that automatically generate audio descriptions.
Blind and visually impaired people use audio descriptions to hear this visual information, but developers often leave their voices out of the design of tools such as audio readers, which read out descriptions of visual material like photos or PDF files. This oversight leads engineers to waste time developing features that users do not find useful. Even worse, they may omit features that would drastically improve the experience for users. What’s more, human audio description writers, such as those employed by video production companies like Netflix, cannot keep up with the dizzying volume of new videos uploaded to social media every day.
The blind and visually impaired community has a few existing audio description programs to choose from. In November, speakers at an event at Lighthouse for the Blind and Visually Impaired, a local nonprofit organization, introduced their audience to several audio description programs that use artificial intelligence (AI) to describe documents and images in real time. Earlier forms of AI made it possible to build screen readers that describe PDF and image files.
The speakers at this event said Seeing AI is one of the most user-friendly and helpful audio description programs for the community, though they noted it still made mistakes during a demonstration at the talk, such as misreading one digit in the phone number on a business card.
Kurniawan and her collaborators at Arizona State University hope to use generative AI like ChatGPT to create audio descriptions for everything from educational and work explainer videos to entertainment like the latest cute cat video sweeping the internet. To ensure their program serves blind and visually impaired users well, they decided to start the development process by working with those communities.
To do this, graduate students Amina Kobenova and Rohan Jhangiani compiled 128 audio description guidelines from accessibility organizations and popular sites like Netflix. They then surveyed 23 members of the Bay Area blind and visually impaired community, asking them to rate the usefulness of each guideline on a scale of one to five. Because these guidelines were written for audio description writers, who are by necessity sighted, members of this community rarely hear about them. “These are the pieces of advice that other people use when designing content for you. What do you have to say about this?” said Jhangiani.
Some members of the blind and visually impaired community who participated in this project seem cautiously optimistic. Connie Jung, a survey participant who has low vision, said of the accessibility of any new audio description program, “It’s not going to be perfect,” adding that it will have some useful features and some tasks it struggles with.
The UCSC researchers want to center this community, following the principles of user-centered design. “(We want to) ground the solution in what they need,” said Kurniawan. Jung hopes that community input into Kurniawan’s program will improve the user experience since their feedback is included “as the program is being designed.”
The researchers plan to keep the blind and visually impaired community involved throughout the development process. “Data often flattens rich experiences,” said Jhangiani. These “rich experiences” of day-to-day audio description program use likely include insights that the team didn’t know to ask about in their surveys.
In the next phase of their research, the team will interview people who had strong opinions about the guidelines, both positive and negative, and incorporate their feedback into the program. After Kurniawan’s team and their collaborators build the program, they plan to include blind and visually impaired people in testing and refining it.