What Is Audio Description?
Imagine trying to enjoy a TV show, movie, or live performance, but not being able to see it. It would be challenging to gain a complete understanding of what is happening. You would most likely miss crucial information that is expressed visually through gestures, character actions, or scenery, rather than through audio. A single image can convey an enormous amount of detail. This scenario is precisely why audio description (AD) is such an important tool.
What Is Audio Description?
For individuals who are blind or have low vision, audio description is the key to revealing detailed information that sighted people consume without a thought.
Audio description (also referred to as “description” or “video description”) is defined as “the verbal depiction of key visual elements in media and live productions.” AD is meant to provide information on visual content that is considered essential to the comprehension of the program. In these cases, not providing AD would inhibit blind and low vision viewers from gaining a complete understanding of a given program or content.
Describing media involves interspersing short AD snippets within the program’s original audio. This gives the individual the benefit of the description without diminishing the information in the existing content.
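To illustrate how description snippets can be placed without talking over the original audio, the sketch below looks for stretches with no dialogue that are long enough to hold a description. The function name, the two-second minimum, and the sample timings are all hypothetical. This is a simplified sketch under those assumptions, not how any particular describer or tool works.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    start: float  # seconds from the start of the program
    end: float    # seconds from the start of the program

    @property
    def duration(self) -> float:
        return self.end - self.start


def find_description_gaps(dialogue, program_length, min_gap=2.0):
    """Return stretches with no dialogue that last at least min_gap seconds.

    dialogue: list of (start, end) tuples in seconds, in any order.
    min_gap: assumed minimum length worth describing in (hypothetical value).
    """
    gaps = []
    cursor = 0.0  # end of the latest dialogue seen so far
    for start, end in sorted(dialogue):
        if start - cursor >= min_gap:
            gaps.append(Gap(cursor, start))
        cursor = max(cursor, end)
    # Check the stretch between the last line of dialogue and the end.
    if program_length - cursor >= min_gap:
        gaps.append(Gap(cursor, program_length))
    return gaps


# Hypothetical dialogue timings for a 30-second clip.
dialogue = [(0.0, 4.0), (9.5, 15.0), (15.5, 20.0)]
gaps = find_description_gaps(dialogue, program_length=30.0)
for gap in gaps:
    print(f"{gap.start:.1f}s to {gap.end:.1f}s ({gap.duration:.1f}s available)")
```

In practice, a describer also weighs music, sound effects, and which visuals matter most, so silence alone is only a starting point.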
How Do You Create Audio Description?
Audio description can be created using several different approaches, each with its own pros and cons. The two distinct steps of creating audio description are writing the script and voicing the description.
Writing the Script
- Human-Written Descriptions – A trained describer writes the AD script manually to ensure accuracy and clarity.
- AI-Generated Descriptions – Automated tools create AD scripts; however, quality can vary.
- Hybrid Approach: AI + Human Review – A hybrid model balances efficiency and accuracy by using AI-generated descriptions that are edited by humans.
Voicing Audio Description (Output Format)
- Human-Voiced AD – A professional voice artist records the descriptions.
- Synthesized Speech AD – Text-to-speech software generates a synthetic voice to read the AD script.
Regardless of the method used, it’s essential to follow best practices and the audio description standards outlined in the Described and Captioned Media Program (DCMP) Description Key.
3Play Media’s AI Audio Description
3Play Media’s AI-Enabled Audio Description solution leverages advanced AI to both script and voice descriptions. This process provides a scalable and cost-effective way to make content more accessible. Unlike traditional methods that break videos into sections, 3Play’s process ensures the AI analyzes the entire video holistically.
Our patented solution was developed with deep involvement from 3Play Media’s expert human describers. This means that even the AI-generated descriptions are crafted with the best practices for high-quality, nuanced description in order to maintain the accuracy and richness of human-created content. Users can choose to edit the AI-generated script within the platform or upgrade to human review for additional quality assurance.
How Do You Publish AD?
Publishing audio description can present technical challenges, but there are multiple solutions to ensure content is accessible:
- User-Selectable Audio Track with Description – Most devices can’t merge multiple soundtracks, so this method lets users replace the original audio with a version that includes audio description. On platforms that support multiple tracks, a separate description-only track can be used.
- Pre-Mixed Versions – The AD is integrated directly into the main audio track for seamless playback.
- Pre-Mixed Extended AD Versions – Additional pauses allow for more detailed descriptions in complex scenes.
- Static Text Alternative – This method is considered an alternative to audio descriptions, and is best used for media that doesn’t have important time-based information in the original video portion of the media.
- WebVTT Description Track – A text-based alternative that screen readers can process.
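As a rough illustration of the WebVTT option, the sketch below turns a list of timed descriptions into the text of a WebVTT file. The function names and the sample cue text are hypothetical and exist only for this example. The file layout itself (a `WEBVTT` header followed by timed cues) follows the WebVTT format.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_description_vtt(cues) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # header, then a blank line before the first cue
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical descriptions, invented for illustration only.
cues = [
    (4.0, 9.5, "A woman in a red coat walks into an empty train station."),
    (20.0, 26.0, "She checks the departure board, then sits on a bench."),
]
vtt = build_description_vtt(cues)
print(vtt)
```

A file like this can be attached to an HTML video with a `<track kind="descriptions">` element. Support for reading description tracks aloud varies across browsers, media players, and screen readers, so it is worth testing on the platforms your audience uses.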
Each method has its own technical requirements depending on streaming platforms, media players, and compliance regulations. Conforming to WCAG 2.1 and complying with accessibility laws is crucial when implementing AD solutions.
The World Wide Web Consortium (W3C) lists several sufficient techniques for adding description to audio-visual material. Each of these methods is a reliable way to meet the WCAG success criteria for audio description, such as Success Criterion 1.2.5 Audio Description (Prerecorded).
This blog post was originally published in 2017 and has since been updated for accuracy and clarity.