Benefits of Having Captions in Video Content
By Danny Guzman, Accessibility Specialist
Captions provide a textual representation of dialogue and other important audio for people with hearing disabilities. They also make it possible to watch videos without sound, which matters on social networks that play videos muted by default and in busy, noisy environments (think of trying to watch a sports game in an airport).
So, what are captions? Captions provide synchronized text of the audio content, including non-speech elements such as noises. They may be open (always shown on screen) or closed (enabled or disabled through a media player). Captions are necessary for accessibility and must be synchronized with the multimedia.
Live captions can be created either by professional real-time captioners, also called Communication Access Realtime Translation (CART) providers, or by artificial intelligence (AI), that is, computer-generated captioning.
Live captions created by real-time captioners can be produced in person or remotely. That is, the person doing the captioning (CART) does not have to be at the same location as the live action and can create the captions by listening to the audio over a phone or Internet connection. Live captions that are later posted as a recording will most likely need minor editing for accuracy.
Live captions created by AI are usually offered by platforms such as YouTube, Zoom, and Adobe Connect. AI live captioning gives users instant access to captions. The drawback is accuracy: AI captions usually reach only about 90 to 95% accuracy, which is too low for users who have hearing disabilities when the content is posted as a pre-recorded audio or synchronized multimedia file.
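To make the accuracy figure concrete, the sketch below compares an automatic caption against a human-checked transcript and reports the share of words that are wrong. It uses word error rate (word-level edit distance divided by the number of reference words), a common speech recognition measure that this article does not itself name; the function name and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are compared case-insensitively and split on
    whitespace; punctuation is not stripped in this simple sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0

    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from caption
            insertion = curr[j - 1] + 1  # extra word in caption
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two of eight words were misheard.
spoken = "the meeting starts at ten in room four"
caption = "the meeting starts at tan in room for"
wer = word_error_rate(spoken, caption)
print(f"word error rate: {wer:.0%}, accuracy: {1 - wer:.0%}")
```

By the same arithmetic, 95% accuracy still means roughly one wrong word in every twenty, and in the example above both errors fall on the words that carry the time and place.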
Captions generated automatically by AI should be reviewed and edited for accuracy before final posting. Many times, the captions do not exactly match the spoken audio: words may be misunderstood, important audio may not be captured, and acronyms may be missed, all of which can lead to ineffective captions. Missing even a single word can change the meaning of the dialogue and therefore the accessibility of the media. Creating captions for accessibility also requires an understanding of which non-speech audio information should be included in the captions.
Transcribing an audio file can be challenging and time-consuming for those who lack the skills and expertise. Although the caption file structure is simple, adding timestamps for synchronization is time-consuming. Free software, such as YouTube Editor, can help produce a transcript with timestamps fairly seamlessly.
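As an illustration of how simple the caption file structure is, the sketch below writes a few timed cues in WebVTT, one widely used caption format. The article does not name a specific format, so WebVTT is an assumption here, and the function names and cue text are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical cues; non-speech audio is shown in square brackets.
cues = [
    (0.0, 2.5, "[upbeat music]"),
    (2.5, 6.0, "Welcome to today's session on captions."),
]
print(build_webvtt(cues))
```

Each cue is only a time range and a line of text; the time-consuming part is finding the right start and end times for every cue, which is where captioning tools help.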
To learn more about captions, visit the World Wide Web Consortium (W3C) captions page at: https://www.w3.org/WAI/media/av/captions/