Whether it’s broadcast television or video streamed over the top, viewers are familiar with closed captioning on their screens. This is because the FCC mandates on-screen captions to make the audio portions of over-the-air TV broadcasts accessible to the deaf and hard of hearing.
But what about radio shows? How would that even work? NPR Member Station WAMU 88.5 FM, in Washington, DC, has found a way to provide exceptionally accurate real-time captioning of its live radio programming, including local news and talk as well as its nationally syndicated show “1A,” and to display those captions on its website, WAMU.org.
Recognizing the value of serving deaf and hard of hearing listeners, WAMU’s technical team, led by Senior Director of Technology Rob Bertrand, developed a workflow that allows the ENCO enCaption automated real-time speech-to-text system to take the live radio signal, convert the spoken dialog into text, and output that caption data stream for web display.
Until now, enCaption, an advanced AI-driven, neural network speech-to-text engine, has mainly been used by video content creators, such as broadcasters, corporations, worship centers, and government agencies, to caption video accurately in near-real-time. WAMU’s installation shows it can be used effectively for radio as well.
While the workflow for radio is different from that of TV, it’s also much simpler. With a TV-centric workflow, enCaption can ingest its live audio signal by direct audio feeds, such as XLR or AES3, or by de-embedding from many video formats including:
- Live SD/HD-SDI
- HDMI
- HLS and RTSP streams
The output is then fed into a third-party CEA-608/708 caption encoder, which either embeds the caption data into the video for closed captions or burns the captions in on top of the video for open captions.
In the case of WAMU, enCaption receives a live audio signal, which can be analog XLR, AES3, and/or an AoIP format such as Livewire, WheatNet, AES67, or Dante, and outputs caption data that a customized middleware application translates for display on web pages. To tailor this final translation step to their operational needs, WAMU techs created the middleware software in-house. It captures the caption data from enCaption’s WebSocket over an IP connection and converts it into data that the station’s web hosting provider can feed into an iframe on the website, where HTML and CSS allow the developer to fully customize caption fonts.
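As a rough illustration of that translation step, the Python sketch below turns incoming caption messages into an HTML fragment that could be styled with CSS inside an iframe. It is not WAMU’s middleware, which the article does not describe in detail. The JSON message shape (`{"text": "..."}`), the `CaptionBuffer` class, and the CSS class names are all assumptions made for the example, and the WebSocket connection itself is left out.

```python
import html
import json
from collections import deque


class CaptionBuffer:
    """Keeps the most recent caption lines and renders them as HTML."""

    def __init__(self, max_lines: int = 3) -> None:
        # A bounded deque silently drops the oldest line when full.
        self.lines = deque(maxlen=max_lines)

    def add_message(self, raw: str) -> None:
        # Hypothetical payload shape: {"text": "..."}. The real caption
        # data format is not given in the article.
        payload = json.loads(raw)
        text = payload.get("text", "").strip()
        if text:  # ignore empty or whitespace-only captions
            self.lines.append(text)

    def render(self) -> str:
        # Escape caption text so spoken content cannot inject markup.
        items = "".join(
            f'<p class="caption-line">{html.escape(line)}</p>'
            for line in self.lines
        )
        return f'<div class="captions">{items}</div>'
```

In a setup like this, each message received from the caption source would be passed to `add_message`, and the result of `render` would be published to the page. Typeface, size, and color would then be controlled entirely in the page’s stylesheet.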
In the future, WAMU plans to use more of enCaption’s ability to generate highly accurate, text-searchable transcripts so people can read full-length interviews, as well as “sidecar files” that can be used for video editing, among other applications. To see how their innovative captioning workflow is “Making Radio Accessible with Captions,” watch a video recording of the November 4, 2021, webinar co-presented by ENCO Media Solutions Account Manager Bill Bennett and WAMU’s Rob Bertrand.