Watching a video on a mobile phone.

Captions on videos benefit everyone. They are essential for those who are d/Deaf or hard of hearing and increase audience comprehension – particularly for those listening to content in a second language. Many of us also rely on captions in situations when using audio would disturb others, or on those occasions when we’d prefer to read content rather than watch it.

At the University of Southampton, we are familiar with automated captions in recording tools such as Stream and Panopto. Yet, while the quality of automated captions has improved, they are rarely 100% accurate.

This article provides advice on how to correct captions, and how to create transcripts for our videos.

Correcting captions in Stream or Panopto

Both Stream and Panopto provide tools for correcting captions:

This built-in functionality is simple and allows any video owner to correct inaccuracies in captions. If we want more control over the timing and other settings, there are other tools we can use. SubtitleEdit is now available from Additional Software on University-managed PCs thanks to Ash Bennette and the Software and Desktop Services Team. This tool is ideal for power users who want to convert between subtitle formats or adjust the timing and spacing of captions.

Two lines of captions from a recording. The mouse pointer hovers over one caption revealing an Edit button.
Editing captions in Stream.
Three lines of captions from a recording. The "more" button is selected beside one caption revealing a menu with options Edit and Delete. The mouse pointer hovers over the Edit button.
Editing captions in Panopto.
Editing captions in SubtitleEdit. Two captions are highlighted due to their short duration, which breaks the built-in guidance for the amount of time to show a caption.

Creating transcripts

While caption files are usually text, they tend to include timing information between the lines of captions. This makes for a less fluent reading experience. To create a transcript that we would want to read, we first need to remove that timing information. Patrick Robin in Automation Services made a version of Microsoft’s VTT Cleaner, hosted in our SharePoint environment. VTT is the caption format used by Microsoft Stream. Having downloaded the VTT file from our Stream recording, we can use this tool to remove timing information from our captions, giving us the content ready to turn into a transcript.
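To illustrate what removing timing information involves, here is a minimal Python sketch. It is not the VTT Cleaner tool mentioned above, and the function name `vtt_to_text` is made up for this example. It assumes a typical VTT file in which each caption is a block separated by blank lines, with a timing line containing `-->` followed by the caption text.

```python
import re

# Matches a VTT timing line such as "00:00:01.000 --> 00:00:04.000".
# The hours part is optional, and extra settings may follow the end time.
TIMING = re.compile(
    r"^\s*(?:\d+:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d+:)?\d{2}:\d{2}\.\d{3}"
)


def vtt_to_text(vtt: str) -> str:
    """Return only the caption text from the contents of a VTT file.

    Hypothetical helper for illustration: it drops the WEBVTT header,
    NOTE blocks, caption identifiers and timing lines.
    """
    cleaned = vtt.replace("\ufeff", "").replace("\r\n", "\n").strip()
    pieces = []
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        # Find the timing line; blocks without one (header, notes) are skipped.
        for index, line in enumerate(lines):
            if TIMING.match(line):
                text_lines = lines[index + 1:]
                break
        else:
            continue
        for text in text_lines:
            # Remove inline tags such as <v Speaker> or <i>.
            text = re.sub(r"<[^>]+>", "", text).strip()
            if text:
                pieces.append(text)
    return " ".join(pieces)


sample = (
    "WEBVTT\n\n"
    "NOTE Confidence: 0.9\n\n"
    "1\n00:00:01.000 --> 00:00:03.000\nWelcome to the\n\n"
    "2\n00:00:03.000 --> 00:00:05.500 align:start\n"
    "<v Sam>session on captions.\n"
)
print(vtt_to_text(sample))
```

The result is one continuous run of text, which still needs to be shaped into paragraphs by hand.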

Panopto uses the SRT format, but we can use SubtitleEdit to convert the SRT file to VTT format, then use the above tool to get those captions in plain text.
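For a sense of what this conversion involves, the two formats are very similar: a VTT file starts with a `WEBVTT` header line and uses a full stop rather than a comma before the milliseconds in each timestamp. The Python sketch below shows the idea for a simple, well-formed SRT file. It is an illustration only, not how SubtitleEdit works, and the function name `srt_to_vtt` is an assumption made for this example.

```python
import re

# SRT timestamps look like "00:00:01,000"; VTT expects "00:00:01.000".
SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt: str) -> str:
    """Convert the contents of a simple SRT file to VTT (illustrative helper)."""
    body = srt.replace("\ufeff", "").replace("\r\n", "\n").strip()
    converted = []
    for line in body.split("\n"):
        # Only timing lines are changed, so commas in the captions are kept.
        if "-->" in line:
            line = SRT_TIME.sub(r"\1.\2", line)
        converted.append(line)
    return "WEBVTT\n\n" + "\n".join(converted) + "\n"


srt_sample = (
    "1\n00:00:01,000 --> 00:00:03,000\nHello, everyone.\n\n"
    "2\n00:00:03,000 --> 00:00:05,500\nWelcome back.\n"
)
print(srt_to_vtt(srt_sample))
```

For anything beyond a straightforward file, a dedicated tool such as SubtitleEdit remains the safer choice.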

Once we have the captions in plain text, we can then format them into something more readable. For example, by building the text into paragraphs and adding headings to make the content easier to digest.

Caption perfection

While accuracy is paramount, there are other ways you can improve your captions.

  • Fix any punctuation issues. For example, “Let’s eat Matt!” has a different meaning to “Let’s eat, Matt!”
  • Aim for complete sentences within a caption. This is better than the last few words of a sentence carrying over into a new caption that starts a new sentence.
  • If there is more than one speaker, add their name each time the speaker changes. This is particularly important for transcripts or when the speaker is off-screen.

Find more captioning tips in Meryl Evans’s blog post, 10 Rules You Need to Create Great Captioned Videos.

Find out more

Read our page on captions with university services.

How can we correct captions and create transcripts at the University of Southampton?
