Add captions to your videos

Captions are not just for accessibility! They also increase views and viewing time for all users.

Doug Sillars

March 6, 2020

Videos are a compelling way to share experiences with your customers. The combination of speech, sounds, images and movement allows us to share a richer experience.

However, on the web a video can only be autoplayed when it is muted. As a consumer with small children, this is often a blessing: the last thing I want is a website to wake up a sleeping baby. But as a developer, I want everyone to experience everything my videos have to offer, so I want to work around this issue (while still letting everyone’s babies nap, of course).

One solution is to add captions. Users can still engage with your video and follow along with the dialogue. Facebook finds that videos with captions are watched 12% longer than videos without captions (85% of Facebook videos are watched without audio). Longer video watch time increases your search rank, and the presence of a captions file even improves your site’s SEO! With all of these advantages, the accessibility improvements to your website almost come as an afterthought.

Whatever your reason for adding captions to your videos, you are probably wondering how it is done!

The VTT File

Closed captioning on the web is typically done with a WebVTT (Web Video Text Tracks) file, usually saved with a .vtt extension. These are plain text files that simply list the text that should appear over a specified period.
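To make the format concrete, here is a minimal Node.js sketch that builds a two-cue file matching the description that follows. The helper names (`toVttTimestamp`, `buildVtt`) are made up for this illustration, and the timing of the second cue is an assumption.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function toVttTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Build a complete WebVTT document: a header line, then one block per cue.
function buildVtt(cues) {
  const blocks = cues.map(
    (cue) => `${toVttTimestamp(cue.start)} --> ${toVttTimestamp(cue.end)}\n${cue.text}`
  );
  return `WEBVTT\n\n${blocks.join('\n\n')}\n`;
}

const vtt = buildVtt([
  { start: 15, end: 17.9, text: 'At the left we can see' },
  { start: 18, end: 20, text: 'at the right we can see' }, // assumed timing
]);
console.log(vtt);
```

The printed file starts with the `WEBVTT` header, and the first cue's timing line reads `00:00:15.000 --> 00:00:17.900`, followed by its text on the next line.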


In the example above, from 15–17.9 seconds, the text “At the left we can see” will appear on the screen, followed a split second later by “at the right we can see.” It is really a pretty straightforward format. But other than manually transcribing your video, how can you create a VTT file?

Let’s walk through two ways:

VTT Creator

VTT Creator is a web-based tool. You simply upload your video, and as it plays the video back, it creates captions for you.

The website uploads your video and begins the analysis, and the pane on the left fills with the subtitles.

You can download the VTT, and you’re good to go!

Google Speech to Text

Google Cloud, Azure and AWS all have speech-to-text tools. I chose Google’s for this demo, but they all operate in much the same way.

Google has a helpful QuickStart guide to set up your Google Cloud account for Speech to Text. The first step is to use ffmpeg to strip the audio track (there is no need to upload the video track when doing audio analysis).
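As a rough sketch of the audio-stripping step, the snippet below shells out to ffmpeg from Node.js. The function names and file names are hypothetical, and the choice of mono 16 kHz FLAC is an assumption (a common input format for speech recognition), not something the post specifies. It expects ffmpeg to be available on your PATH.

```javascript
const { spawn } = require('child_process');

// Build ffmpeg arguments that drop the video stream and write mono 16 kHz FLAC.
function audioExtractArgs(inputPath, outputPath) {
  return [
    '-i', inputPath,
    '-vn',           // discard the video stream
    '-ac', '1',      // downmix to a single (mono) channel (assumed)
    '-ar', '16000',  // resample to 16 kHz (assumed)
    '-c:a', 'flac',  // encode the audio as FLAC (assumed)
    outputPath,
  ];
}

// Run ffmpeg and resolve with the output path once it exits successfully.
function extractAudio(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    const proc = spawn('ffmpeg', audioExtractArgs(inputPath, outputPath), {
      stdio: 'inherit',
    });
    proc.on('error', reject);
    proc.on('close', (code) => {
      if (code === 0) resolve(outputPath);
      else reject(new Error(`ffmpeg exited with code ${code}`));
    });
  });
}

// Example call (file names are placeholders):
// extractAudio('myvideo.mp4', 'myvideo.flac').then(console.log);
```

Keeping the argument list in its own function makes it easy to inspect or adjust the settings without running ffmpeg.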

I used their Node.js code example with a few changes (Code available on GitHub). The video I was transcribing was over 1 minute long, so I had to host it on a Google Cloud instance. Rather than using:


I used:
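As a hedged sketch of that difference (not the listings from the original post), the two request shapes might look like this with Google's Node.js Speech client. The config values, the helper names and the `gs://` path are assumptions for illustration; `enableWordTimeOffsets` is included because the per-word timings are needed later.

```javascript
const fs = require('fs');

// Shared recognition settings (values assumed for a mono 16 kHz FLAC file).
const config = {
  encoding: 'FLAC',
  sampleRateHertz: 16000,
  languageCode: 'en-US',
  enableWordTimeOffsets: true, // ask for per-word start and end times
};

// Short clips: read a local file and send its bytes inline as base64.
function inlineRequest(localPath) {
  return {
    config,
    audio: { content: fs.readFileSync(localPath).toString('base64') },
  };
}

// Longer clips: point at a file already hosted in Google Cloud Storage.
function storageRequest(gcsUri) {
  return { config, audio: { uri: gcsUri } };
}

// With the official client, the hosted variant would be used roughly like this
// (inside an async function; the bucket path is a placeholder):
//   const speech = require('@google-cloud/speech');
//   const client = new speech.SpeechClient();
//   const [operation] = await client.longRunningRecognize(
//     storageRequest('gs://my-bucket/myvideo.flac'));
//   const [response] = await operation.promise();
```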


There is no need to encode the file as base64 (as described in the tutorial) if it is hosted on Google Cloud. The sample code from Google writes the transcript to the console.

This is super cool, but we want more than a transcription: we need those timestamps. The raw JSON file has an entry for each word, with its start and end time.

So I created a function that groups words into phrases, using the first word’s start time and the last word’s end time. The length of the phrase can be varied; in the sample code it is set to 10 words. The script outputs a VTT file to the console.
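The grouping idea can be sketched as follows. This is not the code from the GitHub repository, just a minimal illustration under some assumptions: each word entry looks like `{ word, startTime: { seconds, nanos }, endTime: { seconds, nanos } }`, with `seconds` given as a number or numeric string, and the sample words and timings are invented.

```javascript
// Convert a { seconds, nanos } duration into fractional seconds.
function toSeconds(t) {
  return Number(t.seconds || 0) + Number(t.nanos || 0) / 1e9;
}

// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function toVttTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, '0');
  const totalSeconds = Math.floor(totalMs / 1000);
  const h = Math.floor(totalSeconds / 3600);
  const m = Math.floor(totalSeconds / 60) % 60;
  const s = totalSeconds % 60;
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(totalMs % 1000, 3)}`;
}

// Group words into cues of `wordsPerCue` words. Each cue runs from the
// first word's start time to the last word's end time.
function wordsToVtt(words, wordsPerCue = 10) {
  const cues = [];
  for (let i = 0; i < words.length; i += wordsPerCue) {
    const group = words.slice(i, i + wordsPerCue);
    const start = toVttTimestamp(toSeconds(group[0].startTime));
    const end = toVttTimestamp(toSeconds(group[group.length - 1].endTime));
    const text = group.map((w) => w.word).join(' ');
    cues.push(`${start} --> ${end}\n${text}`);
  }
  return `WEBVTT\n\n${cues.join('\n\n')}\n`;
}

// Invented sample data in the assumed shape.
const t = (seconds, nanos) => ({ seconds: String(seconds), nanos });
const sampleWords = [
  { word: 'At', startTime: t(15, 0), endTime: t(15, 400000000) },
  { word: 'the', startTime: t(15, 400000000), endTime: t(15, 700000000) },
  { word: 'left', startTime: t(15, 700000000), endTime: t(16, 300000000) },
  { word: 'we', startTime: t(16, 300000000), endTime: t(16, 800000000) },
  { word: 'can', startTime: t(16, 800000000), endTime: t(17, 200000000) },
  { word: 'see', startTime: t(17, 200000000), endTime: t(17, 900000000) },
];

console.log(wordsToVtt(sampleWords, 3));
```

A fixed word count is the simplest grouping rule. Splitting on punctuation or on pauses between words would give more natural cues, at the cost of a little more logic.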

Adding Captions to your Video


Adding the VTT file to your video is easy. If you are using HTML5 video, you can add a track element pointing to the VTT file.

<video autoplay muted controls>
    <source src="myvideo.webm">
    <source src="myvideo.mp4">
    <track default srclang="en" kind="captions" src="myvideo.vtt">
    <track srclang="es" kind="captions" src="myvideo-es.vtt">
</video>

Video Streaming

If you want to add captions to your api.video video, you can use the Captions API.

  1. First you must authenticate your session to gain an access token.
  2. Determine the videoID of the video you’d like to add the caption to. In this example, I have created captions for Elephants Dream (an open source video project). The video ID is: vi5UNyaStzuuj0xTAGp7qtjf
  3. Make a call to add the caption. Using curl, the request looks like:
curl https://ws.api.video/videos/vi5UNyaStzuuj0xTAGp7qtjf/captions/en-US -H 'Authorization: Bearer {access_token}' -F file=@/path/to/google.vtt

And we receive a response:


Note that captions are off by default. This is easily resolved by patching the caption, setting the default to true. Using curl, this looks like:

 curl -X PATCH https://ws.api.video/videos/vi5UNyaStzuuj0xTAGp7qtjf/captions/en -H 'Authorization: Bearer {access_token}' -H 'Content-Type: application/json' -d '{"default": true}'

And we receive a response:


And watching the movie: success! The captions are turned on by default. To autoplay a video with api.video, simply add #autoplay to the end of the URL, and we have an autoplaying video with captions.
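The two curl calls above can also be sketched in Node.js. This is an illustration rather than an official client: it assumes Node 18 or later (for the built-in `fetch`, `FormData` and `Blob`), the function names are hypothetical, and the access token is expected to come from the authentication step.

```javascript
const fs = require('fs');

const API_BASE = 'https://ws.api.video';

// Build the captions endpoint for a given video and language code.
function captionUrl(videoId, language) {
  return `${API_BASE}/videos/${encodeURIComponent(videoId)}/captions/${encodeURIComponent(language)}`;
}

// Upload a VTT file as a multipart form field named "file" (like curl -F).
async function uploadCaption(accessToken, videoId, language, vttPath) {
  const form = new FormData();
  const blob = new Blob([fs.readFileSync(vttPath)], { type: 'text/vtt' });
  form.append('file', blob, 'captions.vtt');
  const res = await fetch(captionUrl(videoId, language), {
    method: 'POST',
    headers: { Authorization: `Bearer ${accessToken}` },
    body: form,
  });
  if (!res.ok) throw new Error(`Caption upload failed: ${res.status}`);
  return res.json();
}

// Mark the caption as the default track (like the curl -X PATCH call).
async function setDefaultCaption(accessToken, videoId, language) {
  const res = await fetch(captionUrl(videoId, language), {
    method: 'PATCH',
    headers: {
      Authorization: `Bearer ${accessToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ default: true }),
  });
  if (!res.ok) throw new Error(`Caption update failed: ${res.status}`);
  return res.json();
}

// Example flow (token and path are placeholders):
// await uploadCaption(token, 'vi5UNyaStzuuj0xTAGp7qtjf', 'en-US', '/path/to/google.vtt');
// await setDefaultCaption(token, 'vi5UNyaStzuuj0xTAGp7qtjf', 'en-US');
```

Using the same language code for both calls keeps the upload and the default flag pointed at the same caption track.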

Now we can rest assured that our videos will likely have more traction, be watched longer, and shared more often. And, we are making our content more accessible to users who cannot hear.


Video captions not only make your content accessible, but research shows videos with captions are watched longer, and shared more often. And searchable transcriptions actually increase the SEO of your site!

Often there is reluctance to add captions due to the added work, but with just a few clicks of a button, captions can be generated automatically and uploaded for immediate playback.

Give it a try, and share your caption successes on the api.video community forum.
