
api.video + aflorithmic.ai: Localize Advertisement Videos with Personalized Voice Overs!

August 5, 2021 - Erikka Innes in Authenticate, video create, Video upload, Python

You've prepared the perfect ad copy for your business, but now you're trying to figure out how to personalize the ad for each of your locations. One great way to do that is to use a voice over tool that lets you customize the content based on where your customers are. Fortunately, this is available! By combining api.video with aflorithmic.ai's voice over, this problem is easy to solve.

Imagine, if you will, that you own a pizza chain

To make this fun, let's imagine you are the owner of a popular pizza chain - Renzo's pizza chain. You have fifty different outlets and you've prepared the perfect ad:

--Insert video here--

This ad is great for one of your locations, but you want people near all of your restaurants to know they can eat the best pizza for the least dough. For each location, you'd like to run this ad, but have it play information that tells the listener the city, address, day of the week and the time of the offer being described.

No problem! We can do this in the tutorial today. Let's get started.


For this project, you're going to need:


We will use the api.video Python client and Aflorithmic.ai's apiaudio library.

Installation for api.video: pip install api.video

Installation for Aflorithmic: pip install -U apiaudio

ffmpeg installation

If you want to run the project as-is, you'll also need to install ffmpeg. These instructions cover installation on a Mac. Make sure you have brew installed; then you can install ffmpeg with:

brew install ffmpeg

You need to install ffmpeg before you install the next two items.
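
Before moving on, it can help to confirm that ffmpeg is actually reachable from Python. This is a minimal sketch using only the standard library; the helper name find_tool is hypothetical and is not part of the demo:

```python
import shutil

def find_tool(name="ffmpeg"):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)

if __name__ == "__main__":
    path = find_tool("ffmpeg")
    if path is None:
        print("ffmpeg was not found on your PATH. Install it before continuing.")
    else:
        print("ffmpeg found at " + path)
```

If this reports that ffmpeg is missing right after installing it, opening a new terminal so that PATH is reloaded is usually worth trying first.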

pydub and pyaudio installation

pydub and pyaudio can also be difficult to install, depending on your setup and what you've tried to install already. If you made the mistake of trying to install these before installing ffmpeg, first run:

brew remove portaudio

Then, reinstall this like so:

brew install portaudio

After these steps, you should be able to install the modules you'll need. Both are Python modules, so they are installed with pip. Here are the commands:

pip install pyaudio

pip install pydub

Project overview

Here's what we're going to do with the example script today:

  1. Prompt to collect the api.video API key and the Aflorithmic API key.
  2. Select our video from the Videos folder (for your own tweaks later you can drop other videos here to use, or use a different folder system to organize your content).
  3. Select the script we want from the Script folder and offer the user a preview of the script to make sure it's the right one.
  4. Select the localization .csv file we want to use with our script.
  5. Create a sample audio file with the voice over, using the first line from the .csv file
  6. Give the user the sample file so they can make sure it sounds like what they want.
  7. Next we'll combine the first audio file with the video.
  8. We'll display that file for review by the user to make sure it looks right.
  9. If the user approves the file, one by one we'll create an audio file, combine it with the video, tag each video with the information from the csv, and upload it for storage and hosting on api.video.
  10. We'll return the user a list of video titles and a link to a playable copy of the final file.
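
The last step amounts to writing one row per video to a .csv file. Here is a small sketch of that idea using csv.DictWriter. The column names match the ones the demo uses, but the function name write_links and the example.com link are made up for illustration:

```python
import csv
import io

def write_links(rows, out_file):
    """Write a header row, then one csv row per video with its title and link."""
    writer = csv.DictWriter(out_file, fieldnames=["Video", "MP4 Link"])
    writer.writeheader()
    writer.writerows(rows)

# An in-memory buffer stands in for a real file such as all_vid_links.csv.
buffer = io.StringIO()
write_links([{"Video": "ad1.mp4", "MP4 Link": "https://example.com/ad1"}], buffer)
print(buffer.getvalue())
```

To write a real file instead, pass a handle opened with open('all_vid_links.csv', 'w', newline='').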

Code sample

Here is the code sample. It's a wizard that will walk you through all the steps. The complete project is available on github here: https://github.com/apivideo/python-api-client/blob/master/examples/video_audio/README.md

import apivideo, apiaudio, ffmpeg
import os, json, csv, sys, webbrowser
from pydub import AudioSegment
from pydub.playback import play
from apivideo.apis import VideosApi
from apivideo.exceptions import ApiAuthException

# Functions

# Get the keys from the user 
def get_aflo_key():
    while True:
        aflo = input("Enter your aflorithmic API key (available at https://console.api.audio/): ")
        # The only check made here is the length of the key (32 characters).
        if len(aflo) != 32:
            print("The aflorithmic API key is not correct, please try again.")
        else:
            return aflo

def get_apivideo_key():
    while True:
        api_video = input("Enter your api.video API key (available at https://my.api.video/): ")
        # The only check made here is the length of the key (43 characters).
        if len(api_video) != 43:
            print("The api.video API key is not correct, please try again.")
        else:
            return api_video

# Choose an .mp4 to add sound to. MP4 is easier to work with when you need to combine videos, so it's the only format accepted. 
# The video you want to combine must be in the same folder the application is running in.
def choose_video(): 
    filelist = os.listdir('.')
    while True:
        # List the .mp4 files available in the current folder.
        for item in filelist:
            if item.endswith(".mp4"):
                print(item)
        filename = input("Please select the video you want to use. \n")
        if filename in filelist:
            return filename
        else:
            print("That's not a file. Please type the name of the file as you see it listed.\n")

# If you have multiple scripts you can choose the one you want from the Scripts folder.
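def choose_script():
    filelist = os.listdir('Scripts')
    while True:
        # Show the available scripts so the user can type one of the names.
        for item in filelist:
            print(item)
        filename = input("Please select the script you want to use by typing its complete name. \n")
        if filename in filelist:
            # Print the script so the user can preview it.
            with open('Scripts/' + filename) as f:
                print(f.read())
            use = input("Is this the script you wanted to use? Type YES or NO. \n")
            if use.upper() == 'YES':
                print("Great! Let's move on to personalizing your content!")
                return 'Scripts/' + filename
            elif use.upper() == 'NO':
                print("Okay, try again.")
            else:
                print("That's not a valid response.")
        else:
            print("That's not a file. Please type the name of the file as you see it listed.\n")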

# Choose the csv file you'll use for personalization.
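def choose_personalization():
    filelist = os.listdir('Personalization')
    while True:
        # Show the available .csv files so the user can type one of the names.
        for item in filelist:
            print(item)
        filename = input("Please select the .csv file you want to use for personalization. \n")
        if filename in filelist:
            # Preview the first three lines of the file.
            with open('Personalization/' + filename) as f:
                for line in f.readlines()[:3]:
                    print(line.rstrip())
            use = input("These are the first three lines of the file you picked. Is this the file you wanted to use? Type YES or NO. \n")
            if use.upper() == 'YES':
                print("Great! Let's move on to creating the audio track!")
                return 'Personalization/' + filename
            elif use.upper() == 'NO':
                print("Okay, try again.")
            else:
                print("That's not a valid response.")
        else:
            print("That's not a file. Please type the name of the file as you see it listed.\n")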

# When you set up a script for use, you can provide some information about that script. This helps with set up
# and returns a list of all the variables with their values. 
def get_script_details():
    print("To set up the script, we need a few details. \n")
    scriptName = "a"
    moduleName = "a"
    projectName = "a"

    while True:
        scriptName = input("What do you want to name your script? Type it in. \n")
        print("You typed: ", scriptName)
        response = input("Is this the name you wanted to use? Type YES or NO. \n")
        if response.upper() == 'YES':
            print("Great, let's get the other information we need!")
            break
        else:
            print("Ok, try again.")

    while True:
        moduleName = input("What do you want to name your module? Type it in. \n")
        print("You typed: ", moduleName)
        response = input("Is this the name you wanted to use? Type YES or NO. \n")
        if response.upper() == 'YES':
            print("Great, let's get the other information we need!")
            break
        else:
            print("Ok, try again.")

    while True:
        projectName = input("What do you want to name your project? Type it in. \n")
        print("You typed: ", projectName)
        response = input("Is this the name you wanted to use? Type YES or NO. \n")
        if response.upper() == 'YES':
            print("Great, that's everything we need!")
            break
        else:
            print("Ok, try again.")

    return [scriptName, moduleName, projectName]

# After you choose a track, it's easier to combine if it's a .wav. This makes the file playable using PyDub and returns a PyDub object. 
# The .wav file can then be used. 
def check_audio_convert_track():
    while True:
        audio_check = input("Type PLAY to hear the sample now. \n")
        if audio_check.upper() == "PLAY":
            # Load the sample as a PyDub object and play it.
            track = AudioSegment.from_mp3('Audio/sample.mp3')
            play(track)
            ready = input("Does the track sound the way you wanted? Type YES or NO. For NO, you'll have to start over again. \n")
            if ready.upper() == "YES":
                print("Great! Let's continue on to building the sample video!")
                return track
            elif ready.upper() == "NO":
                print("Okay. The program will exit now. You'll need to figure out what tweaks to make to your script and voice.")
                sys.exit()
            else:
                print("That's not a valid response.")
        else:
            print("That's not a valid entry.")

def make_video(video, wav_track, title):
    input_video = ffmpeg.input(video)
    input_audio = ffmpeg.input(wav_track)
    title = title + ".mp4"
    ffmpeg.concat(input_video, input_audio, v=1, a=1).output(title).run()
    print("A video with the title " + title + " was added to this folder.")

def add_video_return_mp4(video_title, av_client, item, vid_description):
    videos_api = VideosApi(av_client)

    # Use the values from this csv row as tags for the video.
    item_list = list(item.values())

    video_create_payload = {
        "title": video_title,
        "description": vid_description,
        "public": True,
        "tags": item_list
    }

    # Create the container for your video.
    response = videos_api.create(video_create_payload)

    # Retrieve the video ID, you can upload once to a video ID
    video_id = response["video_id"]

    # Upload your video. This handles videos of any size. The video must be in the same folder as your code and 1080p. 
    # If you want to upload from a link online, you need to add the source parameter when you create a new video.
    with open(video_title, "rb") as file:
        video_response = videos_api.upload(video_id, file)

    # Return the link to a playable copy of the video.
    url = video_response['assets']['player']
    return url

# Welcome and Instructions
print("*          HELLO AND WELCOME TO THE AD LOCALIZER !           *")
print("*   This tool lets you take a video and add a voice over     *")
print("*   that can be personalized with information read from a    *")
print("*   .csv file containing the personalization information.    *")
print("*   Use the example content provided to start with, then try *")
print("*   your own!                                                *")
print("*   NOTE: You can exit the program at any time by typing     *")
print("*   CTRL-C                                                   *")

# Collect API Keys 

print("To get started, you'll need an api.video API key and an aflorithmic.ai API key.")
aflo_key = get_aflo_key()
api_video = get_apivideo_key()

# Authenticate
av_client = apivideo.AuthenticatedApiClient(api_video)
apiaudio.api_key = aflo_key

print("Thanks! Now you're ready to choose your video. Your video must be in the same folder as this application.")
print("Pick your video. \n")

# Choose Video
video = choose_video()

# Choose Script
script = choose_script()

# Create Script
script_details = get_script_details()
with open(script, "r") as script_file:
    text = script_file.read()

script = apiaudio.Script().create(scriptText=text, scriptName=script_details[0], moduleName=script_details[1], projectName=script_details[2])

# Personalization 
print("You can personalize your script by reading in values from a .csv file to your script. \n")
csv_f = choose_personalization()

# Create speech. Choose a voice! https://library.api.audio/speakers

# Set up the csv file for use.
audience_combinations = []
with open(csv_f, newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        # Each row becomes one dictionary of personalization values.
        audience_combinations.append(row)

i = 0
vid_description = ""

csv_header = {"Video": "Video Title", "MP4 Link": "MP4 Link Here"}
with open('all_vid_links.csv', 'w', newline='') as f: 
    w = csv.DictWriter(f, csv_header.keys())
    w.writerow(csv_header)

while True:
    vid_description = input("Before we create the audio track, provide a brief description to describe the videos in the series we'll create. \n")
    keep_it = input(vid_description + " --- Is that the description you want? Type YES or NO. \n")
    if keep_it.upper() == "YES":
        print("Great! We're ready to work on the audio track now. Hang on a few moments while we process the first track.")
        break
    else:
        print("Okay, try again.")

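for item in audience_combinations:

    # Text-to-speech. The voice and speed values are placeholders (assumptions);
    # pick a voice from https://library.api.audio/speakers.
    r = apiaudio.Speech().create(
        scriptId=script.get("scriptId"), voice="Joanna", speed=100, audience=[item]
    )

    template = "copacabana"

    r = apiaudio.Mastering().create(
        scriptId=script.get("scriptId"), soundTemplate=template, audience=[item]
    )

    # Download the mastered file into the Audio folder. The parameters and
    # destination arguments are assumed; check them against your apiaudio version.
    file = apiaudio.Mastering().download(
        scriptId=script.get("scriptId"), parameters=item, destination="Audio"
    )
    print(f"✨ Mastered file for template {template} ✨")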

    if i == 0: 
        new_file = os.path.join(os.path.dirname(file), "sample.mp3")
        os.rename(file, new_file)

        print("You have generated your first audio file. It's in the Audio folder and we've named it 'sample.mp3.'")
        print("Let's make sure it sounds the way you want.")
        wav_track = check_audio_convert_track()
        wav_track.export("Audio/sample.wav", format="wav")
        vid_list = video.split('.')
        video_title = vid_list[0]
        print("We're going to combine the video and the audio now. The video will appear in the same folder as your application title " + video_title + "-sample")
        make_video(video, 'Audio/sample.wav', video_title + "-sample")
        link_mp4 = add_video_return_mp4(video_title + "-sample.mp4", av_client, item, vid_description)
        print("Please watch the sample video that we'll open for you in the webbrowser. Then return to the terminal here. \n")
        webbrowser.open(link_mp4)
        while True:
            answer = input("Are the video and audio correct? You will have to start over if not. Type YES or NO. \n")
            if answer.upper() == "YES":
                print("Great, we'll start creating the videos in bulk and uploading to api.video now.")
                i += 1
                break
            elif answer.upper() == "NO":
                print("You will have to go back and figure out what's wrong with the video or audio. Exiting now.")
                sys.exit()
            else:
                print("That's not a valid response, try again.")
    else:
        # Convert the audio file so we can combine it. 
        counter = str(i)
        vid_list = video.split('.')
        audio_title = "Audio/" + vid_list[0] + counter + "-audio" + ".wav"
        wav_track = AudioSegment.from_mp3(file)
        wav_track.export(audio_title, format="wav")

        # Make a video 
        video_title = vid_list[0] + counter
        make_video(video, audio_title, video_title)
        link_mp4 = add_video_return_mp4(video_title + ".mp4", av_client, item, vid_description)
        my_dict = {"Video": video_title + ".mp4", "MP4 Link": link_mp4}
        with open('all_vid_links.csv', 'a', newline='') as f: 
            w = csv.DictWriter(f, my_dict.keys())
            w.writerow(my_dict)
        # The video is uploaded now, so delete the local copy.
        os.remove(video_title + ".mp4")
        print("Video " + video_title + " added!")
        i += 1

Walk through the important stuff

A bunch of the demo is set up to walk you through the process of combining information. We don't need to go over each while loop, but let's go over some details regarding API behavior and the tools used to create the demo.
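
The confirmation loops in the demo all follow the same pattern, so if you adapt the code you could fold them into one helper. This is only a sketch; the ask_yes_no function is hypothetical and does not appear in the demo. The input_fn parameter exists so the loop can be exercised without a terminal:

```python
def ask_yes_no(prompt, input_fn=input):
    """Keep asking until the user types YES or NO; return True for YES."""
    while True:
        answer = input_fn(prompt).strip().upper()
        if answer == "YES":
            return True
        if answer == "NO":
            return False
        print("That's not a valid response, try again.")

# Example with scripted answers instead of keyboard input.
scripted = iter(["maybe", "yes"])
confirmed = ask_yes_no("Is this the name you wanted? ", input_fn=lambda _: next(scripted))
print(confirmed)
```

In the wizard you would call it with the default input function, for example ask_yes_no("Is this the script you wanted to use? Type YES or NO. \n").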

The script

You can place your script into your code directly by using quotes. It could look like this (I'm saying could because there are a few different ways to set up your api.audio script):

    On {{day_of_week}} it's Bring a friend day in {{city}} at Renzo's pizza!

    Bring a friend and they will get their pizza for free if you order one or more premium size pizzas, between {{offer_time}}. Only at Renzo's, {{address}} in {{city}}.

    Renzo's. Eat the best pizza for the least dough.

You can see that the script is split into three sections. This demo doesn't really make use of the features available per section so I'm not going to go into detail for this part. Personalization is achieved by using {{ }} around the name of a column from your csv spreadsheet. They can be used in whatever order you like.
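
Because every {{ }} name has to match a column in your spreadsheet, a quick check before calling the API can save a round trip. The sketch below uses a regular expression to collect the names; the helper names find_placeholders and missing_columns are made up for illustration and are not part of api.audio:

```python
import re

# Matches {{name}} and tolerates spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def find_placeholders(script_text):
    """Return the set of names used inside {{ }} in the script."""
    return set(PLACEHOLDER.findall(script_text))

def missing_columns(script_text, csv_columns):
    """Return placeholder names that have no matching csv column, sorted."""
    return sorted(find_placeholders(script_text) - set(csv_columns))

script_text = (
    "On {{day_of_week}} it's Bring a friend day in {{city}} at Renzo's pizza! "
    "Only at Renzo's, {{address}} in {{city}}, between {{offer_time}}."
)
columns = ["city", "address", "day_of_week", "offer_time"]
print(missing_columns(script_text, columns))  # prints []
```

An empty list means every placeholder in the script has a column to draw from.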

You can also choose to read your script from a .txt file. If you do, make sure to remove the quotes from the beginning and end or it will screw up the parser. A common error says there's a problem creating a final file or working with a URL; it usually means something broke before that point, so keep that in mind when debugging.

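In your code, when you set up your script, you'll use a command that looks something like this:

script = apiaudio.Script().create(scriptText=text, scriptName=script_details[0], moduleName=script_details[1], projectName=script_details[2])

In this snippet, you've already authenticated with your API key (apiaudio.api_key = your_key), so now you can work with the API's endpoints.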

The only required field is scriptText, which contains the text of your script. However, it's useful to provide the other names for reference. Be aware that you cannot delete projects or modules. After you set up your script this way, you'll be able to reference it in your code easily via scriptId. Like so:

script.get("scriptId")

The audio file

For starters, when you create an audio file with api.audio, it names your file based on your project name and personalization parameters. In this demo, if you named your project "ad" then every audio file would begin with the word "ad." Next, it puts the headers for the csv file in alphabetical order, and adds the appropriate parameter next to each header. For example, our .csv file has the columns (in this order):

  • city
  • address
  • day_of_week
  • offer_time

After each column title, the entry that appears in the audio file will appear. To separate each column, two underscores are used. Here's a sample:


So something you'll possibly want to do is rename the files when they arrive. An easy way to handle this is with the built-in os module.
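
Here is a minimal sketch of that renaming step. The function name rename_download and the file names are made up for illustration; the real file name comes from api.audio:

```python
import os
import tempfile

def rename_download(path, new_name):
    """Rename a downloaded file inside its own folder and return the new path."""
    new_path = os.path.join(os.path.dirname(path), new_name)
    os.rename(path, new_path)
    return new_path

# Demonstrate with a throwaway empty file instead of a real download.
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "long_generated_name.mp3")
    with open(original, "wb") as f:
        f.write(b"")
    renamed = rename_download(original, "sample.mp3")
    print(os.path.basename(renamed))  # prints sample.mp3
```

Keeping the file in its original folder, as done here with os.path.dirname, mirrors how the demo renames the first track to sample.mp3.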

When creating an audio file you'll do two steps, one is text-to-speech and one is mastering the audio. For text-to-speech you can pick a voice, how fast it will speak and then give it a dictionary containing the list of personalization terms you want to insert into your script.

You can also choose a template, which will play some background music under your spoken audio. You can see in the demo I used "copacabana." To list voices or background music options, go to the api.audio API reference docs and use the endpoint for listing voices or the endpoint for listing music.

The .csv

This demo reads information from a .csv file. If you want to read from something else, you can as long as your output for your program to work with becomes a list of dictionaries. .csv is pretty simple to use, so I went with that option.
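
To make the "list of dictionaries" shape concrete, here is a sketch that builds it from .csv text and, as one possible alternative source, from JSON. The function names and the sample row are invented for illustration:

```python
import csv
import io
import json

def rows_from_csv(text):
    """Parse csv text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

def rows_from_json(text):
    """Parse a JSON array of objects into the same list-of-dictionaries shape."""
    return json.loads(text)

# Made-up sample data using the column names from this tutorial.
csv_text = "city,address,day_of_week,offer_time\nBerlin,1 Example Street,Monday,5pm and 7pm\n"
json_text = (
    '[{"city": "Berlin", "address": "1 Example Street", '
    '"day_of_week": "Monday", "offer_time": "5pm and 7pm"}]'
)

print(rows_from_csv(csv_text) == rows_from_json(json_text))  # prints True
```

Either way, each dictionary holds the personalization values for one location.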

Playing a track from your application

You can play a track to check out the audio by using a variety of tools. For this one, I used pydub and pyaudio. These are fairly popular modules. In order to use them, however, audio must be converted to .wav. You will see in the code that two imports are made from pydub:

from pydub import AudioSegment
from pydub.playback import play

These allow us to play straight from the terminal or wherever we may be. The code to convert to .wav and play is very straightforward:

wav_track = AudioSegment.from_mp3(file)
wav_track.export(audio_title, format="wav")

There are other choices available for converting, but api.audio returns .mp3 files, so we use the from_mp3 choice.

After we have the track, playing it is as simple as this:

play(wav_track)

Merging audio and video

Before uploading your video to api.video, you will want to combine the sound and the video. This can be accomplished with ffmpeg, which you need installed to use pydub anyway (the ffmpeg module imported in the demo is a Python wrapper around it). The code for merging audio and video is:

input_video = ffmpeg.input(video)
input_audio = ffmpeg.input(wav_track)
title = title + ".mp4"
ffmpeg.concat(input_video, input_audio, v=1, a=1).output(title).run()

This will produce an mp4. You can then upload it to api.video.
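
If you would rather not depend on the Python wrapper, the same kind of merge can be sketched by calling the ffmpeg command line tool directly. This is an alternative, not what the demo does: it maps the video from the first input and the audio from the second, and copies the video stream instead of re-encoding it. The function names are hypothetical:

```python
import subprocess

def build_merge_command(video, audio, output):
    """Build an ffmpeg command that keeps the video stream and swaps in new audio."""
    return [
        "ffmpeg",
        "-y",             # overwrite the output file if it already exists
        "-i", video,      # input 0: the original video
        "-i", audio,      # input 1: the voice over track
        "-map", "0:v:0",  # take the video stream from input 0
        "-map", "1:a:0",  # take the audio stream from input 1
        "-c:v", "copy",   # copy the video stream without re-encoding
        "-shortest",      # stop at the end of the shorter input
        output,
    ]

def merge(video, audio, output):
    """Run the command; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_merge_command(video, audio, output), check=True)

# Show the command that would run for the sample files used in this tutorial.
print(" ".join(build_merge_command("ad.mp4", "Audio/sample.wav", "ad-sample.mp4")))
```

Copying the video stream is usually faster than the concat approach because only the audio gets encoded, but it assumes the source video is already in a format you are happy to host.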

Upload to api.video

For details about uploading a video with api.video, you can check out the tutorial about it here: Upload a Video with the api.video Python Client

Note that the file you upload must be in the same folder as your application, or it will not upload.

Once it's uploaded, you can retrieve the link to a playable copy from the response and open it right away in your browser using the built-in webbrowser module.

To retrieve the link from the response (the demo stores it in a variable named link_mp4), you do this:

link_mp4 = video_response['assets']['player']

And then you can open the link like this:

webbrowser.open(link_mp4)

This will let you make sure everything combined properly so that the audio matches with the video the way you want.

Create all the localized videos

After all the steps to make sure you're creating the right type of video, you can use the recipe from api.audio with a couple of tweaks to create all your new videos with personalized ads by location, then upload them for hosting to api.video.

This demo deletes the local copy of every video right after it is uploaded, so you don't end up with fifty videos sitting in a folder.
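
One way to structure that cleanup is to delete the local file only after the upload call has returned without raising, so a failed upload leaves the file in place for a retry. This is a sketch under that assumption; upload_then_delete and fake_upload are hypothetical names, and fake_upload stands in for the real api.video upload:

```python
import os
import tempfile

def upload_then_delete(path, upload):
    """Call upload(path); remove the local file only if the upload did not raise."""
    result = upload(path)
    os.remove(path)
    return result

def fake_upload(path):
    """Stand-in for a real upload; returns a made-up link."""
    return "https://example.com/" + os.path.basename(path)

with tempfile.TemporaryDirectory() as folder:
    video_path = os.path.join(folder, "ad1.mp4")
    open(video_path, "wb").close()
    link = upload_then_delete(video_path, fake_upload)
    print(link, os.path.exists(video_path))  # prints https://example.com/ad1.mp4 False
```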

Thanks for reading! Happy coding. :)


Erikka Innes

Developer Evangelist