api.video recently announced that it now offers free encoding for all its users, a bold move perfectly aligned with our mission to democratize video.
As a next step, and to make things more transparent, we want to walk you through the numbers behind that decision.
The common assumption is that video encoding is expensive. That is mainly because encoding workloads demand substantial CPU resources, and CPUs can be quite costly. The higher the video resolution, the greater the CPU requirements, and the higher the encoding bill. It seems logical that pricing should follow these factors, and that is what experts typically expect.
But that is not always the case. Read on to learn how we, at api.video, are able to eliminate video encoding costs for everyone.
We build our own infrastructure
There are two common routes for video encoding: Central Processing Units (CPUs) or Graphics Processing Units (GPUs). Neither is cheap. Moreover, datacenter GPUs are not primarily designed for video encoding and are now aimed mostly at AI workloads. Given the popularity of AI, cloud providers are unlikely to reduce GPU prices, because they can command higher rates from AI customers. This is where an alternative comes in: Application-Specific Integrated Circuits (ASICs). These are chips designed for one specific workload, and the ones built for video are called Video Processing Units (VPUs). They have been in the news recently because Google developed its own video transcoding chip for YouTube, called Argos.
At api.video, we build our own infrastructure around the most advanced VPUs. VPUs cut our electricity costs, since their power consumption is about one order of magnitude lower than that of CPUs and GPUs. Many of them also fit in a single server, which increases density and therefore reduces the number of servers we need.
In a nutshell, VPUs let you encode video at scale at a low cost, and that is exactly why we at api.video moved to VPUs for video encoding.
Understanding encoding costs on VPUs
api.video uses VPUs built by Netint Technologies. Let's understand the costs with a 1080p30 video. To simplify the case, I will assume the source video codec is h.264. The output videos after transcoding will be five h.264 renditions for ABR: 1080p30, 720p30, 480p30, 360p30, 240p30.
For real-time transcoding, let's add up the number of MP/s (megapixels per second) this represents. The output above comes to 62 MP/s + 28 MP/s + 12 MP/s + 7 MP/s + 3 MP/s for the 1080p30, 720p30, 480p30, 360p30, and 240p30 renditions respectively, so we need an encoding capacity of 112 MP/s for real-time transcoding. For the sake of simplicity, let us assume that the transcoding cost on VPUs is proportional to the number of pixels per frame (whatever the frame resolution) and that one Netint VPU can encode 2000 MP/s in h.264. In reality it is a bit more complex, but these numbers give a rough idea.
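The per-rendition figures can be reproduced with a few lines of Python. This is only a sketch: the frame sizes below (for example 854x480 for 480p) are assumed common 16:9 dimensions, not values stated in this article, and the helper names are made up for illustration.

```python
# Assumed 16:9 frame sizes (width, height, fps); not stated in the article.
RENDITIONS = {
    "1080p30": (1920, 1080, 30),
    "720p30": (1280, 720, 30),
    "480p30": (854, 480, 30),
    "360p30": (640, 360, 30),
    "240p30": (426, 240, 30),
}


def megapixels_per_second(width, height, fps):
    """Pixel throughput of one rendition, in megapixels per second."""
    return width * height * fps / 1_000_000


def ladder_throughput(renditions):
    """Per-rendition throughput, rounded to whole MP/s as in the text."""
    return {
        name: round(megapixels_per_second(*dims))
        for name, dims in renditions.items()
    }


per_rendition = ladder_throughput(RENDITIONS)
total_mp_per_s = sum(per_rendition.values())

for name, mp in per_rendition.items():
    print(f"{name}: {mp} MP/s")
print(f"total: {total_mp_per_s} MP/s")
```

With these assumed frame sizes, the rounded values add up to the 112 MP/s used in the rest of the calculation.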
Under these assumptions, one VPU can encode 2000/112 ≈ 17.9, which we round down to 17, minutes of video (all renditions included) per minute. Below is a table summarizing the per-minute cost for a server with 4 VPUs. The server costs $10k and is amortized over 3 years. Electricity costs approx. $1000 per year (420 W at $0.30/kWh) at full consumption. Let us see what 1 minute of 1080p30 video transcoded to 5 renditions costs.
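As a quick check of these two inputs, here is a small sketch that recomputes the real-time multiple and the yearly electricity bill from the figures in the text. The function names are illustrative, and the year is assumed to have 365 days of continuous full-load operation.

```python
import math

VPU_CAPACITY_MP_S = 2000  # assumed h.264 capacity of one VPU (from the text)
LADDER_MP_S = 112         # real-time load of the five-rendition ladder
SERVER_POWER_W = 420      # full-load power draw of the server
PRICE_PER_KWH = 0.30      # electricity price in $


def realtime_multiple(capacity_mp_s, ladder_mp_s):
    """Whole minutes of video one VPU can transcode per minute of wall time."""
    return math.floor(capacity_mp_s / ladder_mp_s)


def yearly_electricity_cost(power_w, price_per_kwh):
    """Cost in $ of running at full load for a 365-day year (assumption)."""
    hours_per_year = 24 * 365
    return power_w / 1000 * hours_per_year * price_per_kwh


multiple = realtime_multiple(VPU_CAPACITY_MP_S, LADDER_MP_S)
electricity = yearly_electricity_cost(SERVER_POWER_W, PRICE_PER_KWH)

print(f"minutes of video per minute, per VPU: {multiple}")
print(f"electricity per year at full load: ${electricity:.2f}")
```

The electricity figure comes out slightly above $1,100, which the text rounds to approximately $1000 per year.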
(The cost is split taking into consideration the average usage of the server per day)
| Avg. usage of the server (%) | Minutes encoded over the amortization period | Cost per minute transcoded, (cost of server + electricity) / minutes, in $ |
|---|---|---|
| 0.5% | 536,112 | 0.020 |
| 5% | 5,361,120 | 0.002 |
| 25% | 26,805,600 | 0.0004 |
| 50% | 53,611,200 | 0.0002 |
| 90% | 96,500,160 | 0.0001 |
If we use the server at only 0.5% per day (i.e. ~7 min per day fully loaded), the cost per minute would be $0.020.
If, on average, this server is loaded at 25% (6 hours a day), the cost per minute would be 1/150 of our previous price.
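The table can be approximated with the short sketch below. It relies on one assumption that the article does not spell out: that electricity cost scales linearly with average usage, which is one reading that matches the rounded figures in the table. The function names are illustrative.

```python
AMORTIZATION_YEARS = 3
SERVER_COST = 10_000               # $ for a server with 4 VPUs
ELECTRICITY_PER_YEAR = 1_000       # $ per year at full consumption
VPUS_PER_SERVER = 4
VIDEO_MINUTES_PER_VPU_MINUTE = 17  # 2000 / 112, rounded down

# Wall-clock minutes in the amortization period (365-day years assumed).
MINUTES_IN_PERIOD = AMORTIZATION_YEARS * 365 * 24 * 60


def minutes_encoded(utilization):
    """Video minutes transcoded over the period at a given average usage."""
    busy_minutes = MINUTES_IN_PERIOD * utilization
    return round(busy_minutes * VPUS_PER_SERVER * VIDEO_MINUTES_PER_VPU_MINUTE)


def cost_per_minute(utilization):
    """(server + electricity) / minutes encoded, in $.

    Assumption: electricity is billed in proportion to average usage.
    """
    electricity = ELECTRICITY_PER_YEAR * AMORTIZATION_YEARS * utilization
    return (SERVER_COST + electricity) / minutes_encoded(utilization)


for usage in (0.005, 0.05, 0.25, 0.50, 0.90):
    print(f"{usage:>6.1%}  {minutes_encoded(usage):>10}  "
          f"${cost_per_minute(usage):.5f}")
```

The main point the numbers make is that the server cost dominates, so the cost per minute falls almost in proportion to how busy the server is kept.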
What about video quality?
Hardware encoding is still associated with quality loss in the minds of many video engineers. That may well have been true decades ago.
Not anymore.
Let’s see how VPUs compete with CPUs in terms of quality, using a video released under a Creative Commons license: Caminandes 2.
We uploaded its 1080p30 version via our API and will compare the output quality with that of a CPU-encoded video produced with the options -b:v 4400k -preset medium. These parameters produce rather good quality.
ffmpeg -r 30 -y -i input.mp4 -c:v libx264 -force_key_frames 'expr:gte(t,n_forced*4)' -b:v 4400k -preset medium -c:a copy output.mp4
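To repeat this reference encode on other files or at other bitrates, the command above can be assembled programmatically. The sketch below only builds the argument list and does not run ffmpeg. The function name and its parameters are hypothetical, and the flags are exactly those of the command above.

```python
import shlex


def reference_encode_command(src, dst, bitrate="4400k", preset="medium",
                             fps=30, keyframe_interval_s=4):
    """Build the libx264 reference encode as an argument list."""
    return [
        "ffmpeg", "-r", str(fps), "-y", "-i", src,
        "-c:v", "libx264",
        # Force a keyframe every `keyframe_interval_s` seconds.
        "-force_key_frames", f"expr:gte(t,n_forced*{keyframe_interval_s})",
        "-b:v", bitrate, "-preset", preset,
        # Copy the audio track without re-encoding it.
        "-c:a", "copy", dst,
    ]


command = reference_encode_command("input.mp4", "output.mp4")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would run the encode, provided ffmpeg with libx264 is installed. Using a list rather than a single string avoids shell-quoting problems with the keyframe expression.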