H.264

What is the H.264 codec?

H.264 is a video compression standard that's currently the most widely used codec on the planet. It uses advanced algorithms for motion estimation, inter prediction, spatial intra prediction and discrete cosine transform (DCT) coding. It can be used for discs like Blu-ray (and others), broadcast and video streaming. H.264 video can also be stored in many different container formats, like MP4, MOV, 3GP, MPEG and F4V. Audio is compressed separately, usually using the AAC (Advanced Audio Coding) standard.

How does the H.264 codec work?

An H.264 codec has two parts: an encoder and a decoder. The encoder handles prediction, transform and encoding tasks in order to compress a video file efficiently.

Prediction is where the encoder takes a single frame of a video at a time and breaks it into blocks called macroblocks. Block sizes vary by codec and by the method used to handle the blocks. In H.264, a macroblock covers 16x16 pixels and can be split into smaller partitions, down to 4x4 pixels. The encoder then guesses what's going to be in each block, using these techniques:

  • Intra prediction - the encoder uses pixels that have already been coded in the same image to guess what the neighboring pixels might be. For example, if you're in a shadowy part of the image, the encoder might guess that there are more dark pixels in that area. Blocks for this type of prediction are either 4x4 or 16x16 pixels.
  • Inter prediction - the encoder looks at a previous frame, subtracts whatever is the same, and keeps only the differences in the new frame for storage. This is called forming the residual. Blocks for this type of prediction can be anything from 4x4 to 16x16 pixels.
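
The idea of splitting a frame into blocks and forming a residual can be sketched in a few lines of Python. This is only an illustration, not how a real H.264 encoder is implemented: it assumes tiny grayscale frames stored as lists of lists, it predicts each block from the block at the same position in the previous frame (no motion search), and the function names are made up for this example.

```python
# Illustrative sketch only. Real encoders search for motion and choose
# between many block sizes; here the prediction is simply the co-located
# block of the previous frame. All function names are hypothetical.

def split_into_blocks(frame, size):
    """Yield (row, col, block) for each size x size block of a 2-D list."""
    for r in range(0, len(frame), size):
        for c in range(0, len(frame[0]), size):
            yield r, c, [row[c:c + size] for row in frame[r:r + size]]

def residual(current_block, predicted_block):
    """Per-pixel difference between the actual block and its prediction."""
    return [[cur - pred for cur, pred in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current_block, predicted_block)]

def inter_residuals(current, previous, size):
    """Map each block position to its residual against the previous frame."""
    previous_blocks = {(r, c): b for r, c, b in split_into_blocks(previous, size)}
    return {(r, c): residual(block, previous_blocks[(r, c)])
            for r, c, block in split_into_blocks(current, size)}

# Two 8x8 frames that are identical except for one pixel.
previous = [[10] * 8 for _ in range(8)]
current = [row[:] for row in previous]
current[5][6] = 50

residuals = inter_residuals(current, previous, 4)
nonzero = sum(1 for block in residuals.values()
              for row in block for value in row if value != 0)
print(nonzero)  # 1: only the changed pixel has to be stored
```

Because almost every residual value is zero, the data left over after prediction is much cheaper to compress than the original frame.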

The transform task is where the encoder takes the information it has saved, arranged as a matrix of values whose positions correspond to pixel positions in the image, and converts it into a matrix of coefficients that describe the same block in terms of waves. To do this, it uses a Fourier-related transform: an integer approximation of the discrete cosine transform (DCT). We won't go into too much detail here. Suffice it to say that the pixel values in a block can be represented as a weighted sum of cosine waves of different frequencies. This form is easier to store and transfer, because most of the information usually ends up in a small number of coefficients.
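
To make the cosine-wave idea concrete, here is a minimal sketch of a 2-D DCT on a 4x4 block. It uses the textbook floating-point DCT-II, not the integer approximation that H.264 actually specifies, and the function names are hypothetical.

```python
import math

def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence (textbook floating-point form)."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(values))
        out.append(scale * total)
    return out

def dct_2d(block):
    """Apply the 1-D DCT to every row, then to every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # Transpose back so coeffs[v][h] is vertical frequency v, horizontal h.
    return [list(row) for row in zip(*cols)]

# A perfectly flat 4x4 block: every pixel has the value 10.
flat_block = [[10] * 4 for _ in range(4)]
coeffs = dct_2d(flat_block)

# Only the top-left ("DC") coefficient is significant; the other 15 are
# zero up to floating-point rounding.
significant = [(v, h) for v in range(4) for h in range(4)
               if abs(coeffs[v][h]) > 1e-9]
print(significant)  # [(0, 0)]
```

Sixteen pixel values collapse into a single meaningful coefficient here. Real image blocks are less extreme, but smooth areas still tend to concentrate their information in a few low-frequency coefficients.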

What is the difference between an MP4 and an H.264 file?

MP4 is a container format and stands for MPEG-4 Part 14. It's the most popular video format in the world because it allows you to combine all sorts of media: video, audio, subtitles and images. Most devices will play an MP4, and today many sites choose MP4 above other formats, making it everyone's go-to choice. MP4 is different from H.264 because MP4 is a file format and H.264 is a codec. A file format describes what kinds of information a file can contain and how that information is organized. A codec describes how video or audio (or both) will be compressed and decompressed later for playback.
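
The container side can be illustrated with a short sketch. An MP4 file is organized as a sequence of "boxes", each starting with a 4-byte big-endian size and a 4-byte type code; the compressed H.264 data is just payload inside one of them. The example below builds a tiny synthetic byte string rather than reading a real file, the helper names are hypothetical, and it ignores the special size values 0 and 1 for brevity.

```python
import struct

def make_box(box_type, payload=b""):
    """Build one box: 4-byte big-endian size, 4-byte type, then payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def list_top_level_boxes(data):
    """Return (type, size) for each top-level box in a byte string."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8:  # sizes 0 and 1 have special meanings; not handled here
            break
        boxes.append((box_type.decode("ascii"), size))
        offset += size
    return boxes

# Synthetic stand-in for an MP4 file: a file-type box, an empty metadata
# box, and a media-data box holding 16 placeholder bytes.
sample = (
    make_box(b"ftyp", b"isom\x00\x00\x02\x00isomavc1")
    + make_box(b"moov")
    + make_box(b"mdat", b"\x00" * 16)
)

print(list_top_level_boxes(sample))
# [('ftyp', 24), ('moov', 8), ('mdat', 24)]
```

Nothing in this structure says how the video itself is compressed. That is the codec's job, which is why the same H.264 stream can live in an MP4, a MOV or another container.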

Which is better: H.264 or H.265?

To determine whether H.264 or H.265 is better, you must choose your criteria. For quality and efficiency, H.265 is better: it compresses more efficiently, so it can deliver comparable quality at a lower bitrate. For price, H.264 is better: there are more instances where you may use the H.264 codec free of charge for your video content, whereas H.265 generally must be paid for. Because of this, there is less support for H.265. If you want good quality and speed, and the ability to play back your video almost anywhere for free, choose H.264. Companies like Netflix and Hulu earn enough money that they can benefit from paying for H.265 to deliver a higher-quality, more efficient video experience.

New codecs are under development to replace both of these, with the goal of the codecs being more efficient, providing higher quality, and being free.
