What is H265/HEVC?

2013-10-31

Today’s tablets and mobile phones already ship with 1920x1080 (Full HD) screens. As technology keeps progressing in this direction, large television screens will clearly not stop at Full HD. This is where resolutions such as 2K, 4K, and 8K come into play.

When resolutions increase this much, new coding (codec) technologies inevitably emerge. The currently used H.264/AVC encoding technology has been sufficient for HD, but it is not expected to be efficient enough for 4K and 8K resolutions. It is therefore set to be replaced by H.265 (High Efficiency Video Coding, HEVC).

What is H265/HEVC?

H.265/HEVC is a video compression standard jointly developed by ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). In a nutshell, H.265 is an encoding technology that compresses video more efficiently than H.264/AVC and therefore needs less bandwidth for the same content.

Why do we need H265/HEVC?

The previous standard, H.264/AVC, was first published in 2003. Since then, it has been used in almost every area of digital video. With HD (High Definition) now widespread across devices and applications, higher resolutions demand more bandwidth and more storage space. Ultra HD resolutions (4K, 8K) will make that demand even more pressing. At the same time, today’s mobile phones and tablets have more processing power than the desktop computers of 2003, so they can afford more complex compression. Under these conditions, the emergence of more efficient compression technologies is inevitable.

How does H265/HEVC work?

Video compression technologies generally share the same structure, which we can examine in two parts: encoding and decoding. The following diagram shows the workflow during encoding and decoding.

As seen in the diagram, the encoding sequence proceeds as follows:

Partitioning: Dividing each picture into multiple units.
Prediction: Predicting each prediction unit using Inter or Intra prediction.
Transformation: Transforming and quantizing the residual (the difference between the original picture and the prediction).
Entropy coding: Entropy-coding the resulting coefficients, prediction information, and header data into the bitstream.

The decoding sequence is:

Entropy decoding: Entropy-decoding the bitstream and extracting the coded syntax elements.
Inverse transformation: Rescaling and inverse transforming the coefficients to recover the residual.
Prediction: Forming the prediction for each prediction unit and adding it to the inverse transform output.
Reconstruction: Reconstructing the decoded video picture.
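
The encode and decode sequences above can be illustrated with a deliberately tiny round trip on a single block. This is only a sketch under loud assumptions: the block mean stands in for a real Intra or Inter prediction, the transform and the entropy coder are skipped, and the names `encode_block` and `decode_block` are hypothetical, not part of any H.265/HEVC library.

```python
def encode_block(samples, step):
    """Toy encoder for one block: predict, take the residual, quantize."""
    # Assumption: the block mean stands in for a real intra/inter prediction.
    prediction = round(sum(samples) / len(samples))
    residual = [s - prediction for s in samples]
    # Quantization is the lossy step; the transform is skipped in this sketch.
    levels = [round(r / step) for r in residual]
    return prediction, levels  # a real encoder would entropy-code these


def decode_block(prediction, levels, step):
    """Toy decoder: rescale the levels and add the prediction back."""
    residual = [level * step for level in levels]
    return [prediction + r for r in residual]


original = [100, 102, 104, 106]
prediction, levels = encode_block(original, step=2)
reconstructed = decode_block(prediction, levels, step=2)
print(levels, reconstructed)
```

With a quantization step of 1 the block is reconstructed exactly; larger steps shrink the values that have to be coded at the cost of a reconstruction error of at most half a step per sample.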

Let’s examine the structure of H.265/HEVC in more detail:

Partitioning

H.265/HEVC comes with a highly flexible partitioning structure. Each picture (video frame) is divided into Coding Tree Units (CTUs) of up to 64x64 pixels, and the CTUs are grouped into slices and, optionally, rectangular tiles. The Coding Tree Unit (CTU) is the fundamental unit of coding and takes the place of the macroblock used in previous standards (MPEG-2, H.264/AVC).

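A Coding Tree Unit (CTU) is split into Coding Units (CUs) using a quadtree structure: a block is either kept whole or divided into four equal squares, and the same choice is then repeated for each of those squares. Each Coding Unit (CU) is then coded using either Inter or Intra Prediction. The following diagram helps to understand this structure better.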
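
A quadtree split of a CTU can be sketched as a short recursive function. Note the assumptions: a real encoder decides whether to split by comparing rate-distortion costs, whereas this sketch uses sample variance against an arbitrary `threshold` as a stand-in, and `split_ctu`, `min_size`, and `threshold` are hypothetical names chosen for illustration.

```python
def split_ctu(frame, x, y, size, min_size=8, threshold=100.0):
    """Recursively split a square block; return the leaf CUs as (x, y, size)."""
    samples = [frame[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    # Assumption: low variance means "simple enough to code as one CU".
    if size <= min_size or variance <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):          # visit the four quadrants
        for dx in (0, half):
            leaves += split_ctu(frame, x + dx, y + dy, half, min_size, threshold)
    return leaves


# A 64x64 CTU that is flat except for a checkerboard in its top-left quadrant.
ctu = [[255 if (x < 32 and y < 32 and (x + y) % 2) else 0 for x in range(64)]
       for y in range(64)]
cus = split_ctu(ctu, 0, 0, 64)
print(len(cus), "coding units")
```

The detailed quadrant is split all the way down to 8x8 blocks, while the three flat quadrants each stay as a single 32x32 CU, which is the behavior the quadtree is meant to allow.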

Prediction

Each Coding Unit (CU) is divided into one or more Prediction Units (PUs), which are predicted using Intra or Inter Prediction.

Intra Prediction: Each Prediction Unit (PU) is predicted from neighboring, already coded image data within the same picture. The available methods are DC Prediction (the average value of the neighbors), Planar Prediction (fitting a smooth surface to the PU), and directional Prediction (extrapolating the neighboring data along a chosen direction). H.265/HEVC offers 33 directional modes, compared with at most 8 in H.264/AVC. The diagram below shows the differences in the Intra structure used in H.264/AVC and H.265/HEVC.
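
The three kinds of Intra predictor can be sketched on a small block as follows. This is a simplified illustration, not the normative process: the planar formula is modelled on the weighted average used in H.265/HEVC, only the vertical direction is shown for directional prediction, reference sample smoothing and edge filtering are left out, and all function names are hypothetical. `top` and `left` hold `n + 1` neighboring samples each, where index `n` is the top-right or bottom-left neighbor.

```python
def predict_dc(top, left, n):
    """DC mode: every sample is the rounded average of the neighbors."""
    dc = (sum(top[:n]) + sum(left[:n]) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


def predict_vertical(top, n):
    """One directional mode: copy the row above straight down."""
    return [list(top[:n]) for _ in range(n)]


def predict_planar(top, left, n):
    """Planar mode: blend a horizontal and a vertical linear interpolation."""
    shift = n.bit_length()  # equals log2(n) + 1 when n is a power of two
    return [[((n - 1 - x) * left[y] + (x + 1) * top[n]
              + (n - 1 - y) * top[x] + (y + 1) * left[n] + n) >> shift
             for x in range(n)]
            for y in range(n)]


top = [10, 20, 30, 40, 50]    # 4 samples above the block plus the top-right one
left = [10, 12, 14, 16, 18]   # 4 samples to the left plus the bottom-left one
for row in predict_planar(top, left, 4):
    print(row)
```

An encoder would typically try several such predictors and keep the one whose residual is cheapest to code.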

Inter Prediction: Each Prediction Unit (PU) is predicted using Motion Compensation from one or more reference pictures (pictures that come before or after the current picture). The diagram below shows the Inter Prediction Quadtree structure.
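
On the encoder side, Motion Compensation needs a motion vector, which is usually found by a block-matching search. The sketch below is one simple way to do that, assuming an exhaustive integer-pixel search and the sum of absolute differences (SAD) as the matching cost. Real encoders use faster search patterns and sub-pixel interpolation, and `motion_search` is a hypothetical name.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between a block and a shifted reference."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def motion_search(cur, ref, bx, by, n, search_range):
    """Full search; returns (cost, dx, dy) of the best integer motion vector."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that would read outside the reference picture.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Reference picture with a unique gradient; the current picture is the
# reference shifted by 2 samples horizontally and 1 sample vertically.
ref = [[x * 7 + y * 13 for x in range(16)] for y in range(16)]
cur = [[ref[y + 1][x + 2] if (x + 2 < 16 and y + 1 < 16) else 0
        for x in range(16)] for y in range(16)]
print(motion_search(cur, ref, bx=4, by=4, n=4, search_range=3))
```

The search recovers the shift as the motion vector with a matching cost of zero, so the residual for this block would be empty.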

Transform and Quantization

H.264/AVC, as mentioned earlier, uses a macroblock structure of up to 16x16 pixels. H.265/HEVC, on the other hand, uses a hierarchical structure of Coding Units (CUs), Prediction Units (PUs), and Transform Units (TUs). The Transform Unit (TU) is the fundamental unit for Transform and Quantization. TUs come in block sizes of 4x4, 8x8, 16x16, and 32x32 pixels. The diagram below shows the relationship between Coding Units (CUs), Prediction Units (PUs), and Transform Units (TUs).

H.265/HEVC uses a Residual Quadtree (RQT) structure to split each CU into TUs. The residual data left after prediction is transformed with a block transform based on the Discrete Cosine Transform (DCT) or, for 4x4 Intra-predicted luma blocks, the Discrete Sine Transform (DST).
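
To make the Transform and Quantization step concrete, here is a small floating-point sketch of a 2D DCT followed by uniform quantization. It is an illustration only: H.265/HEVC specifies integer approximations of the DCT and a quantizer driven by a quantization parameter, neither of which is reproduced here, and the function names and the `step` parameter are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_dct(block):
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))   # C * X * C^T


def inverse_dct(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)  # C^T * Y * C


def quantize(coeffs, step):
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]


residual = [[10] * 4 for _ in range(4)]   # a flat 4x4 residual block
levels = quantize(forward_dct(residual), step=8)
print(levels)
```

For a flat block all of the energy ends up in the single top-left (DC) coefficient, which is why smooth residuals compress so well after the transform.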

Entropy Coding

A coded H.265/HEVC bitstream consists of quantized transform coefficients, prediction information (prediction modes and motion vectors), partitioning information, and other header data. Apart from the high-level headers, these elements are coded using Context Adaptive Binary Arithmetic Coding (CABAC). CABAC achieves high compression efficiency by selecting a probability model (context) for each binary symbol and updating that model as symbols are coded. The diagram below shows the block diagram of the CABAC method.
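
The benefit of an adaptive probability model can be shown without writing a full arithmetic coder. The sketch below is not CABAC: it keeps one floating-point probability estimate, updates it after every bit, and adds up the ideal code length of each bit, which is what an arithmetic coder approaches. Real CABAC uses binarization, many context models, and table-driven state updates. The class name and the `rate` parameter are hypothetical.

```python
import math


class AdaptiveBitModel:
    """One adaptive context: tracks the estimated probability of a 1 bit."""

    def __init__(self, p_one=0.5, rate=0.05):
        self.p_one = p_one
        self.rate = rate

    def cost(self, bit):
        """Ideal code length in bits for coding `bit` with the current model."""
        p = self.p_one if bit else 1.0 - self.p_one
        return -math.log2(p)

    def update(self, bit):
        """Move the estimate towards the bit that was just coded."""
        self.p_one += self.rate * ((1.0 if bit else 0.0) - self.p_one)
        # Keep the estimate away from 0 and 1 so the cost stays finite.
        self.p_one = min(0.99, max(0.01, self.p_one))


def estimated_bits(bits, rate=0.05):
    model = AdaptiveBitModel(rate=rate)
    total = 0.0
    for bit in bits:
        total += model.cost(bit)
        model.update(bit)
    return total


skewed = [0] * 100        # highly predictable input
balanced = [0, 1] * 50    # unpredictable for a single adaptive model
print(round(estimated_bits(skewed), 1), round(estimated_bits(balanced), 1))
```

Both inputs contain 100 binary symbols, but the predictable one needs far fewer bits once the model has adapted, which is the effect CABAC exploits.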

H.265/HEVC encoding is built from the stages explained above. Now let’s examine other features of H.265/HEVC.

Mode and Motion Vector Prediction

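Rather than signaling every prediction mode and motion vector in full, H.265/HEVC predicts them from the modes and motion vectors of previously coded neighboring units, so that only a candidate index or a small difference needs to be sent.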
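
The idea of predicting a motion vector from previously coded units can be sketched as follows. This is a loose illustration, not the candidate derivation defined by the standard: the candidate list is simply assumed to hold the motion vectors of already coded neighbors, and `encode_mv` and `decode_mv` are hypothetical names.

```python
def encode_mv(mv, candidates):
    """Pick the closest candidate; return its index and the remaining difference."""
    def distance(i):
        cx, cy = candidates[i]
        return abs(mv[0] - cx) + abs(mv[1] - cy)

    index = min(range(len(candidates)), key=distance)
    px, py = candidates[index]
    return index, (mv[0] - px, mv[1] - py)


def decode_mv(index, difference, candidates):
    """Rebuild the motion vector from the same candidate list."""
    px, py = candidates[index]
    return (px + difference[0], py + difference[1])


neighbors = [(0, 0), (5, -2)]   # assumed motion vectors of coded neighbors
index, difference = encode_mv((6, -2), neighbors)
print(index, difference, decode_mv(index, difference, neighbors))
```

Because encoder and decoder build the same candidate list, only the index and a small difference have to be written to the bitstream.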

Deblocking Filter

A filter is applied to Luma and Chroma samples at Transform Unit (TU) or Prediction Unit (PU) boundaries (these boundaries are aligned to an 8x8 grid). The strength of this filter is controlled by syntax elements signaled in the H.265/HEVC bitstream. The deblocking filter is designed to reduce the blocking artifacts that lossy compression causes at Block/Unit edges.
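
A one-dimensional sketch shows what the deblocking filter does across a single block edge. The offset formula is modelled on the normal-strength luma filter of H.265/HEVC, but the decision of whether to filter is heavily simplified here: comparing the step across the edge with `beta` is an assumption made for illustration, and `deblock_edge`, `tc`, and `beta` are used as plain parameters rather than values derived from the bitstream.

```python
def clip(value, low, high):
    return max(low, min(high, value))


def deblock_edge(row, edge, tc, beta):
    """Smooth the two samples on either side of a block edge at index `edge`."""
    p1, p0, q0, q1 = row[edge - 2], row[edge - 1], row[edge], row[edge + 1]
    out = list(row)
    # Assumption: a large step is treated as a real image edge and left alone.
    if abs(p0 - q0) >= beta:
        return out
    delta = clip((9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4, -tc, tc)
    out[edge - 1] = clip(p0 + delta, 0, 255)
    out[edge] = clip(q0 - delta, 0, 255)
    return out


# Two flat blocks of 4 samples with a small step between them.
row = [100, 100, 100, 100, 108, 108, 108, 108]
print(deblock_edge(row, edge=4, tc=4, beta=20))
```

The step of 8 between the two blocks is softened to a step of 2, while the same row with a small `beta` would be left untouched because the step would then count as genuine picture content.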

Sample Adaptive Offset

An optional filter that adjusts the decoded video frames to improve the appearance of smooth regions and object edges. The Sample Adaptive Offset (SAO) filter is a non-linear filter that uses lookup tables of offsets signaled in the H.265/HEVC bitstream.
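
One of the SAO modes, band offset, is easy to sketch: the sample range is divided into 32 equal bands, and an offset from a small table is added to samples that fall into a run of consecutive bands. The sketch below assumes that behavior and leaves out the edge offset mode and the way the table is signaled. `sao_band_offset`, `start_band`, and `offsets` are hypothetical names.

```python
def sao_band_offset(samples, start_band, offsets, bit_depth=8):
    """Add offsets[i] to every sample that falls into band start_band + i."""
    shift = bit_depth - 5              # 32 bands across the sample range
    max_value = (1 << bit_depth) - 1
    out = []
    for sample in samples:
        index = (sample >> shift) - start_band
        if 0 <= index < len(offsets):  # table lookup for the affected bands
            sample = max(0, min(max_value, sample + offsets[index]))
        out.append(sample)
    return out


decoded = [10, 64, 70, 80, 200]
print(sao_band_offset(decoded, start_band=8, offsets=[2, -1, 3, 0]))
```

For 8-bit video each band covers 8 sample values, so with `start_band=8` the table applies to samples from 64 to 95 and everything else passes through unchanged.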

Parallel Processing

H.265/HEVC includes several features useful for decoders with parallel processing capabilities. These features include:

Tiles: Rectangular regions of a picture that can be decoded largely independently of one another.
Wavefront Parallel Processing (WPP): A coding mode in which a new row of Coding Tree Units (CTUs) can start decoding as soon as the first two CTUs of the previous row have been decoded, so that several rows can be processed in parallel.
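
The effect of WPP can be seen by computing when each CTU may start. The sketch assumes one thread per CTU row, every CTU taking exactly one time unit, and a CTU depending on its left neighbor and on the CTU above and to the right. `wavefront_schedule` is a hypothetical helper, not part of any decoder API.

```python
def wavefront_schedule(rows, cols):
    """Earliest start time of each CTU under the assumptions stated above."""
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = []
            if c > 0:
                ready.append(start[r][c - 1] + 1)   # left neighbor finished
            if r > 0:
                above_right = min(c + 1, cols - 1)
                ready.append(start[r - 1][above_right] + 1)
            start[r][c] = max(ready, default=0)
    return start


schedule = wavefront_schedule(rows=4, cols=8)
for row in schedule:
    print(row)
print("parallel steps:", schedule[-1][-1] + 1, "sequential steps:", 4 * 8)
```

Each row starts two steps after the row above it, so under these assumptions the 32 CTUs finish in 14 steps instead of 32.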

Profiles, Levels, and Tiers

A profile defines the subset of H.265/HEVC coding tools that a decoder must support. The combination of level and tier defines the maximum decoder processing capabilities regarding image size, encoded samples per second, bit rate, and buffering.

Finally, when we compare H.265/HEVC with the previous coding technique, H.264/AVC, we observe an efficiency gain of 35%-40% in applications. This can be seen more clearly in the following diagrams.
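
As a rough worked example of what such a gain means, the snippet below converts an H.264/AVC bit rate into the corresponding H.265/HEVC range. It assumes that the 35%-40% figure is read as a bit rate saving at comparable quality, and the 8000 kbit/s input is an arbitrary example value, not a measurement.

```python
def hevc_bitrate_range(avc_kbps, low_saving_pct=35, high_saving_pct=40):
    """Return the (lowest, highest) expected HEVC bit rate in kbit/s."""
    lowest = avc_kbps * (100 - high_saving_pct) / 100
    highest = avc_kbps * (100 - low_saving_pct) / 100
    return lowest, highest


lowest, highest = hevc_bitrate_range(8000)
print(f"{lowest:.0f} to {highest:.0f} kbit/s")
```

Under that reading, a stream that needs 8000 kbit/s with H.264/AVC would need roughly 4800 to 5200 kbit/s with H.265/HEVC.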

In this article, I have tried to share information about H.265/HEVC, and I hope it has been useful to you. I was unsure how to translate some of the technical terms into Turkish, so feedback on this would be valuable to me. See you in another article.