Video Encoding Best Practices: 6 Practical Tips for Optimizing Latency, Bandwidth, and Picture Quality
Trying to get the most out of your video encoder? When it comes to video encoding best practices, the first step is a clear understanding of your target application and what you’re trying to achieve. Every use case is unique: perhaps you need pristine 4K video for an important broadcast event, ultra-low latency for bi-directional interviews, or a stream that fits within limited bandwidth. Whatever your use case, getting the results you’re looking for often requires a delicate balancing act, with tradeoffs between picture quality, bitrate, and latency.
To help, we’re sharing 6 practical tips to optimize your encoder settings and deliver video with complete confidence.
Tip #1 Choose HEVC
Whenever possible, choose HEVC/H.265 encoding. Although many broadcast workflows rely on the AVC/H.264 codec for video compression, HEVC offers nearly twice the bandwidth efficiency. A 5 Mbps HD stream in HEVC will deliver better picture quality than the same stream at 5 Mbps in H.264. HEVC also helps when bandwidth is limited: a 3 Mbps HD stream in HEVC will be comparable to a 5 Mbps stream in H.264. Some video encoders offer a choice of encoding profiles for different levels of picture quality. In the Makito X4 video encoder, for example, there is a range of profiles available for both HEVC/H.265 and AVC/H.264, with 8-bit or 10-bit pixel depths and 4:2:0 or 4:2:2 chroma subsampling. It’s worth noting that higher profiles generally result in higher bitrates and demand more resources to decode than lower ones.
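To put the bitrate comparison above into numbers, here is a minimal Python sketch. It uses only the 3 Mbps and 5 Mbps figures from the example; the helper names are hypothetical, and real-world savings will vary with content and encoder settings.

```python
def bitrate_savings_percent(reference_mbps: float, reduced_mbps: float) -> float:
    """Percentage of bandwidth saved by moving from reference_mbps to reduced_mbps."""
    return (reference_mbps - reduced_mbps) / reference_mbps * 100.0


def gigabytes_per_hour(bitrate_mbps: float) -> float:
    """Data volume of one hour of streaming, in decimal gigabytes."""
    megabits = bitrate_mbps * 3600      # 3600 seconds in an hour
    return megabits / 8 / 1000          # megabits -> megabytes -> gigabytes


h264_mbps = 5.0   # H.264 HD stream from the example above
hevc_mbps = 3.0   # comparable HEVC stream from the example above

savings = bitrate_savings_percent(h264_mbps, hevc_mbps)
print(f"HEVC saves about {savings:.0f}% of the bandwidth")
print(f"H.264: {gigabytes_per_hour(h264_mbps):.2f} GB per hour")
print(f"HEVC:  {gigabytes_per_hour(hevc_mbps):.2f} GB per hour")
```

For a stream that runs for hours, that difference in bitrate adds up to a noticeable difference in total data transferred.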
Tip #2 Pay Attention to GOP Size and Framing
A GOP, or Group of Pictures, is a sequence of video frames grouped together for encoding in H.264 or HEVC. Hand in hand with GOP length is framing. The I-frame (intra frame), also known as the key frame, is encoded on its own and is the main reference for the subsequent P and B frames in a GOP. I-frames contain the most data. P-frames contain only the differences from a preceding reference frame (an I-frame or another P-frame), and B-frames reference both preceding and following frames, resulting in even more efficient compression. Choosing the right combination and number of I, P, and B frames is key to optimizing video quality. For the Makito X4, you can choose from the following 6 options:
I: I frames only (highest quality, least bandwidth-efficient)
IP: I and P frames only (high picture quality, efficient compression)
IBP: I, B and P frames
IBBP: I, BB (two B frames in sequence) and P frames
IBBBP: I, BBB (three B frames in sequence) and P frames
IBBBBP: I, BBBB (four B frames in sequence) and P frames (highest latency; highest bitrate efficiency)
The GOP length is the number of frames from one I-frame to the next. Increasing the GOP length means fewer I-frames in a given period of time, which reduces bandwidth consumption. For extremely complex, fast-moving subjects such as water sports or other action scenes, you’ll want a shorter GOP length, such as 15 or below, to maintain excellent video quality. For more static video such as talking heads, much longer GOP sizes are not only sufficient but also more efficient. The larger the GOP size, the more efficient the compression and the less bandwidth you will need.
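The relationship between framing, GOP length, and how often an I-frame occurs can be sketched in a few lines of Python. This is a simplified illustration, not encoder code: the function names are hypothetical, frames are shown in display order, the 30 frames-per-second rate is an assumed example value, and real encoders may order or close GOPs differently.

```python
def gop_frame_types(structure: str, gop_length: int) -> str:
    """Return the frame types of one GOP in display order.

    structure is one of the framing options listed above, e.g. "IP" or "IBBP".
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    if not structure.startswith("I") or set(structure) - set("IBP"):
        raise ValueError("structure must start with 'I' and use only I, B, P")
    repeating = structure[1:]            # e.g. "BBP" for "IBBP"
    if not repeating:                    # "I" option: every frame is an I-frame
        return "I" * gop_length
    frames = ["I"]                       # each GOP starts with one I-frame
    for i in range(gop_length - 1):      # fill the rest with the repeating part
        frames.append(repeating[i % len(repeating)])
    return "".join(frames)


def keyframe_interval_seconds(gop_length: int, frames_per_second: float) -> float:
    """Time between the start of one GOP and the start of the next."""
    return gop_length / frames_per_second


print(gop_frame_types("IBBP", 10))           # IBBPBBPBBP
print(gop_frame_types("IP", 5))              # IPPPP
print(keyframe_interval_seconds(15, 30.0))   # 0.5 (assumed 30 fps source)
```

With these assumptions, a GOP length of 15 places an I-frame every half second, while a GOP length of 120 places one every four seconds, which is why longer GOPs spend less of the bitrate on I-frames.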
Tip #3 Enable Network Adaptive Encoding
In a perfect world, network congestion would never be an issue, but unfortunately, such limitations still exist. IP networks, especially the public internet, can be unreliable. Whenever your bandwidth is not guaranteed, consider using Network Adaptive Encoding (NAE) if your encoder supports it. NAE reduces the risk of a stream failing, even during significant fluctuations in network bandwidth. When a change in network conditions is detected, NAE dynamically adjusts the compression level of the encoded stream in real time. This keeps video streams running without interruption, at the best quality the network can support at any given moment. Read our white paper to understand how NAE uses available bandwidth to maximize your video quality, whatever the network conditions.
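To illustrate the general idea of adapting the encoder bitrate to the network, here is a toy Python sketch. It is not Haivision’s NAE algorithm, which this article does not describe. The function name, the 80% headroom, the 10% ramp-up step, and the bandwidth readings are all assumptions chosen purely for illustration.

```python
def next_bitrate_kbps(current_kbps: float, available_kbps: float,
                      min_kbps: float, max_kbps: float,
                      headroom: float = 0.8, step_up: float = 1.1) -> float:
    """Pick the next encoder bitrate from an estimate of available bandwidth.

    Hypothetical policy: stay under a fraction (headroom) of the estimated
    bandwidth, back off immediately when above it, and ramp up gradually.
    """
    ceiling = available_kbps * headroom
    if current_kbps > ceiling:
        proposed = ceiling                             # back off right away
    else:
        proposed = min(current_kbps * step_up, ceiling)  # recover slowly
    return max(min_kbps, min(max_kbps, proposed))      # respect configured limits


# Simulated bandwidth estimates in kb/s (made-up values for this example).
bitrate = 6000.0
for estimate in (10000, 10000, 5000, 3000, 9000, 9000):
    bitrate = next_bitrate_kbps(bitrate, estimate, min_kbps=1000, max_kbps=8000)
    print(f"estimated bandwidth {estimate:>5} kb/s -> encoder bitrate {bitrate:.0f} kb/s")
```

The design choice worth noting is the asymmetry: dropping the bitrate quickly protects the stream when bandwidth falls, while raising it slowly avoids overshooting a network that has only just recovered.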
Tip #4 Don’t Forget Audio
When calculating the bitrate for your video stream, don’t forget to include audio. Although not as intensive as video, audio bitrates can quickly add up, especially when including multiple audio channels. The Makito X4 video encoder can encode up to 32 channels of audio in channel pair groups, and the audio bitrate can be configured anywhere from 80 to 320 kb/s. Depending on your application, you may not need pristine audio, in which case 96 or 128 kb/s may be sufficient. When audio quality is paramount, consider 192 kb/s or above. Whatever the case, don’t forget to include audio within your overall video streaming bitrate budget.
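A quick way to see how audio adds up is to total the budget in a few lines of Python. This sketch assumes that the configured audio bitrate applies to each channel pair, which the text above does not state explicitly, and it uses a 6,000 kb/s video bitrate purely as an example. The function name is hypothetical.

```python
def total_stream_kbps(video_kbps: float, audio_channels: int,
                      kbps_per_pair: float) -> float:
    """Video bitrate plus audio bitrate, with audio encoded in channel pairs.

    Assumption: kbps_per_pair is the bitrate of one encoded channel pair.
    """
    if audio_channels < 0 or audio_channels % 2:
        raise ValueError("audio_channels must be a non-negative even number")
    pairs = audio_channels // 2
    return video_kbps + pairs * kbps_per_pair


video_kbps = 6000  # example video bitrate, not a recommendation
for channels, rate in [(2, 128), (8, 192), (32, 320)]:
    total = total_stream_kbps(video_kbps, channels, rate)
    audio = total - video_kbps
    print(f"{channels:>2} channels at {rate} kb/s per pair: "
          f"audio {audio:.0f} kb/s, total {total:.0f} kb/s")
```

Under these assumptions, a single stereo pair at 128 kb/s is barely noticeable next to the video, but 32 channels at 320 kb/s per pair add 5,120 kb/s, nearly as much as the video itself.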
Tip #5 Consider Chroma Subsampling and Pixel Depth
Depending on your application, you can choose different combinations of chroma subsampling and pixel depth to deliver the color precision you need, maintain color fidelity, and prevent artifacts in downstream workflows. By reducing the amount of color information in a video signal, chroma subsampling allows picture clarity to be maintained while reducing the amount of video data by up to 50%. Common chroma subsampling formats include:
4:4:4 – no chroma subsampling; luminance and color data are both carried at full resolution.
4:2:2 – has half of the chroma of 4:4:4 and reduces the bandwidth of an uncompressed video signal by one-third with little to no visual difference.
4:2:0 – has one-quarter of the chroma of 4:4:4 and reduces the bandwidth of an uncompressed video signal by half compared to no chroma subsampling.
Consumer television sets typically display video content in 4:2:0, as the visual difference is indiscernible to viewers. However, some broadcast engineers prefer working with 4:2:2 video sources, as this can prevent artifacts when producing content for final playout.
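The one-third and one-half figures in the list above follow directly from counting samples. The Python sketch below reads each format as J:a:b over a block 4 pixels wide and 2 rows high, with one luma sample per pixel and, for each of the two chroma components, a samples in the first row plus b in the second. The function names are hypothetical, and the result applies to uncompressed data only.

```python
from fractions import Fraction


def samples_per_4x2_block(j: int, a: int, b: int) -> int:
    """Total luma and chroma samples in a block j pixels wide and 2 rows high."""
    luma = 2 * j               # one luma sample per pixel, two rows
    chroma = 2 * (a + b)       # two chroma components (Cb and Cr)
    return luma + chroma


def relative_size(scheme: str) -> Fraction:
    """Uncompressed data size of a scheme relative to 4:4:4."""
    j, a, b = (int(part) for part in scheme.split(":"))
    return Fraction(samples_per_4x2_block(j, a, b),
                    samples_per_4x2_block(4, 4, 4))


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    size = relative_size(scheme)
    print(f"{scheme}: {size} of the 4:4:4 data, {1 - size} saved")
```

Running this gives 2/3 of the data for 4:2:2 (one-third saved) and 1/2 for 4:2:0 (half saved), matching the list above.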
Pixel depth, or bit depth, is the number of bits used to store each color component (such as red, green, and blue) of a pixel, and it determines how many shades of each color can be represented. While the majority of delivered video content is 8-bit, 10-bit video is preferable for 4K or HDR broadcast contribution workflows. 10-bit video greatly expands the range of colors available (from 16.8 million to 1.07 billion hues), enabling more options and control over how content is processed, preventing visible banding effects, and allowing color correction on the fly. However, using 10-bit video rather than the conventional 8-bit does increase bandwidth.
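The color counts quoted above come from simple powers of two, as this short Python sketch shows. It assumes three color components per pixel; the function names are hypothetical, and the size increase applies to uncompressed samples, so the effect on an encoded stream’s bitrate will differ.

```python
def color_count(bits_per_component: int, components: int = 3) -> int:
    """Number of distinct colors representable at a given bit depth."""
    return 2 ** (bits_per_component * components)


def raw_size_increase_percent(from_bits: int, to_bits: int) -> float:
    """Growth in uncompressed sample size when moving between bit depths."""
    return (to_bits - from_bits) / from_bits * 100.0


print(f"8-bit:  {color_count(8):,} colors")     # 16,777,216
print(f"10-bit: {color_count(10):,} colors")    # 1,073,741,824
print(f"Uncompressed samples grow by {raw_size_increase_percent(8, 10):.0f}%")
```

Each component goes from 256 shades at 8 bits to 1,024 shades at 10 bits, which is what smooths out gradients and reduces visible banding.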
Tip #6 Leverage the Power of SRT
Pioneered by Haivision, Secure Reliable Transport (SRT) is an open-source, low-latency transport protocol that optimizes streaming performance across “noisy”, unpredictable networks such as the public internet. Successfully streaming video over the internet without compromising picture quality requires some form of error correction, as part of the streaming protocol, to recover from packet loss. All types of error correction introduce latency, but some more than others. SRT uses ARQ (Automatic Repeat reQuest) error correction, which retransmits lost packets, to recover from packet loss while introducing less latency. Included for free within the Haivision ecosystem of video solutions, SRT helps maintain quality of service when faced with issues such as packet loss, jitter, latency, and fluctuating bandwidth. SRT provides end-to-end security and resiliency, along with dynamic endpoint adjustment based on real-time network conditions, to deliver the best possible video quality over the worst networks. By choosing SRT as your streaming protocol, you can select the latency allowed for error correction, turn on AES encryption if needed, and set a bandwidth overhead to leave room for retransmitting packets lost to link problems.
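When planning for the bandwidth overhead mentioned above, it helps to work out how much link capacity the stream may need once retransmissions are allowed for. The Python sketch below treats the overhead as a percentage on top of the stream bitrate. The function name is hypothetical, and the 6,000 kb/s stream and 25% overhead are illustrative assumptions, not recommendations from this article.

```python
def link_budget_kbps(stream_kbps: float, overhead_percent: float) -> float:
    """Bandwidth to reserve for a stream plus its retransmission overhead."""
    if overhead_percent < 0:
        raise ValueError("overhead_percent must not be negative")
    return stream_kbps * (1 + overhead_percent / 100.0)


stream_kbps = 6000        # example combined video and audio bitrate
overhead_percent = 25     # example overhead setting (assumption)

budget = link_budget_kbps(stream_kbps, overhead_percent)
print(f"Reserve about {budget:.0f} kb/s for a {stream_kbps} kb/s stream "
      f"with {overhead_percent}% overhead")
```

If the available link cannot carry the stream plus its overhead, lowering the encoder bitrate is usually a safer choice than removing the headroom needed to recover lost packets.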
Get the Balance Right
Achieving a higher quality viewing experience for the end-user usually means higher resolutions and frame rates and therefore higher bandwidth requirements. While new technology and advanced codecs strive to improve latency, finding the right balance will always be important. Ultimately, the individual targeted use case will determine the best balance within the triangle of bandwidth, picture quality, and latency. For applications where latency is critical, such as video surveillance and ISR, picture quality can often be traded in favor of minimal latency. However, for use cases where pristine broadcast-quality video matters, latency can be increased slightly in order to support advanced video processing and error correction. When you deliver the optimal combination of bandwidth efficiency, high picture quality, and low latency, viewers can enjoy a great live experience over any network, with no spoilers.
Interested in Learning More?
Watch how easy it is to set up a live 4K video stream with the ultra-low latency Makito X4 video encoder.