Why is MPEG-H Audio Gaining Momentum?

By Thomas Kramer, VP Business Development and Strategy, MainConcept, and Adrian Murtaza, Sr. Manager Technology and Standards, Fraunhofer IIS



MPEG-H Audio has been around for several years, having first been demonstrated by Fraunhofer IIS in 2014. It quickly gained traction and was integrated into a wide range of consumer devices. More recently, the format has gained significant momentum after being judged the most advanced audio system in a strict and detailed evaluation conducted by an independent test laboratory under the SBTVD Forum’s supervision.

MPEG-H Audio has been used in South Korea since 2017 for ATSC 3.0 broadcasts, and Brazil adopted it in 2019 to enhance its existing terrestrial HDTV services. In 2023, major Brazilian broadcasters, including Globo, Rede Amazonica, and TV Cultura, enabled it in their regular broadcast services. MPEG-H Audio is the only mandatory audio system for Brazil’s next-generation TV 3.0 transmission service, due to launch in 2025. It has also been included in a number of other global standards, including DVB, ATSC 3.0, and 3GPP.

So, why the increased interest in the object-based system? The broadcast industry is changing fast in response to shifting consumption habits. While MPEG-H Audio brings a number of important advantages and enhancements to this evolving landscape, we think a few standout trends are contributing to the format’s popularity:

Live sports are becoming more interactive

According to Altman Solon, global sports viewing has increased significantly over the last couple of years. The survey found that 57% of respondents are watching more sports now, compared to 43% in 2020. However, the way in which sports are consumed has also changed dramatically. While live viewing remains the most common way to watch, there is a definite shift, especially among younger viewers, who are more likely to get their sports content in small snippets and highlights on social media than to watch an entire live match or game.

The challenge for sports content providers is to find ways to entice those fans back to live viewing, and one way to do that is by making the experience more immersive. The same survey showed, for example, that 30% of UK respondents would watch live games in VR as if they were inside their favorite stadium or arena. Meanwhile, Deloitte found that fans want more features in their SVOD services to enhance their sports-viewing experience: 35% want real-time stats and analytics, and 34% are looking for different camera angles. When asked about the future of sports consumption, 54% of respondents believed it would be more immersive.

What does that all mean for audio? Quite simply, a video experience cannot be truly immersive without great audio. More than that, MPEG-H Audio allows personalized immersion in a way that takes it to another level. The object-based approach means that fans can choose exactly what they want to be immersed in by selecting one of several options offered by the broadcaster. Viewers might turn the commentary down, or even off, and turn up the fan noise, for example, to feel like they are in the stadium. At the same time, for fans wearing headphones, head movements can be tracked so the sound remains pinned to the point of origin, making for a much more natural experience.
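To make the object-based idea concrete, here is a minimal sketch of how a receiver might mix separate audio objects using per-object gain metadata and a viewer's preferences. It is an illustration only, not the MPEG-H Audio decoder or any real API: the `AudioObject` class, the `render_mix` function, the gain limits, and the sample values are all hypothetical names and numbers chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical audio object: mono samples plus illustrative gain metadata."""
    label: str
    samples: list[float]
    default_gain_db: float = 0.0
    min_gain_db: float = -60.0  # assumed lower limit set by the broadcaster
    max_gain_db: float = 6.0    # assumed upper limit set by the broadcaster


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def render_mix(objects: list[AudioObject], user_gains_db: dict[str, float]) -> list[float]:
    """Sum all objects into one mono mix, applying the viewer's gain choices.

    Each requested gain is clamped to the range carried in the object's metadata,
    so the viewer can only adjust within the limits the broadcaster allows.
    """
    length = max(len(obj.samples) for obj in objects)
    mix = [0.0] * length
    for obj in objects:
        requested = user_gains_db.get(obj.label, obj.default_gain_db)
        gain_db = min(max(requested, obj.min_gain_db), obj.max_gain_db)
        gain = db_to_linear(gain_db)
        for i, sample in enumerate(obj.samples):
            mix[i] += gain * sample
    return mix


# A viewer turns the commentary down by 20 dB and the crowd up by 6 dB.
commentary = AudioObject("commentary", [0.5, 0.5, 0.5], max_gain_db=0.0)
crowd = AudioObject("crowd", [0.1, 0.2, 0.1])
stadium_mix = render_mix([commentary, crowd], {"commentary": -20.0, "crowd": 6.0})
```

Because the commentary and crowd arrive as separate objects rather than a single pre-mixed track, the balance can be decided at playback time instead of in the production truck. A real renderer would also handle spatial positioning, loudness, and presets, which this sketch leaves out.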

Live contribution is challenging, especially in an immersive world

Live contribution comes with a number of challenges, which are even more pronounced when you add immersive experiences into the mix. Content providers need to capture multiple video and audio feeds, add graphics and commentary, and contribute everything in real time for distribution to viewers across the globe. At the same time, they need to deal with bandwidth limitations and latency while maintaining consistent audio quality. During live sporting matches, capturing the right audio for viewers at home, without it being drowned out by other sounds in the stadium, can be extremely challenging. Getting that wrong leads to a less-than-ideal experience for the consumer at home.

By capturing highly directional audio sources and creating objects with associated metadata, MPEG-H Audio gives viewers at home control over the audio balance, creating a personalized immersive experience with clear sound on the things they want to hear. It also uses efficient compression techniques, so that all of the audio objects can be delivered even when bandwidth is limited, while maintaining extremely high sound quality.

Media workflows are moving to the cloud

While the initial surge in popularity of cloud-based media workflows was driven by necessity, the benefits that media companies gain from moving to the cloud make it clear that it is here to stay. As more media workflows become cloud-based, broadcasters and content providers are looking to make everything work seamlessly within the cloud. At the same time, quality remains paramount.

MPEG-H Audio enables content providers to efficiently contribute the audio and its tightly coupled metadata directly into the cloud without losing quality. This is achieved using an MPEG-H contribution encoder on premises during a live event. It encodes the MPEG-H Production Format and makes it available in the cloud using typical protocols such as Zixi or SRT. Once ingested, it is fed into streaming encoders, such as the MainConcept Live Encoder, where it can be processed for distribution.


The future of immersive and personalized audio

Viewing experiences are becoming more immersive and interactive. It is not enough to have immersive video; content providers need to ensure they have the audio to match and provide a way for viewers to interact with the content. While virtual reality and live sports are leading the way, we will likely see immersive and personalized experiences across multiple genres, from live music to blockbuster movies. MPEG-H Audio will be key to enabling that.

Earlier this year, support for MPEG-H Audio was added in MainConcept’s Live Encoder and FFmpeg plugins, enabling the encoding, contribution, and streaming of content with personalized and immersive audio for multiple devices. This has proven especially interesting for businesses delivering live events, such as sports and concerts.

Visit us at booth #44 at SET Expo in Sao Paulo, Brazil from August 8 to 10 to hear it in action and find out more.


If you have any questions, suggestions, or if you need further information, please do not hesitate to contact us.