MPEG-H Audio Academy

Take a deep dive into the MPEG-H Audio universe and explore tools for every stage of your journey

Learn all about MPEG-H Audio

Your journey starts here. Our tutorials and webinars are the ideal way to start working with MPEG-H Audio and to experience first-hand the multi-layered production opportunities it offers. Tailored learning material guides you through the sophisticated production process.

MPEG-H Learning in Academy

Authoring Personalized Immersive Audio

MPEG-H Audio is integrated into industry-leading workstations. And of course, there is our own MPEG-H Authoring Suite (MAS). Its tools make the production of MPEG-H Audio content easier, faster, more intuitive, and more powerful. They support the recently published MPEG-H ADM Profile as well as binaural monitoring for immersive audio reproduction over headphones.


Download

Register and get the tools of the trade for free!


Learn

Learn about MPEG-H Audio creation with industry-leading workstations and our tools.


Play

Demo content for your object-based MPEG-H Audio creation.

Read the Blog

Keep up to speed with the latest news and developments around MPEG-H Audio.

Test Signals

MPEG-H Signal Tests

Creating a suitable environment is indispensable for a successful workflow. Setting up your speakers correctly is therefore crucial for MPEG-H Audio to unfold its full potential. The setup guide, channel identifications and various technical notes set your work environment up for success.

Loudness Normalization and Target Loudness

The perfect volume for the playback of media content is a highly individual issue. A seamless experience where all content is played back at a similar loudness level is not. Make sure your loudness normalization works just as it should to deliver the best possible experience.

Older audio codecs are known for delivering content at varying levels of loudness. Even though a “louder” codec might initially be perceived as performing better, the disparity in playback loudness can interrupt the experience by forcing users to adjust the volume. That’s why advanced codecs such as MPEG-H Audio contain loudness information. It is the basis for evening out content pieces recorded at different loudness levels and ensures a seamless experience where users don’t have to adapt the volume of each content piece manually.

Use the test signals provided below to check if the loudness normalization of your MPEG-H Audio, AC-3, and E-AC-3 content works properly. Unlike these three codecs, AAC-LC does not include loudness information. Our test setup uses this fact to help you determine the target loudness of a playout device by comparing the normalized loudness level of MPEG-H Audio, AC-3, and E-AC-3 with the non-normalized one of AAC-LC.
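The comparison described above comes down to simple arithmetic, sketched below in Python. This is an illustration only. It assumes that you know the program loudness of the AAC-LC test signal and that you measured the output level of both signals on the same device without touching the volume. The function name, parameter names and all numeric values are hypothetical placeholders and are not taken from the test files.

```python
def estimate_target_loudness(aac_program_loudness_lkfs,
                             aac_measured_db,
                             normalized_measured_db):
    """Estimate the target loudness of a playout device.

    AAC-LC is played back without loudness normalization, so its measured
    output level corresponds to its known program loudness. A codec that
    carries loudness information is played back at the device's target
    loudness. The level offset between the two measurements is therefore
    the offset between target loudness and AAC-LC program loudness.
    """
    offset_db = normalized_measured_db - aac_measured_db
    return aac_program_loudness_lkfs + offset_db


# Placeholder values: AAC-LC signal assumed to be at -16 LKFS, measured
# at -20 dB; the MPEG-H signal measured at -28 dB on the same device.
print(estimate_target_loudness(-16.0, -20.0, -28.0))  # -24.0
```

If the normalized signals of all three codecs yield the same estimate, the device applies the same target loudness to each of them.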

Loudness Test Signals

AAC-LC

56 MB

AC-3

61 MB

E-AC-3

62 MB

MPEG-H

53 MB

TV Audio Setup

While setting up a home entertainment system is a breeze for gear heads, it may be a bit overwhelming for tech novices. With the clear instructions in this short PDF guide, users will be plugging the right cables into the right sockets in no time. Easy-to-follow steps provide the information needed for any configuration.

MPEG-H TV and Speaker Setup

Publications

Learn more about MPEG-H Audio in practice from the publications Fraunhofer IIS released and contributed to. They cover all relevant topics from standardisation issues to technical reports and scientific papers.

FAQ

MPEG-H Audio is a new, next-generation audio technology providing more realism through sound from above as well as around the listener. With its unique personalization features, MPEG-H Audio offers viewers great flexibility to actively engage with the content and adapt it to their own preferences. Regardless of the device, the MPEG-H Audio System delivers the best sound experience possible.

MPEG-H Audio is a complete audio solution and much more than just a codec. Among other things, it offers the following major advantages over legacy audio codecs:

1) Immersive Sound: MPEG-H Audio allows the transmission of three-dimensional immersive audio (3D-audio) by adding elevated sound sources above and below the listener's position. MPEG-H Audio has been specifically designed for flexible loudspeaker signaling including traditional layouts such as stereo, 5.1 and 7.1, as well as 3D configurations, namely 5.1+4H, 7.1+4H or 22.2, or even yet-to-be-defined layouts. Within MPEG-H Audio, immersive sound can be carried as channels, objects or as a combination of those.

2) Interactive and Personalized Sound: MPEG-H Audio enables the listener to interact with the content and create personalized audio experiences. The advanced interactivity options range from simple adjustments, for example, increasing or decreasing the dialogue level in relation to other audio elements, to advanced scenarios in which audio elements may be selected and adjusted in level and/or position as preferred by the listener and under the limits authored by the content creator.

3) Universal Delivery: MPEG-H offers flexibility by delivering the same bit stream through different distribution platforms (e.g., terrestrial, satellite, broadband or mobile networks) to all types of devices (e.g., TV set, AVR, soundbar, set-top box, tablet, virtual reality gear with 360-degree video) in various environments, for example, living room, home theater, or noisy mobile environments.

MPEG-H Audio is an international standard developed by the ISO/IEC Moving Picture Experts Group (MPEG), the organisation which has a long history in audio coding with mp3 and the AAC codec family. The MPEG-H Audio standard (ISO/IEC 23008-3) specifies two relevant profiles – Low Complexity (LC) and Baseline (BL) – essential for the broadcast and streaming industry, which allow decoding and rendering of immersive, 3D-audio content while enabling advanced personalization features. Audio objects may be used alone or in combination with channels for efficient delivery and reproduction of immersive sound. The use of these audio objects allows for interactivity or personalization of a program by adjusting the gain or position of the objects during playback. Details about the MPEG-H Audio standard can be found here.

MPEG-H Audio is a complete audio solution. It does not use other audio codecs. Instead, its codec functionality builds upon the developments from previous generations of MPEG audio codecs such as the AAC codec family.

MPEG-H Audio enriches the audio experience by combining immersive sound and advanced personalization options with bit-rate-efficient, universal delivery to meet the needs of today’s consumers.

The MPEG-H Audio System has proven to be the most advanced audio solution for enhancing the broadcast and streaming services for sport events, empowering the audience to experience the emotion of the sports arena in their living room and to decide what is more important to them, for example, listening only to the crowd of their favorite team or focusing on the commentary. Read more here.

Similarly to sport events, streaming of live concerts is another major use case where service providers are eager to enhance their services with immersive sound and interactivity options. Read more here and here.

The advanced accessibility features of the MPEG-H Audio system are essential for the elderly and visually or hearing impaired audience. With its Dialog Enhancement and advanced Audio Description Services, MPEG-H Audio makes broadcast audio more accessible for all viewers.

MPEG-H was adopted in several broadcast, streaming and virtual reality standards. A list can be found here.

MPEG-H Audio powers the music format 360 Reality Audio, initiated by Sony. The first 360 Reality Audio immersive music streaming services from Amazon Music HD, Deezer, nugs.net, Sony Select and TIDAL launched in fall 2019, with currently more than 3000 songs available. Major labels supporting the 360RA initiative include Sony Music Entertainment, Universal Music, and Warner Music.

The MPEG-H Audio System is used as the sole audio system in the world’s first terrestrial UHD TV service in South Korea. The system launched in May 2017, and commercial services from KBS, MBC and SBS have been on the air 24/7 since then.

A growing number of devices support MPEG-H Audio, like the Sennheiser Ambeo sound bar, Audio-Video-Receivers from Denon, Marantz and McIntosh, the Amazon Echo Studio smart speaker or the Google ChromeCast Ultra 4K, as well as TV sets from Samsung and LG for the UHD TV service in South Korea.

Because MPEG-H Audio is flexible in its signal configurations, there is no simple answer to that question: the bitrate depends on the number of signals (channel signals or object signals). As the number of signals in a configuration increases, the efficiency of the codec increases and the resulting total bitrate is smaller than the sum of the bitrates of individually encoded signals. The following table indicates bitrates for some common channel configurations and for combinations of channel and object signals, from stereo and 5.1 surround to several 3D configurations (“H” indicates height channels) and combinations of 3D channel configurations with different numbers of object signals. All examples use a total of 16 or fewer signals, which is covered by “Level 3” in the MPEG-H Audio standard, except for the last configuration, “22.2”, which is covered by “Level 4”.

Bit rates in kbit/s by quality level:

| Configuration | Good | Excellent | Transparent |
| --- | --- | --- | --- |
| 2.0 | 48 | 64 | 96 |
| 5.1 | 128 | 192 | 256 |
| 5.1+2H | 160 | 256 | 320 |
| 5.1+4H | 192 | 320 | 448 |
| 7.1+4H / 5.1+4H + 2 Objects | 256 – 288 | 384 – 420 | 512 – 576 |
| 7.1+4H + 3 Objects / 5.1+4H + 5 Objects | 352 – 384 | 480 – 576 | 640 – 768 |
| 22.2 | 512 | 768 | 1024 |

Quality scale according to MUSHRA, Recommendation ITU-R BS.1534-3.
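
To make the efficiency statement more concrete, the following Python sketch divides the total bit rates from the table above by the number of signals in each channel-only configuration. It is a rough illustration rather than an official calculation: the helper names are hypothetical, and the average treats every signal (including LFE channels) as equal, which is a simplification.

```python
# Bit rates in kbit/s from the table above: (good, excellent, transparent).
BITRATES_KBPS = {
    "2.0": (48, 64, 96),
    "5.1": (128, 192, 256),
    "5.1+2H": (160, 256, 320),
    "5.1+4H": (192, 320, 448),
    "22.2": (512, 768, 1024),
}


def signal_count(layout):
    """Count the signals in a layout label, e.g. '5.1+4H' -> 5 + 1 + 4 = 10."""
    total = 0
    for part in layout.split("+"):
        part = part.rstrip("H")  # '4H' -> '4'
        total += sum(int(n) for n in part.split("."))
    return total


def per_signal_kbps(layout, quality_index=0):
    """Average bit rate per signal; quality_index 0 = good, 1 = excellent, 2 = transparent."""
    return BITRATES_KBPS[layout][quality_index] / signal_count(layout)


for layout in BITRATES_KBPS:
    print(layout, signal_count(layout), round(per_signal_kbps(layout), 1))
```

At the "good" quality level, stereo averages 24 kbit/s per signal, while 5.1+4H averages 19.2 kbit/s per signal.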

Existing broadcast services that use AAC/HE-AAC stereo or surround audio can be enhanced with the advanced MPEG-H Audio features by simply adding an MPEG-H Audio stream to the multiplex. All audio and video broadcast encoders that support MPEG-H Audio can create a multiplex containing the AAC stream as well as the MPEG-H Audio stream. The former can be decoded by legacy receivers and the latter will be decoded by newer receivers.

MPEG-H Audio enabled devices natively offer a “User Interface” which displays all the interactivity options enabled by an MPEG-H stream. Based on the content creator’s intentions, for each MPEG-H stream, different interactivity options might be offered to the viewers at home and through the User Interface they have the freedom to personalise their content.

An MPEG-H Audio scene comprises the audio content itself together with additional metadata. This metadata is created during production and contains all necessary information to render the audio content in arbitrary reproduction layouts and to ensure the best audio experience on any platform.

MPEG-H Audio has been carefully designed for enhancing broadcast, streaming and immersive music applications. To ensure the integrity of metadata in an SDI-based environment at any production step, the metadata is delivered in the “Control Track”. The Control Track is a “time-code like” audio signal and can be treated as a regular audio channel. This ensures the synchronization of metadata with its corresponding audio and video signals. The Control Track is robust enough to survive A/D and D/A conversions, level changes, sample rate conversions or frame-wise editing. The Control Track does not force audio equipment to be put into data mode or non-audio mode in order to pass through.

An MPEG-H Master carries all the uncompressed audio content and production metadata of the MPEG-H Audio scene. An MPEG-H Master can either be a Broadcast Wave Format file carrying Audio Definition Model metadata compliant with the MPEG-H Profile (MPEG-H BWF/ADM) or an MPEG-H Production Format (MPF) file carrying the metadata inside an MPEG-H Control Track.

The MPEG-H Control Track is a unique solution for delivering the metadata aligned with the audio and video data through existing SDI-based infrastructures. The Control Track is a “time-code like” PCM audio signal that can be carried on an extra SDI or wave-file channel. It can be edited in a video editor just like any other audio signal.

It allows transport of the metadata tightly coupled with the audio content over any medium offering transport of PCM data, such as SDI, MADI, or AoIP. The Control Track can be treated like any other audio signal and is robust against sample rate conversions or level changes. The metadata contained in the Control Track is aligned to the audio and video data, thus any configuration change in live or post production can be applied at every video frame boundary.

The MPEG-H Production Format (MPF) is a multi-channel PCM audio file which contains all the audio content and production metadata of the MPEG-H Audio scene. The metadata is stored as a Control Track, which is a timecode-like PCM audio signal and one of the audio tracks in the multichannel wave-file.

The Audio Definition Model (ADM) according to ITU-R BS.2076 defines an open metadata format for production, exchange and archiving of next-generation audio (NGA) content in file-based workflows. Its comprehensive metadata syntax allows describing many types of audio content including channel-, object-, and scene-based representations for immersive and interactive audio experiences. A serial representation of the Audio Definition Model (S-ADM) is specified in ITU-R BS.2125 and defines a segmentation of the original ADM for use in linear workflows such as real-time production for broadcasting and streaming applications.
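
As a rough illustration of how ADM metadata links programmes, contents and objects, the Python sketch below parses a small hand-written XML fragment. The element and attribute names follow the ADM, but the fragment is heavily simplified: real ADM documents are namespaced, contain many more element types, and must satisfy additional constraints that are not checked here. The IDs, names and the helper function are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified ADM-style fragment (illustrative only).
ADM_FRAGMENT = """
<audioFormatExtended>
  <audioProgramme audioProgrammeID="APR_1001" audioProgrammeName="Main">
    <audioContentIDRef>ACO_1001</audioContentIDRef>
    <audioContentIDRef>ACO_1002</audioContentIDRef>
  </audioProgramme>
  <audioContent audioContentID="ACO_1001" audioContentName="Bed">
    <audioObjectIDRef>AO_1001</audioObjectIDRef>
  </audioContent>
  <audioContent audioContentID="ACO_1002" audioContentName="Dialogue">
    <audioObjectIDRef>AO_1002</audioObjectIDRef>
  </audioContent>
  <audioObject audioObjectID="AO_1001" audioObjectName="Bed 5.1+4H"/>
  <audioObject audioObjectID="AO_1002" audioObjectName="Commentary"/>
</audioFormatExtended>
"""


def objects_per_programme(adm_xml):
    """Map each programme name to the names of the objects it references."""
    root = ET.fromstring(adm_xml.strip())
    contents = {c.get("audioContentID"): c for c in root.iter("audioContent")}
    objects = {o.get("audioObjectID"): o.get("audioObjectName")
               for o in root.iter("audioObject")}
    result = {}
    for programme in root.iter("audioProgramme"):
        names = []
        # Follow programme -> content -> object references.
        for content_ref in programme.findall("audioContentIDRef"):
            content = contents[content_ref.text]
            for object_ref in content.findall("audioObjectIDRef"):
                names.append(objects[object_ref.text])
        result[programme.get("audioProgrammeName")] = names
    return result


print(objects_per_programme(ADM_FRAGMENT))
```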

The MPEG-H ADM Profile defines constraints on ITU-R BS.2076 and ITU-R BS.2125 that enable interoperability with established NGA content production and distribution systems for MPEG-H Audio as defined in ISO/IEC 23008-3.

The freely available Fraunhofer ADM Info Tool is a software utility that provides support in creating profile-conformant ADM metadata. Its conformance check framework runs input ADM metadata against an exhaustive set of checks derived from the MPEG-H ADM Profile, gathering detailed reports of any encountered conformance issues and providing information on how to resolve them.

With the MPEG-H Conversion Tool, Fraunhofer offers a simple one-click solution for converting existing Dolby Atmos BWF/ADM files into the MPEG-H Production Format. The tool is available as part of the MPEG-H Authoring Suite (MAS).

Fraunhofer IIS offers Production Tools, bundled in the MPEG-H Authoring Suite. The suite consists of the MPEG-H Authoring Plug-in (MHAPi), the standalone MPEG-H Authoring Tool (MHAT) and the MPEG-H Conversion Tool (MCO).

Register here for a download of the MPEG-H Authoring Suite

Other options for producing MPEG-H include the New Audio Technology Spatial Audio Designer and Blackmagic DaVinci Resolve Studio for post-production workflows, as well as the Linear Acoustic AMS and the Jünger MMA Hardware for live production with MPEG-H Audio.

The MPEG-H Authoring Suite (MAS) is a set of tools that make the production of MPEG-H Audio content easier, faster, more intuitive, and more powerful. They support the recently published MPEG-H ADM Profile, as well as binaural monitoring for immersive audio reproduction over headphones.

The MPEG-H Authoring Plug-in (MHAPi) takes you through all the steps of creating object- or channel-based MPEG-H Audio productions inside a VST3- or AAX-enabled digital audio workstation (DAW). You will be able to export your immersive and interactive MPEG-H Audio scenes to either MPEG-H Production Format (MPF) or MPEG-H BWF/ADM, containing audio and metadata and ready for distribution via MPEG-H-enabled channels.

The MPEG-H Authoring Tool (MHAT) is a new software tool for Mac and Windows that helps you create MPEG-H metadata with existing audio material. The MHAT allows for easy MPEG-H authoring without the need for a digital audio workstation (DAW). You can define specific MPEG-H parameters, instantly listen to your configurations and export your authored mixes as MPEG-H Production Format (MPF), MPEG-H BWF/ADM or as a template export in an XML file.

The MPEG-H Conversion Tool (MCO) is a software tool for Mac and Windows that can be used to convert MPEG-H compliant content masters. The MCO serves as an interface to the MPEG-H Audio ecosystem and supports the import and export of MPEG-H Production Format (MPF) and BWF/ADM files.

The MPEG-H Production Format Player (MPF-Player) is a software tool for Mac and Windows to check the quality of already authored MPEG-H metadata and the accompanying audio mix, with or without a corresponding video.

Object-based production requires a metadata authoring step for the object-based interactivity and accessibility features as well as for loudness measurement. There is no single answer that fits all kinds of production environments and production requirements, but a range of typical workflows starting at simple, automated or preset-based authoring that fits the most common content types, up to comprehensive authoring workflows for advanced applications. See here for more information.

The MPEG-H Audio System has been designed such that content creators can define multiple presets and explore new creative options. A broadcaster can prepare mixes (including the default or main mix of the program) using authoring tools that specify an ensemble of gain and position settings for objects to create preset mix selections that can be presented on a simple menu to the user. Even more control of the audio elements in a program is possible and can be enabled in the »advanced MPEG-H Audio interactivity menu« by enthusiast viewers. All interactivity features offered to the user are strictly defined by the broadcaster during metadata creation. This process of generating metadata is called »authoring« and is the most important difference in production of MPEG-H Audio content compared to a legacy production.
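
The relationship between authored presets, authored limits and user interaction can be sketched in a few lines of Python. This is a conceptual illustration only and does not reflect the actual MPEG-H metadata syntax: the class, the preset names, the object names and all gain values are hypothetical, and the sketch assumes that the limits apply to the resulting gain of each object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectLimits:
    """Gain range (in dB) that the broadcaster allows for one audio object."""
    min_gain_db: float
    max_gain_db: float


# Hypothetical authored presets: gain per object in dB.
PRESETS = {
    "Default": {"dialogue": 0.0, "ambience": 0.0},
    "Dialogue+": {"dialogue": 6.0, "ambience": -3.0},
}

# Hypothetical authored interactivity limits per object.
LIMITS = {
    "dialogue": ObjectLimits(-6.0, 9.0),
    "ambience": ObjectLimits(-12.0, 3.0),
}


def clamp_gain(requested_db, limits):
    """Keep a requested gain inside the authored range."""
    return max(limits.min_gain_db, min(limits.max_gain_db, requested_db))


def personalize(preset_name, user_offsets_db):
    """Apply user gain offsets to a preset, bounded by the authored limits."""
    preset = PRESETS[preset_name]
    return {
        name: clamp_gain(gain + user_offsets_db.get(name, 0.0), LIMITS[name])
        for name, gain in preset.items()
    }


# The user asks for +10 dB more dialogue; the result is capped at +9 dB.
print(personalize("Dialogue+", {"dialogue": 10.0}))
```

Whatever the viewer requests, the rendered gains never leave the range defined during authoring.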

There are multiple solutions, depending on the production scenario. Using the tools of the MPEG-H Authoring Suite in post-production, audio and metadata can be exported as:

MPEG-H BWF/ADM: An MPEG-H BWF/ADM (short for Broadcast Wave Format with embedded Audio Definition Model metadata) file is a multichannel wave-file which contains all the audio and metadata for the MPEG-H scene. The exported BWF/ADM file is compliant with the MPEG-H ADM Profile. Loudness will be measured during export and will be embedded into the exported file.

MPF: An MPF (short for MPEG-H Production Format) file is a multichannel wave-file which contains all the audio and metadata for the MPEG-H scene. The metadata is stored in the Control Track, which is one of the audio tracks in the multichannel wave-file and contains a modulated signal that is robust against sample rate conversions or level changes. Loudness will be measured during export and will be embedded into the exported file.

XML: This export option is intended for special applications that make use of MPEG-H scene definitions as XML representation. The XML is accompanied by a multichannel wave-file containing the audio essence.

For more information, watch this video on Vimeo or this video on YouTube.

For MPEG-H live productions, the Authoring and Monitoring Units (AMAU) export the audio signals and the Control Track in real time. The Control Track allows transport of the metadata tightly coupled with the audio content over any medium offering transport of PCM data, such as SDI, MADI, or AoIP. It can be treated like any other audio signal and is robust against sample rate conversions or level changes.

For more information watch this video.

Yes, the MPEG-H Authoring Suite supports the export of audio and metadata as BWF/ADM according to the MPEG-H ADM Profile (MPEG-H BWF/ADM). You can download the profile here.

MPEG-H Audio has been specifically designed for flexible loudspeaker rendering, including traditional layouts such as stereo, 5.1 and 7.1, as well as 3D-audio configurations with height channels, like 5.1+4H and 7.1+4H, or configurations with height, mid and lower-layer channels, for example 22.2, or even yet to be defined layouts.

The loudspeaker configuration depends on the requirements of the intended production. Recommendations for loudspeaker placement, studio design and production workflows can be found here.

We offer MPEG-H test signals including channel identification, lip sync, and level checks for verifying that the speakers are connected and adjusted properly.

Yes, this option is available in version 3.5 of the MPEG-H Authoring Suite.

MPEG-H Audio supports downmixing to typical, common speaker layouts with a set of pre-defined downmix configurations. Additionally, it comes with customizable downmix options enabling content-specific downmixing that is configurable for each layout.

Yes, this functionality can be enabled using the Dynamic Gains feature in the MPEG-H Authoring Plug-in version 3.0 and higher and in the MPEG-H Authoring Suite.

Yes, the MPEG-H Authoring Suite comes with a set of template sessions for Nuendo, Pro Tools, Reaper and Sequoia.

As a first step, we’d like to recommend our series of tutorial videos to help you get started with MPEG-H Authoring using our MPEG-H Authoring Plug-in.

Watch on YouTube

Watch on Vimeo

If you have further questions, you can always get in touch with our MPEG-H Tool experts via: productiontools-techsupport@iis.fraunhofer.de

FAQ

MPEG-H Audio is a new, next-generation audio technology providing more realism through sound from above as well as around the listener. With its unique personalization features, MPEG-H Audio offers viewers great flexibility to actively engage with the content and adapt it to their own preferences. Regardless of the device, the MPEG-H Audio System delivers the best sound experience possible.

MPEG-H Audio is a complete audio solution and much more than just a codec. Among others, it offers the following big advantages compared to legacy audio codecs:

1) Immersive Sound: MPEG-H Audio allows the transmission of three-dimensional immersive audio (3D-audio) by adding elevated sound sources above and below the listeners position. MPEG-H Audio has been specifically designed for flexible loudspeaker signaling including traditional layouts such as stereo, 5.1, 7.1, as well as 3D configurations, namely 5.1+4H, 7.1+4H or 22.2 or even yet to be defined layouts. Within MPEG-H Audio, immersive sound can be carried as channels, objects or as a combination of those.

2) Interactive and Personalized Sound: MPEG-H Audio enables the listener to interact with the content and create personalized audio experiences. The advanced interactivity options range from simple adjustments, for example, increasing or decreasing the dialogue level in relation to other audio elements, to advanced scenarios in which audio elements may be selected and adjusted in level and/or position as preferred by the listener and under the limits authored by the content creator.

3) Universal Delivery: MPEG-H offers flexibility by delivering of the same bit stream through different distribution platforms (e.g., terrestrial, satellite, broadband or mobile networks) to all types of devices (e.g., TV set, AVR, soundbar, set-top box, tablet, virtual reality gears with 360-degree video) in various environments, for example, living room, home theater, or noisy mobile environments.

MPEG-H Audio is an international standard developed by the ISO/IEC Moving Picture Experts Group (MPEG), the organisation which has a long history in audio coding with mp3 and the AAC codec family. The MPEG-H Audio standard (ISO/IEC 23008-3) specifies two relevant profiles – Low Complexity (LC) and Baseline (BL) – essential for the broadcast and streaming industry, which allow decoding and rendering of immersive, 3D-audio content while enabling advanced personalization features. Audio objects may be used alone or in combination with channels for efficient delivery and reproduction of immersive sound. The use of these audio objects allows for interactivity or personalization of a program by adjusting the gain or position of the objects during playback. Details about the MPEG-H Audio standard can be found here.

MPEG-H Audio is a complete audio solution. It does not use other audio codecs, its codec functionality builds upon the developments from previous generations of MPEG audio codecs such as the AAC codec family instead.

MPEG-H Audio enriches the audio experience by combining immersive sound and advanced personalization options with bit rate efficient, universal delivery to meet requirements of today’s consumer needs.

The MPEG-H Audio System has proven to be the most advanced audio solution for enhancing the broadcast and streaming services for sport events, empowering the audience to experience the emotion of the sports arena in their living room and to decide what is more important for themselves, for example, listening only to the crowd of their favorite team or focus on the commentary. Read more here.

Similarly to sport events, streaming of live concerts is another major use case where service providers are eager to enhance the their services with immersive sound and interactivity options. Read more here and here.

The advanced accessibility features of the MPEG-H Audio system are essential for the elderly and visually or hearing impaired audience. With its Dialog Enhancement and advanced Audio Description Services, MPEG-H Audio makes broadcast audio more accessible for all viewers.

MPEG-H was adopted in several broadcast, streaming and virtual reality standards. A list can be found here.

MPEG-H Audio powers the music format 360 Reality Audio, initiated by Sony. The first 360 Reality Audio immersive music streaming services from Amazon Music HD, Deezer, nugs.net, Sony Select and TIDAL have launched in fall of 2019 with currently more than 3000 songs available. Major Labels supporting the 360RA initiative include Sony Music Entertainment, Universal Music, and Warner Music.

The MPEG-H Audio System is used as the sole audio system in the world’s first terrestrial UHD TV service in South Korea. Launch of the system was in May 2017 and commercial services from KBS, MBC and SBS are on-the-air 24/7 since then.

A growing number of devices support MPEG-H Audio, like the Sennheiser Ambeo sound bar, Audio-Video-Receivers from Denon, Marantz and McIntosh, the Amazon Echo Studio smart speaker or the Google ChromeCast Ultra 4K, as well as TV sets from Samsung and LG for the UHD TV service in South Korea.

Because of the flexibility of MPEG-H Audio when it comes to signal configurations, there is no simple answer to that question, as the bitrate depends on the number of signals (channel signals or object signals). With an increasing number of signals in a configuration, the efficiency of the codec increases and the resulting total bitrate is smaller than the sum of single-encoded signals. The following table indicates bitrates for some common channel configurations resp. a combination of channel and object signals, starting with stereo and 5.1 surround to several 3D configurations (indicated by “H” for the height channels) and combinations of 3D channel configurations and different numbers of object signals. All given examples use a total number 16 or less signals that is covered by “Level 3” in the MPEG-H Audio standard, except for the last configuration, “22.2”, that is covered by “Level 4”.
Bit rates in kbit/s for Good Excellent Transparent
2.0 48 64 96
5.1 128 192 256
5.1+2H 160 256 320
5.1+4H 192 320 448
7.1+4H/5.1+4H + 2 Objects 256 – 288 384 – 420 512 – 576
7.1+4H + 3 Objects/5.1+4H + 5 Objects 352 – 384 480 – 576 640 – 768
22.2 512 768 1024
Scale according to MUSHRA Recommendation ITU-R BS. 1534-3

Existing broadcast services that use AAC/HE-AAC stereo or surround audio, can be enhanced with the advanced MPEG-H Audio features by simply adding an additional MPEG-H Audio stream in the multiplex. All audio and video broadcast encoders that support MPEG-H Audio can create a multiplex containing the AAC stream as well as the MPEG-H Audio stream. The former can be decoded by legacy receivers and the latter will be decoded by newer receivers.

MPEG-H Audio enabled devices natively offer a “User Interface” which displays all the interactivity options enabled by an MPEG-H stream. Based on the content creator’s intentions, for each MPEG-H stream, different interactivity options might be offered to the viewers at home and through the User Interface they have the freedom to personalise their content.

An MPEG-H Audio scene comprises the audio content itself together with additional metadata. This metadata is created during production and contains all necessary information to render the audio content in arbitrary reproduction layouts and to ensure the best audio experience on any platform.

MPEG-H Audio has been carefully designed for enhancing broadcast, streaming and immersive music applications. To ensure the integrity of metadata in an SDI-based environment at any production step, the metadata is delivered in the “Control Track”. The Control Track is a “time-code like” audio signal and can be treated as a regular audio channel. This ensures the synchronization of metadata with its corresponding audio and video signals. The Control Track is robust enough to survive A/D and D/A conversions, level changes, sample rate conversions or frame-wise editing. The Control Track does not force audio equipment to be put into data mode or non-audio mode in order to pass through.

An MPEG-H Master carries all the uncompressed audio content and production metadata of the MPEG-H Audio scene. An MPEG-H Master can either be a Broadcast Wave Format File carrying Audio Definition Model metadata compliant to the MPEG-H Profile (MPEG-H BWF/ADM) or an MPEG-H Production Format (MPF) file carrying the metadata inside an MPEG-H Control Track.

The MPEG-H Control Track is a unique solution for delivering the metadata aligned with the audio and video data though existing SDI-based infrastructures. The Control Track is as a “time-code like” PCM audio signal that can be carried on an extra SDI or wave-file channel. It can be edited in a video editor just as any other audio signal.

It allows transport of the metadata tightly coupled with the audio content over any medium offering transport of PCM data, such as SDI, MADI, or AoIP. The Control Track can be treated like any other audio signal and is robust against sample rate conversions or level changes. The metadata contained in the Control Track is aligned to the audio and video data, thus any configuration change in live or post production can be applied at every video frame boundary.

The MPEG-H Production Format (MPF) is a multi-channel PCM audio file which contains all the audio content and production metadata of the MPEG-H Audio scene. The metadata is stored as a Control Track, which is a timecode-like PCM audio signal and one of the audio tracks in the multichannel wave-file.

The Audio Definition Model (ADM) according to ITU-R BS.2076 defines an open metadata format for production, exchange and archiving of next-generation audio (NGA) content in file-based workflows. Its comprehensive metadata syntax allows describing many types of audio content including channel-, object-, and scene-based representations for immersive and interactive audio experiences. A serial representation of the Audio Definition Model (S-ADM) is specified in ITU-R BS.2125 and defines a segmentation of the original ADM for use in linear workflows such as real-time production for broadcasting and streaming applications.

The MPEG-H ADM Profile defines constraints on ITU-R BS.2076 and ITU-R BS.2125 that enable interoperability with established NGA content production and distribution systems for MPEG-H Audio as defined in ISO/IEC 23008-3.

The freely available Fraunhofer ADM Info Tool is a software utility that supports the creation of profile-conformant ADM metadata. Its conformance check framework runs input ADM metadata against an exhaustive set of checks derived from the MPEG-H ADM Profile, reporting in detail any conformance issues it encounters and providing information on how to resolve them.

With the MPEG-H Conversion Tool, Fraunhofer offers a simple one-click solution for converting existing Dolby Atmos BWF/ADM files into the MPEG-H Production Format. The tool is available as part of the MPEG-H Authoring Suite (MAS).

Fraunhofer IIS offers Production Tools, bundled in the MPEG-H Authoring Suite. The suite consists of the MPEG-H Authoring Plug-in (MHAPi), the standalone MPEG-H Authoring Tool (MHAT) and the MPEG-H Conversion Tool (MCO).

Register here for a download of the MPEG-H Authoring Suite

Other options for producing MPEG-H include the New Audio Technology Spatial Audio Designer and Blackmagic DaVinci Resolve Studio for post-production workflows, as well as the Linear Acoustic AMS and the Jünger MMA Hardware for live production with MPEG-H Audio.

The MPEG-H Authoring Suite (MAS) is a set of tools that make the production of MPEG-H Audio content easier, faster, more intuitive, and more powerful. They support the recently published MPEG-H ADM Profile, as well as binaural monitoring for immersive audio reproduction over headphones.

The MPEG-H Authoring Plug-in (MHAPi) takes you through all the steps of creating object- or channel-based MPEG-H Audio productions inside a VST3- or AAX-enabled digital audio workstation (DAW). You will be able to export your immersive and interactive MPEG-H Audio scenes to either MPEG-H Production Format (MPF) or MPEG-H BWF/ADM, containing audio and metadata and ready for distribution via MPEG-H-enabled channels.

The MPEG-H Authoring Tool (MHAT) is a new software tool for Mac and Windows that helps you create MPEG-H metadata for existing audio material. The MHAT allows for easy MPEG-H authoring without the need for a digital audio workstation (DAW). You can define specific MPEG-H parameters, instantly listen to your configurations and export your authored mixes as MPEG-H Production Format (MPF), MPEG-H BWF/ADM or as a template export in an XML file.

The MPEG-H Conversion Tool (MCO) is a software tool for Mac and Windows that can be used to convert MPEG-H compliant content masters. The MCO serves as an interface to the MPEG-H Audio ecosystem and supports the import and export of MPEG-H Production Format (MPF) and BWF/ADM files.

The MPEG-H Production Format Player (MPF-Player) is a software tool for Mac and Windows to check the quality of already authored MPEG-H metadata and the accompanying audio mix, with or without a corresponding video.

Object-based production requires a metadata authoring step for the object-based interactivity and accessibility features as well as for loudness measurement. There is no single answer that fits all kinds of production environments and production requirements, but a range of typical workflows starting at simple, automated or preset-based authoring that fits the most common content types, up to comprehensive authoring workflows for advanced applications. See here for more information.

The MPEG-H Audio System has been designed such that content creators can define multiple presets and explore new creative options. A broadcaster can prepare mixes (including the default or main mix of the program) using authoring tools that specify an ensemble of gain and position settings for objects to create preset mix selections that can be presented on a simple menu to the user. Even more control of the audio elements in a program is possible and can be enabled in the »advanced MPEG-H Audio interactivity menu« by enthusiast viewers. All interactivity features offered to the user are strictly defined by the broadcaster during metadata creation. This process of generating metadata is called »authoring« and is the most important difference in production of MPEG-H Audio content compared to a legacy production.
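The relationship between authored presets and user interactivity can be sketched as a small data model. This is an illustration under stated assumptions, not the MPEG-H metadata syntax: the class names, field names and gain values are hypothetical, and position settings are left out for brevity. The sketch shows that a user's adjustment is always clamped to the range the author has allowed for each object.

```python
from dataclasses import dataclass, field


@dataclass
class AudioObject:
    """One audio element plus the interactivity range granted by the author."""
    name: str
    default_gain_db: float = 0.0
    min_offset_db: float = 0.0  # 0/0 means the user may not change this object
    max_offset_db: float = 0.0


@dataclass
class Preset:
    """A named mix: per-object gains chosen during authoring."""
    label: str
    gains_db: dict = field(default_factory=dict)


def resolve_gains(objects, preset, user_offsets_db=None):
    """Combine preset gains with user offsets, clamped to the authored limits."""
    user_offsets_db = user_offsets_db or {}
    resolved = {}
    for obj in objects:
        base = preset.gains_db.get(obj.name, obj.default_gain_db)
        offset = user_offsets_db.get(obj.name, 0.0)
        offset = max(obj.min_offset_db, min(obj.max_offset_db, offset))
        resolved[obj.name] = base + offset
    return resolved


objects = [
    AudioObject("Ambience"),  # not adjustable by the user
    AudioObject("Commentary", min_offset_db=-6.0, max_offset_db=9.0),
]
dialogue_plus = Preset("Dialogue+", {"Ambience": -3.0, "Commentary": 6.0})

# The user asks for more than the author allows; both requests are clamped.
resolved = resolve_gains(
    objects, dialogue_plus, {"Ambience": -10.0, "Commentary": 20.0}
)
print(resolved)
```

In this example the Ambience object stays at the preset gain because no user range was authored for it, while the Commentary boost is limited to the authored maximum.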

There are multiple solutions, depending on the production scenario. Using the tools of the MPEG-H Authoring Suite in post-production, audio and metadata can be exported as:

MPEG-H BWF/ADM: An MPEG-H BWF/ADM (short for Broadcast Wave Format with embedded Audio Definition Model metadata) file is a multichannel wave-file which contains all the audio and metadata for the MPEG-H scene. The exported BWF/ADM file is compliant with the MPEG-H ADM Profile. Loudness will be measured during export and will be embedded into the exported file.

MPF: An MPF (short for MPEG-H Production Format) file is a multichannel wave-file which contains all the audio and metadata for the MPEG-H scene. The metadata is stored in the Control Track, which is one of the audio tracks in the multichannel wave-file and contains a modulated signal that is robust against sample rate conversions or level changes. Loudness will be measured during export and will be embedded into the exported file.

XML: This export option is intended for special applications that make use of MPEG-H scene definitions as XML representation. The XML is accompanied by a multichannel wave-file containing the audio essence.

For more information, watch this video on Vimeo or this video on YouTube.

For MPEG-H live productions, the Authoring and Monitoring Units (AMAU) export the audio signals and the Control Track in real time. The Control Track allows transport of the metadata tightly coupled with the audio content over any medium offering transport of PCM data, such as SDI, MADI, or AoIP. It can be treated like any other audio signal and is robust against sample rate conversions or level changes.

For more information watch this video.

Yes, the MPEG-H Authoring Suite supports the export of audio and metadata as BWF/ADM according to the MPEG-H ADM Profile (MPEG-H BWF/ADM). You can download the profile here.

MPEG-H Audio has been specifically designed for flexible loudspeaker rendering, including traditional layouts such as stereo, 5.1 and 7.1, as well as 3D-audio configurations with height channels, like 5.1+4H and 7.1+4H, or configurations with height, mid and lower-layer channels, for example 22.2, or even yet to be defined layouts.

The loudspeaker configuration depends on the requirements of the intended production. Recommendations for loudspeaker placement, studio design and production workflows can be found here.

We offer MPEG-H test signals including channel identification, lip sync, and level checks for verifying that the speakers are connected and adjusted properly.

Yes, this option is available in version 3.5 of the MPEG-H Authoring Suite.

MPEG-H Audio supports downmixing to typical, common speaker layouts with a set of pre-defined downmix configurations. Additionally, it comes with customizable downmix options enabling content-specific downmixing that is configurable for each layout.
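As a rough illustration of what a downmix configuration expresses, the sketch below applies a coefficient matrix to one frame of 5.1 audio and then overrides single coefficients for a content-specific variant. The coefficients follow a widely used 5.1-to-stereo convention (centre and surrounds at about -3 dB, LFE discarded) and are assumptions for this example, not the pre-defined MPEG-H values. The names `downmix` and `with_overrides` are hypothetical.

```python
import math

# Illustrative coefficients only, not the pre-defined MPEG-H configuration.
G = 1.0 / math.sqrt(2.0)  # about -3 dB

DEFAULT_5_1_TO_STEREO = {
    "L": {"L": 1.0, "C": G, "Ls": G},
    "R": {"R": 1.0, "C": G, "Rs": G},
}


def downmix(frame, matrix):
    """Mix one frame (input channel name -> sample) down to the output layout."""
    return {
        out_ch: sum(coef * frame.get(in_ch, 0.0) for in_ch, coef in row.items())
        for out_ch, row in matrix.items()
    }


def with_overrides(matrix, overrides):
    """Return a copy of the matrix with selected coefficients replaced."""
    return {
        out_ch: {**row, **overrides.get(out_ch, {})}
        for out_ch, row in matrix.items()
    }


# A frame with signal in the centre channel only.
frame = {"L": 0.0, "R": 0.0, "C": 1.0, "LFE": 0.0, "Ls": 0.0, "Rs": 0.0}
default_mix = downmix(frame, DEFAULT_5_1_TO_STEREO)

# Hypothetical content-specific variant: keep the centre (dialogue) at full level.
dialogue_matrix = with_overrides(
    DEFAULT_5_1_TO_STEREO, {"L": {"C": 1.0}, "R": {"C": 1.0}}
)
custom_mix = downmix(frame, dialogue_matrix)
print(default_mix, custom_mix)
```

Keeping the default matrix untouched and deriving variants from it mirrors the idea of pre-defined configurations with optional content-specific overrides per layout.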

Yes, this functionality can be enabled using the Dynamic Gains feature in the MPEG-H Authoring Plug-in version 3.0 and higher and in the MPEG-H Authoring Suite.

Yes, the MPEG-H Authoring Suite comes with a set of template sessions for Nuendo, Pro Tools, Reaper and Sequoia.

As a first step, we’d like to recommend our series of tutorial videos to help you get started with MPEG-H Authoring using our MPEG-H Authoring Plug-in.

Watch on YouTube

Watch on Vimeo

If you have further questions, you can always get in touch with our MPEG-H Tool experts via: productiontools-techsupport@iis.fraunhofer.de