
5.2 Ambisonics and spatial audio formats

6 min read • August 19, 2024

Ambisonics is an audio technique that creates a full-sphere soundfield, offering a more immersive experience than traditional channel-based formats. It captures and reproduces sound from all directions, including height, and the same recording can be decoded for different playback systems.

Ambisonics uses spherical harmonics to decompose the soundfield, representing it with coefficients that describe sound pressure and direction. Different formats like A-format, B-format, and Higher Order Ambisonics (HOA) offer various ways to store and transmit these immersive audio signals.

Ambisonics overview

Ambisonics is a full-sphere surround sound technique that aims to recreate a complete soundfield at a given point in space
Provides a more immersive and realistic spatial audio experience compared to traditional audio formats
Allows for flexible playback on various speaker configurations or headphones

History of ambisonics

Developed in the 1970s by Michael Gerzon and Peter Fellgett
Initially aimed to improve upon quadraphonic sound systems
Gained popularity in recent years due to advancements in virtual reality and 360° audio

Ambisonics vs stereo

Stereo audio uses two channels (left and right) to create a limited spatial image
Ambisonics captures and reproduces sound from all directions, including height information
Ambisonic recordings can be rotated and tilted during playback, unlike stereo

Ambisonics vs surround sound

Surround sound systems (5.1, 7.1) use discrete channels for specific speaker positions
Ambisonics encodes the full soundfield, allowing for flexible speaker placement and scalability
Ambisonics can be decoded to various speaker configurations or binaurally rendered for headphones

Ambisonic theory

Based on the decomposition of a soundfield into spherical harmonics
Represents the soundfield using a set of coefficients that describe the sound pressure and direction
Higher-order ambisonics capture more spatial detail and allow for a larger sweet spot

Sound field decomposition

Soundfield is decomposed into a series of spherical harmonics
Spherical harmonics are a set of orthogonal basis functions on the surface of a sphere
Each harmonic represents a specific spatial pattern of sound pressure

Spherical harmonics

Denoted by the order (n) and degree (m)
Zero-order (W) represents the omnidirectional component
First-order (X, Y, Z) captures directional information
Higher-order harmonics capture more detailed spatial information
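As a rough illustration of the zero-order and first-order patterns described above, the sketch below evaluates the W, X, Y and Z gains for a single direction. The function name `first_order_gains`, the angle convention (azimuth counter-clockwise from front, elevation upward) and the choice of a W gain of 1.0 are assumptions made for this example; other normalization schemes scale W differently.

```python
import math

def first_order_gains(azimuth_deg, elevation_deg):
    """Return (W, X, Y, Z) gains for a sound arriving from one direction.

    Assumed convention: azimuth is measured counter-clockwise from the
    front, elevation upward from the horizontal plane, and W has gain 1.0.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = 1.0                          # zero order: identical in every direction
    x = math.cos(az) * math.cos(el)  # front-back figure-of-eight pattern
    y = math.sin(az) * math.cos(el)  # left-right figure-of-eight pattern
    z = math.sin(el)                 # up-down figure-of-eight pattern
    return (w, x, y, z)

front = first_order_gains(0, 0)    # strongest on X
left = first_order_gains(90, 0)    # strongest on Y
above = first_order_gains(0, 90)   # strongest on Z
```

W stays constant while X, Y and Z change with direction, which is why the first-order components carry the directional information.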

Ambisonic order

Determines the spatial resolution and the number of channels required
First-order ambisonics (FOA) uses four channels (W, X, Y, Z)
Higher-order ambisonics (HOA) use more channels for increased spatial resolution: a full-sphere signal of order N needs (N + 1)² channels
- Second-order ambisonics use nine channels
- Third-order ambisonics use 16 channels
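The channel counts listed above follow the rule that a full-sphere signal of order N needs (N + 1)² channels. A minimal sketch, with `ambisonic_channels` as a helper name chosen for this example:

```python
def ambisonic_channels(order):
    """Number of channels in a full-sphere ambisonic signal of a given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2

# Orders 0 to 3 give 1, 4, 9 and 16 channels respectively.
for order in range(4):
    print(order, ambisonic_channels(order))
```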

Periphonic vs pantophonic

Periphonic ambisonics capture and reproduce sound in full-sphere (including height)
Pantophonic ambisonics only capture and reproduce sound in the horizontal plane
Periphonic ambisonics require more channels and a 3D speaker setup for accurate reproduction
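To put numbers on the difference in channel count, the sketch below compares the two cases, assuming the usual counts of 2N + 1 channels for a horizontal-only signal and (N + 1)² for a full-sphere signal of order N. The function name `channel_counts` is a label for this example only.

```python
def channel_counts(order):
    """Return (pantophonic, periphonic) channel counts for a given order."""
    horizontal_only = 2 * order + 1   # horizontal plane only
    full_sphere = (order + 1) ** 2    # includes height
    return (horizontal_only, full_sphere)

first = channel_counts(1)   # (3, 4)
third = channel_counts(3)   # (7, 16)
```

The gap widens quickly with order, which is one reason horizontal-only setups can be attractive when height is not needed.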

Ambisonic formats

Different representations of the ambisonic signal
Describe how the ambisonic components are stored and transmitted
Choice of format depends on the application and workflow

A-format

Raw output from a soundfield microphone or tetrahedral microphone array
Consists of four unprocessed capsule signals (LFU, RFD, LBD, RBU)
Requires encoding to B-format for further processing and playback

B-format

Most common ambisonic format for first-order ambisonics
Consists of four channels (W, X, Y, Z)
- W: omnidirectional component
- X: front-back directional component
- Y: left-right directional component
- Z: up-down directional component
Can be easily manipulated and decoded for various playback systems

Higher order ambisonics (HOA)

Extension of B-format to higher orders
Captures more detailed spatial information
Requires more channels and processing power
Enables a larger sweet spot and more accurate spatial reproduction

Ambisonic recording

Capturing a soundfield using specialized microphone arrays
Aims to preserve the spatial characteristics of the recorded environment
Requires encoding of the raw microphone signals into an ambisonic format

Soundfield microphones

Integrated microphone arrays designed for ambisonic recording
Typically consist of four closely-spaced capsules arranged in a tetrahedron
Examples: Core Sound TetraMic, Sennheiser AMBEO VR Mic

Tetrahedral microphone arrays

Custom-built microphone arrays using four matched capsules
Capsules are arranged in a tetrahedral configuration
Allows for flexibility in capsule selection and spacing

Ambisonic encoding

Process of converting raw microphone signals into an ambisonic format
Involves applying a matrix transformation to the capsule signals
Encoding coefficients depend on the microphone array geometry and normalization scheme
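For a tetrahedral array, the matrix step can be sketched as sums and differences of the four capsule signals. This is a simplified illustration, not a complete converter: the 0.5 scaling is an assumption (scaling depends on the normalization scheme), the axes are assumed to be X front, Y left, Z up, and practical converters also apply frequency-dependent correction for capsule spacing, which is omitted here. The name `a_to_b` is chosen for this example.

```python
def a_to_b(lfu, rfd, lbd, rbu):
    """Convert one sample from each A-format capsule to (W, X, Y, Z).

    Capsules: LFU = left-front-up, RFD = right-front-down,
    LBD = left-back-down, RBU = right-back-up.
    The 0.5 factor is an assumed scaling for this sketch.
    """
    w = 0.5 * (lfu + rfd + lbd + rbu)   # sum of all capsules: omnidirectional
    x = 0.5 * (lfu + rfd - lbd - rbu)   # front capsules minus back capsules
    y = 0.5 * (lfu - rfd + lbd - rbu)   # left capsules minus right capsules
    z = 0.5 * (lfu - rfd - lbd + rbu)   # up capsules minus down capsules
    return (w, x, y, z)

# Equal pressure on every capsule carries no directional information.
diffuse = a_to_b(1.0, 1.0, 1.0, 1.0)   # (2.0, 0.0, 0.0, 0.0)
```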

Ambisonic processing

Manipulating and transforming ambisonic signals
Allows for creative control and adaptation to different playback scenarios
Includes decoding, rendering, and mixing techniques

Ambisonic decoding

Process of converting ambisonic signals to speaker feeds
Involves applying a decoding matrix based on the speaker layout
Different decoding schemes available (e.g., projection, pseudo-inverse, mode-matching)
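A projection decoder is the simplest of these schemes to write down: each speaker feed is the B-format signal projected onto that speaker's direction. The sketch below assumes first-order input with a W gain of 1.0, a regular layout, and a 1/L scaling for L speakers; `decode_projection` and `SQUARE` are names invented for this example.

```python
import math

def decode_projection(w, x, y, z, speakers):
    """Decode one first-order B-format sample to speaker feeds.

    speakers: list of (azimuth_deg, elevation_deg) tuples.
    Assumes W gain 1.0 and divides by the number of speakers.
    """
    feeds = []
    for az_deg, el_deg in speakers:
        az, el = math.radians(az_deg), math.radians(el_deg)
        # Unit vector pointing at this speaker.
        ux = math.cos(az) * math.cos(el)
        uy = math.sin(az) * math.cos(el)
        uz = math.sin(el)
        feeds.append((w + x * ux + y * uy + z * uz) / len(speakers))
    return feeds

# Horizontal square layout, speakers at the four corners.
SQUARE = [(45, 0), (135, 0), (225, 0), (315, 0)]

# A unit source encoded at 45 degrees azimuth: W=1, X=cos(45), Y=sin(45), Z=0.
c = math.cos(math.radians(45))
feeds = decode_projection(1.0, c, c, 0.0, SQUARE)
```

With this scaling the speaker nearest the source receives the largest feed and the opposite speaker receives none, so each feed behaves like a virtual cardioid microphone aimed at that speaker.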

Binaural rendering of ambisonics

Creating a binaural stereo output from ambisonic signals
Allows for immersive headphone playback
Utilizes head-related transfer functions (HRTFs) to simulate spatial cues
Benefits from head tracking, which keeps sources fixed in space as the head turns and improves localization

Ambisonic manipulation

Rotating, tilting, and zooming ambisonic soundfields
Rotation and tilt are achieved by applying rotation matrices to the ambisonic coefficients; zooming uses other transformations that emphasize one direction
Enables dynamic spatial transformations and interactive audio experiences
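For first-order signals, a rotation about the vertical axis only mixes X and Y, leaving W and Z unchanged. The sketch below shows that case, assuming X points front and Y points left so that a positive angle turns the scene counter-clockwise; `rotate_yaw` is a name chosen for this example.

```python
import math

def rotate_yaw(w, x, y, z, angle_deg):
    """Rotate a first-order B-format sample about the vertical axis."""
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return (w, x_rot, y_rot, z)   # W and Z are unaffected by yaw

# A source straight ahead (W=1, X=1) moves to the left after a 90 degree turn.
rotated = rotate_yaw(1.0, 1.0, 0.0, 0.0, 90)
```

A head-tracked renderer could apply the same operation with the negative of the listener's head yaw so that sources stay fixed in the room.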

Ambisonic mixing

Combining multiple ambisonic signals and mono/stereo sources
Involves encoding non-ambisonic sources into the ambisonic domain
Requires careful consideration of source positions and spatial coherence
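As a sketch of the mixing step, the code below encodes two mono sources at chosen directions into first-order B-format and sums them channel by channel. The helper names `encode_mono` and `mix`, the W gain of 1.0 and the sample values are all assumptions for illustration.

```python
import math

def encode_mono(samples, azimuth_deg, elevation_deg=0.0):
    """Encode a mono signal into four channels [W, X, Y, Z]."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gains = (
        1.0,                          # W (assumed gain of 1.0)
        math.cos(az) * math.cos(el),  # X
        math.sin(az) * math.cos(el),  # Y
        math.sin(el),                 # Z
    )
    return [[g * s for s in samples] for g in gains]

def mix(*scenes):
    """Sum several B-format scenes of equal length, channel by channel."""
    return [[sum(vals) for vals in zip(*chans)] for chans in zip(*scenes)]

voice = encode_mono([1.0, 0.5], azimuth_deg=0)    # placed in front
drone = encode_mono([0.2, 0.2], azimuth_deg=90)   # placed to the left
scene = mix(voice, drone)
```

Because encoding and mixing are both linear, the mixed scene can still be rotated or decoded as a whole afterwards.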

Ambisonic playback

Reproducing ambisonic audio on various playback systems
Ranges from dedicated ambisonic speaker arrays to headphone playback
Aims to recreate the original soundfield and provide an immersive listening experience

Ambisonic speaker layouts

Regular speaker arrangements for optimal ambisonic playback
Examples: cube, octahedron, icosahedron
Higher-order ambisonics require more speakers for accurate reproduction

Virtual loudspeakers

Technique for playing ambisonics on non-regular speaker layouts
Involves virtualizing a regular ambisonic speaker layout using VBAP or DBAP
Allows for flexible playback on existing surround sound systems

Headphone playback of ambisonics

Reproducing ambisonics on headphones using binaural rendering
Benefits from head tracking so the soundfield stays fixed as the listener turns
Provides an immersive experience without the need for a physical speaker setup

Spatial audio formats

Different approaches to representing and storing spatial audio information
Includes channel-based, object-based, and scene-based formats
Choice of format depends on the application, content, and playback system

Channel-based formats

Traditional approach using discrete audio channels for each speaker
Examples: stereo, 5.1, 7.1, 22.2
Limited flexibility and adaptability to different playback systems

Object-based formats

Represents audio as individual sound objects with metadata
Metadata includes position, size, and other properties
Allows for dynamic rendering and personalization of the audio scene
Examples: Dolby Atmos, DTS:X, MPEG-H 3D Audio

Scene-based formats

Represents the entire soundfield using a set of basis functions
Ambisonics is an example of a scene-based format
Provides a compact and flexible representation of spatial audio
Enables rendering to various playback systems and speaker layouts

Ambisonics applications

Ambisonics finds use in various fields related to immersive audio and virtual reality
Provides a compelling and realistic spatial audio experience
Enhances the sense of presence and immersion in virtual environments

VR and 360° video

Ambisonics is well-suited for spatial audio in virtual reality applications
Allows for head-tracked audio that responds to user movements
Provides a seamless and immersive audio experience in 360° videos

Immersive installations

Ambisonics enables the creation of immersive audio installations
Surrounds the listener with a full-sphere soundfield
Ideal for soundscapes, interactive art, and experiential exhibits

Game audio

Ambisonics can enhance the spatial audio in video games
Provides a more realistic and immersive sound environment
Allows for dynamic audio that adapts to the player's actions and position

Music production

Ambisonics offers new creative possibilities for music production
Enables the creation of immersive and spatially-rich musical experiences
Allows for spatial composition, mixing, and live performances in 3D space

Key Terms to Review (18)

3D Audio Rendering: 3D audio rendering refers to the process of creating sound that simulates a three-dimensional space, allowing listeners to perceive sounds as coming from specific directions and distances. This technique enhances immersive experiences in virtual and augmented reality by providing a realistic auditory environment that matches visual elements, making users feel as if they are truly present in the space. 3D audio rendering is essential for creating believable interactions within virtual worlds and helps in establishing an emotional connection with the experience.

Ambisonics: Ambisonics is a spatial audio technique that captures and reproduces sound in three-dimensional space, allowing for an immersive audio experience. This method encodes sound using spherical harmonics, so the soundfield can be rotated and decoded for different playback systems, with localization most accurate for a listener near the center of the array. It connects with various aspects of audio technology, including sound design in virtual environments and enhancing the perception of spatial audio formats.

Audio-reactive visuals: Audio-reactive visuals are dynamic visual elements that respond in real-time to audio input, creating a synchronized experience between sound and imagery. This interaction enhances the immersive quality of environments, particularly in virtual reality and multimedia art installations, allowing audiences to engage more deeply with the experience. By utilizing various data from the audio, such as frequency, amplitude, and rhythm, these visuals can be transformed, animated, or generated in accordance with the sound.

Auditory Scene Analysis: Auditory scene analysis is the process by which the auditory system organizes sound information into meaningful perceptions, allowing us to distinguish between different sound sources in our environment. This process is crucial for understanding complex auditory environments, as it enables listeners to separate overlapping sounds and identify their origins. By doing so, it plays a significant role in how we perceive spatial audio and sound within immersive virtual environments.

Binaural Audio: Binaural audio is a recording technique that captures sound using two microphones, creating a three-dimensional auditory experience for listeners. This technique mimics human hearing, allowing sounds to be perceived from specific locations in space, which is crucial for creating immersive experiences in virtual reality and enhancing the realism of sound in various applications. Binaural audio enhances the listener's ability to identify the direction and distance of sounds, making it essential for spatial audio formats and integrating with audio middleware to create rich auditory environments.

David McKeown: David McKeown is a prominent figure in the field of spatial audio and immersive sound design, known for his work in advancing techniques related to ambisonics and immersive sound formats. He has contributed significantly to the development of tools and methodologies that enhance the user experience in virtual environments through spatial audio technologies. His research emphasizes the importance of accurately representing sound sources in three-dimensional space, which is crucial for creating realistic auditory experiences in virtual and augmented realities.

Immersive audio experience: An immersive audio experience is a sound environment that envelops the listener, creating a sense of presence and engagement in a virtual or augmented reality setting. This type of audio goes beyond traditional stereo sound, utilizing advanced techniques like Ambisonics and spatial audio formats to replicate real-world acoustics, allowing users to perceive sounds from various directions and distances. By enhancing the realism of sound in these environments, it plays a crucial role in how individuals interact with and perceive virtual spaces.

Interactive Sound Design: Interactive sound design refers to the creation and implementation of sound elements that respond to user actions or environmental changes within a digital medium. This type of sound design enhances the immersive experience by allowing users to engage with audio elements that adapt dynamically, creating a more responsive and engaging atmosphere. It is particularly important in fields like gaming and virtual reality, where sound can significantly impact user experience and immersion.

Julius Smith: Julius Smith is a prominent figure in the field of spatial audio and ambisonics, recognized for his contributions to audio engineering and immersive sound experiences. He is known for developing important tools and algorithms that have enhanced the understanding and implementation of spatial audio formats, particularly in the context of creating immersive environments for virtual reality and multimedia applications. His work has been instrumental in shaping how we perceive and interact with sound in three-dimensional space.

Listening tests: Listening tests are evaluations designed to assess an individual's ability to perceive and analyze audio content, specifically in the context of spatial audio formats and ambisonics. These tests can involve various tasks such as identifying sound locations, recognizing sound sources, or discerning differences in audio quality. They are crucial for understanding how listeners interact with complex audio environments and can help in evaluating the effectiveness of spatial audio systems.

Max/MSP: Max/MSP is a visual programming language primarily used for music and multimedia, allowing users to create interactive software by connecting graphical elements. It integrates audio, video, and sensor input, making it a powerful tool for artists and developers who want to create immersive experiences. Max/MSP's flexibility enables artists to experiment with sound spatialization techniques and biofeedback applications in their works.

Psychoacoustics: Psychoacoustics is the study of how humans perceive sound, examining the psychological and physiological effects of sound waves on our senses. This field explores not just the physical properties of sound, but also how we interpret those sounds in our environment, which is essential for creating realistic audio experiences in various formats. Understanding psychoacoustics is crucial for designing immersive audio environments, such as those found in virtual reality, where sound localization and spatial audio enhance user experience.

Reaper: In the context of ambisonics and spatial audio formats, REAPER is a digital audio workstation (DAW) that supports advanced audio recording, editing, and mixing capabilities. It allows artists and sound designers to work with multichannel tracks that can carry spatial sound formats, creating immersive experiences that surround the listener.

Sound localization: Sound localization is the ability to identify the origin of a sound in three-dimensional space, allowing listeners to perceive where a sound is coming from. This skill is crucial for creating immersive audio experiences, as it helps to replicate real-world auditory environments in virtual settings and enhances the overall realism of the experience.

Soundscape: A soundscape refers to the acoustic environment as perceived by individuals, encompassing all sounds in a particular location or setting. It includes natural sounds, human-made noises, and the interactions between them, shaping our auditory experience and influencing our perception of space. Soundscapes play a crucial role in virtual environments, where the representation and manipulation of sound can enhance immersion and emotional response.

Spatialization: Spatialization refers to the technique of positioning sound within a three-dimensional space, creating an immersive audio experience that mimics how we perceive sound in real life. This concept is crucial for enhancing realism and engagement in virtual environments by allowing users to hear sounds from specific directions and distances, replicating natural auditory cues and creating a sense of presence.

Subjective Assessment: Subjective assessment refers to a method of evaluation that is influenced by personal opinions, interpretations, feelings, and biases. In the context of immersive and virtual reality experiences, this type of assessment becomes crucial as it captures individual perceptions of sound and spatial audio. Understanding subjective assessment helps in evaluating the effectiveness of ambisonics and spatial audio formats, as different listeners may have varying reactions to audio presentations based on their unique experiences and preferences.

Surround Sound Mixing: Surround sound mixing is the process of creating a multi-channel audio experience that envelops the listener in sound from all directions, rather than just from a single source. This technique enhances the auditory experience by placing sounds in a three-dimensional space, making it feel more immersive and engaging. It is crucial in various formats such as films, video games, and virtual reality, where spatial audio can significantly enhance the narrative and emotional impact of the content.

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

© 2024 Fiveable Inc. All rights reserved.
