Spatialization Techniques

This part of the guide covers the techniques used to place sound in space: stereo, amplitude panning, surround, object-based audio, Ambisonics, binaural, transaural, and wave field synthesis. They look very different on the surface, yet they all answer the same question, how to place a sound in space for a listener, and they are all built on the same two-step idea.

One idea behind every method: encode, then decode

Every spatial format beyond mono does two things:

Encode — capture or author a sound field into a finite representation: a set of channels, an object with positional metadata, or a set of spherical-harmonic coefficients.
Decode — turn that representation into signals for the loudspeakers (or headphones) actually present at playback.

The decode step almost always produces speaker feeds that were not authored as such — they are derived by an algorithm. This is true whether you matrix-decode a stereo signal into surround, pan an object across three speakers, or decode an Ambisonic scene to a dome. Seen this way, the boundary between "playback," "rendering," and "upmixing" dissolves: they are the same operation, deriving feeds for a target system from a more compact representation.
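
To make the two steps concrete, here is a minimal sketch in Python, not a production decoder. It assumes a horizontal-only, first-order scene with the simple convention W = s, X = s·cos(θ), Y = s·sin(θ), and a basic "sampling" decode to a ring of equally spaced speakers. The function names are hypothetical and chosen only for illustration; real Ambisonic tools use specific normalisation conventions and more refined decoders.

```python
import math

def encode_first_order(sample, azimuth_rad):
    """Encode one mono sample into horizontal first-order components (W, X, Y).

    Assumed convention for this sketch: W = s, X = s*cos(az), Y = s*sin(az).
    """
    return (sample,
            sample * math.cos(azimuth_rad),
            sample * math.sin(azimuth_rad))

def ring_azimuths(num_speakers):
    """Azimuths (radians) of equally spaced speakers on a circle, starting at 0."""
    return [2.0 * math.pi * i / num_speakers for i in range(num_speakers)]

def decode_to_ring(wxy, speaker_azimuths):
    """Basic sampling decode: one derived feed per speaker in the ring."""
    w, x, y = wxy
    n = len(speaker_azimuths)
    return [(w + 2.0 * (x * math.cos(phi) + y * math.sin(phi))) / n
            for phi in speaker_azimuths]

# Encode once, independent of any speaker layout...
scene = encode_first_order(1.0, math.radians(45.0))

# ...then derive feeds for whichever layout is present at playback.
quad_feeds = decode_to_ring(scene, ring_azimuths(4))
hex_feeds = decode_to_ring(scene, ring_azimuths(6))
```

The same three-component scene yields four feeds for a square and six for a hexagon. Neither set of feeds was authored; both are derived at playback from the compact representation.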

Keeping this in mind makes the rest of the guide much easier to follow. Each technique is just a different choice of how to encode and how to derive the speaker feeds.

The three representations, again

From the Fundamentals, recall the three ways to represent a spatial mix. Most of the techniques below belong to one of them:

Channel-based — stereo, surround, matrix systems. Encoded for a known layout.
Object-based — objects + metadata, rendered to the available speakers (Dolby Atmos).
Scene-based — the whole field as spherical harmonics, decoded to any array (Ambisonics).

Wave field synthesis sits slightly apart: instead of creating phantom sources, it physically reconstructs the wavefront with a dense array.
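
As a rough illustration of how the three representations differ in what they store, the sketch below models each one as a small Python data structure. All class and field names are hypothetical and do not correspond to any real file format or API. The only formula used is the standard channel count for a full 3D Ambisonic scene of order N, which is (N + 1)².

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ChannelBed:
    """Channel-based: audio already assigned to a known speaker layout."""
    layout: str                   # e.g. "stereo" or "5.1" (illustrative labels)
    channels: List[List[float]]   # one sample list per speaker channel

@dataclass
class AudioObject:
    """Object-based: a mono signal plus positional metadata for a renderer."""
    samples: List[float]
    position: Tuple[float, float, float]  # assumed (x, y, z) coordinates

@dataclass
class AmbisonicScene:
    """Scene-based: the whole field as spherical-harmonic coefficient signals."""
    order: int
    coefficients: List[List[float]]  # one sample list per harmonic component

def ambisonic_channel_count(order):
    """Number of spherical-harmonic components in a full 3D scene of this order."""
    return (order + 1) ** 2

stereo_bed = ChannelBed(layout="stereo", channels=[[0.0, 0.1], [0.0, 0.1]])
flyover = AudioObject(samples=[0.0, 0.1], position=(0.0, 1.0, 0.5))
first_order = AmbisonicScene(
    order=1,
    coefficients=[[0.0, 0.1] for _ in range(ambisonic_channel_count(1))],
)
```

Only the channel bed names a speaker layout. The object and the scene carry no layout at all, which is why they need a renderer or decoder at playback.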

What this part covers

Stereo & the phantom image — the original spatial encoding, and the perceptual trick every later method reuses.
Amplitude panning: VBAP & DBAP — placing a source by level across two or three speakers, generalised to any 3D layout.
Surround & matrix systems — channel-based 5.1/7.1 and the matrix encode/decode lineage.
Object-based audio & rendering — objects, metadata, and the renderer (Dolby Atmos).
Ambisonics — scene-based audio, spherical harmonics, orders, rotation and decoding.
Binaural — HRTF rendering for headphones.
Transaural — binaural over loudspeakers, via crosstalk cancellation.
Wave field synthesis — physically reconstructing wavefronts with dense arrays.
Stereo is already spatial — why real-world stereo must be decoded, not just routed, and the bridge to DAM Audio's HSR and RIPL.

A theme we will return to

Because stereo is already a spatial encoding — two channels carrying a continuous field — the information needed to spread it across more speakers is, to a large extent, already present in the signal. That single observation is the foundation of modern upmixing and of DAM Audio's own tools — see Stereo is already spatial.
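
A toy example shows what "already present in the signal" can mean in the simplest case. The sketch below assumes a single source placed with a constant-power pan law, then recovers its pan position from nothing but the left/right energy ratio. The function names are hypothetical, and this is not a description of how HSR, RIPL, or any real upmixer works; real mixes contain many overlapping sources and need far more than one energy ratio.

```python
import math

def pan_constant_power(samples, pan):
    """Place a mono signal in stereo. pan in [0, 1]: 0 = hard left, 1 = hard right."""
    angle = pan * math.pi / 2.0
    left = [s * math.cos(angle) for s in samples]
    right = [s * math.sin(angle) for s in samples]
    return left, right

def estimate_pan(left, right):
    """Recover the pan position from the left/right energy ratio alone."""
    energy_left = sum(v * v for v in left)
    energy_right = sum(v * v for v in right)
    angle = math.atan2(math.sqrt(energy_right), math.sqrt(energy_left))
    return angle / (math.pi / 2.0)

# A short 440 Hz test tone at 48 kHz, panned a quarter of the way from the left.
tone = [math.sin(2.0 * math.pi * 440.0 * n / 48000.0) for n in range(480)]
left, right = pan_constant_power(tone, 0.25)
recovered = estimate_pan(left, right)
```

Here `recovered` comes back as approximately 0.25: the position was never stored as metadata, yet it can be read back from the two channels. A decoder that knows the position can then re-pan the source across a larger speaker array.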

→ Start: Stereo & the phantom image
