Source Directivity & Orientation

A loudspeaker placed at an angle and a trumpet placed at the same angle are not the same thing. The loudspeaker is, more or less, a device built to project sound forward; the trumpet is a physical object that throws its high frequencies in a tight beam down its bell, leaks its low frequencies in every direction, and changes everything about what you hear the instant the player turns to face away from you. If a spatializer treats every virtual source as a featureless point that radiates equally in all directions, it has thrown away one of the richest perceptual cues we have for distance, presence, intimacy, and the sense that a sound belongs to a body in a room.

This chapter is about directivity — the fact that real sources radiate unevenly as a function of direction — and about orientation, the related fact that a source has a facing, and that rotating it changes the render. These are two sides of one idea. Directivity describes the shape of the radiation pattern around the source; orientation describes how that fixed pattern is pointed relative to the listener and the room. Together they determine how much direct sound reaches the listener, how much energy is dumped into the room to become reverberation, and how the spectral balance of both changes as the source turns.

This is squarely a Part III topic: it is about what happens to a virtual source once it must behave like a real one in a real space. It builds directly on Distance & Air and Reverberation, and it constantly cashes out in the perceptual currency established in Psychoacoustics. The recurring theme of this guide applies with full force here: a spatializer does not merely place a source at an angle, it must reproduce the cues that make the placement believable, and directivity is one of the most under-served of those cues.

Real sources are not omnidirectional

The everyday evidence

Start with intuition you already possess. Stand in front of a friend who is talking and ask them to slowly turn their head away from you while they keep speaking at constant effort. The voice gets duller and slightly more distant before it gets much quieter. Consonants — the high-frequency sibilants and plosives that carry intelligibility — fade first. The body of the voice, the low and mid frequencies, survives the turn much better. You have just heard frequency-dependent directivity in action, and your auditory system has used it, without any conscious effort, to infer that the talker turned away rather than walked away. That distinction — turned versus walked — is precisely what a spatializer that ignores directivity cannot make.

Now consider a brass instrument. A trumpet or trombone radiates its fundamental and lower harmonics broadly, but above roughly 2 kHz the bell becomes an efficient directional horn and the upper harmonics form a beam that points where the bell points. Orchestral seating and recording practice both encode this knowledge: a trumpet section pointed at the back wall sounds very different from one pointed at the audience, and a close microphone slightly off the bell axis tames the brashness without losing the body.

Why the differences matter to a renderer

Each instrument family has a characteristic radiation signature, and these signatures are part of how we recognise and place instruments:

  • The human voice. Broad at low frequencies, increasingly forward-biased above 1–2 kHz, with a pronounced loss of high-frequency energy to the rear. The front-to-back level difference at 4 kHz can exceed 10 dB.
  • Brass (trumpet, trombone, horn). Strongly directional above 2 kHz along the bell axis. The French horn is a special case: its bell points backward and to the side, which is why horn players are often seated to reflect sound off a rear surface.
  • Bowed strings (violin, cello). Complex, lobed patterns governed by the vibration modes of the body and the f-holes. Directivity is irregular and frequency-dependent, with strong peaks and nulls that shift with frequency. This irregularity is part of the characteristic "spread" of a string section.
  • Guitar cabinets. An electric guitar speaker in a closed cab is quite directional in the highs (the source of the familiar advice to mic on-axis for bite, off-axis for warmth) and beams a narrow high-frequency lobe straight out of the cone.
  • Acoustic guitar, piano, percussion. Each radiates from an extended, resonant body with its own pattern; a grand piano with the lid up beams reflected sound in a direction set by the lid angle.
Key takeaway

The point for the renderer is not to memorise instrument zoology but to accept the consequence: direction and frequency are coupled at the source, before the room ever gets involved. A correct spatializer must apply a direction- and frequency-dependent gain to the direct sound, and — crucially — a different weighting to the energy it sends into the reverberation engine, because the room is excited by the total radiated power in all directions, not by the slice aimed at the listener.

Defining directivity

The radiation pattern

The fundamental object is the radiation pattern (or polar pattern): the radiated sound pressure as a function of direction, at a fixed distance in the far field, usually normalised to its maximum and plotted for one frequency at a time. In two dimensions it is the familiar polar plot; in three dimensions it is a balloon — a closed surface whose radius in each direction is the relative level radiated that way.

A few definitions make the rest of the chapter precise:

  • On-axis is the reference direction, conventionally the direction of maximum radiation and the direction the source is "facing."
  • Off-axis is any other direction, described by an angle $\theta$ from the axis (and, in 3D, an azimuth $\phi$ around it).
  • The far field is the distance regime where the pattern has stabilised and pressure falls off as $p \propto 1/r$ (see Distance & Air). Directivity is defined there; close to an extended source the pattern is not yet formed.

The directivity factor Q and the directivity index DI

To turn the pattern into a single number we ask: how much more intense is the on-axis sound than it would be if the same total acoustic power were radiated equally in all directions? That ratio is the directivity factor, $Q$.

Formally, $Q = I_{\text{axis}} / I_{\text{avg}}$, where $I_{\text{axis}}$ is the on-axis intensity at distance $r$ and $I_{\text{avg}}$ is the intensity an omnidirectional source of the same total power would produce at the same distance. An omni has $Q = 1$ by definition. A source that concentrates its energy forward has $Q > 1$.

Because we hear in decibels, $Q$ is usually quoted as the directivity index:

$$DI = 10\log_{10}(Q) \quad \text{(in dB)}.$$

So $Q = 1$ gives $DI = 0$ dB (omni), $Q = 2$ gives $DI \approx 3$ dB, $Q = 10$ gives $DI = 10$ dB, and so on. The DI tells you, in decibels, how much "free gain" on-axis you get from the source concentrating its power, relative to spreading it uniformly.

There is a second, closely related descriptor used for loudspeakers and microphones, the front-to-back ratio: the level on-axis minus the level at $180$ degrees. It is not the same as DI (DI integrates over the whole sphere, while the front-to-back ratio looks at one rear direction), but the two move together and both matter to a renderer, because the front-to-back ratio governs what a listener hears when a source turns around.

Worked example 1: from a pattern to $Q$ and $DI$. Take an ideal hemispherical radiator: a source mounted on a large rigid wall that radiates uniformly into the front half-space and nothing to the rear. The same power that an omni would spread over the full sphere (solid angle $4\pi$ steradians) is now spread over a hemisphere ($2\pi$ steradians). Intensity is power per area, so halving the area doubles the on-axis intensity: $Q = 2$. Therefore $DI = 10\log_{10}(2) \approx 3.0$ dB. A source baffled into a quarter-space (a corner) concentrates into $\pi$ steradians, giving $Q = 4$, $DI = 6$ dB. This is exactly why placing a subwoofer against a wall, and then into a corner, raises its output by roughly $3$ dB and then $6$ dB: directivity and boundary loading are the same physics seen from two directions.
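The arithmetic of Worked example 1 can be checked with a few lines of Python. This is a minimal sketch that assumes an ideal source radiating uniformly into a given solid angle and nothing elsewhere; the function names are illustrative, not taken from any library.

```python
import math


def directivity_index_db(q: float) -> float:
    """Directivity index in dB for a directivity factor Q: DI = 10*log10(Q)."""
    return 10.0 * math.log10(q)


def q_from_solid_angle(solid_angle_sr: float) -> float:
    """Q of an ideal source that spreads its power uniformly over the given
    solid angle (in steradians) and radiates nothing outside it."""
    return 4.0 * math.pi / solid_angle_sr


# Full sphere, half-space (one wall), quarter-space (two boundaries).
for label, omega in [("full sphere", 4.0 * math.pi),
                     ("half-space", 2.0 * math.pi),
                     ("quarter-space", math.pi)]:
    q = q_from_solid_angle(omega)
    print(f"{label}: Q = {q:.1f}, DI = {directivity_index_db(q):.1f} dB")
```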

Why sources get more directional with frequency

This is the single most important mechanism in the chapter, and it follows from one comparison: the size of the radiating area versus the wavelength of the sound.

A radiator behaves omnidirectionally when it is small compared with the wavelength it is producing, and it beams when it is large compared with the wavelength. The physical intuition: when the source is much smaller than a wavelength, every part of it is effectively in phase with every other part as seen from anywhere around it, so it pushes the surrounding air out symmetrically — a breathing sphere, radiating equally in all directions. When the source is large compared with the wavelength, different parts of its surface are many wavelengths apart as seen from off-axis directions, their contributions arrive with different phases and partially cancel off-axis while adding up on-axis. The result is a forward beam.

Wavelength $\lambda = c / f$, with $c \approx 343$ m/s in air. So:

  • At 100 Hz, $\lambda \approx 3.43$ m. A $0.3$ m guitar cone or a $0.15$ m mouth is tiny compared with this: omnidirectional.
  • At 1 kHz, $\lambda \approx 0.34$ m, comparable to a guitar cone or a trumpet bell: directivity is emerging.
  • At 10 kHz, $\lambda \approx 0.034$ m ($34$ mm). Now the same $0.3$ m cone is almost nine wavelengths across: a tight beam.
Rule of thumb

A useful rule of thumb: directivity becomes significant once the source dimension $D$ satisfies roughly $D > \lambda/2$, i.e. above a frequency near $f \approx c / (2D)$. For a $0.3$ m guitar speaker that transition sits around $343 / 0.6 \approx 570$ Hz; below it the cab is broadly omnidirectional, above it the beam tightens steadily.
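As a quick check of the size-versus-wavelength rule, the sketch below computes the wavelength and the approximate transition frequency $c/(2D)$. It assumes $c = 343$ m/s as in the text; the helper names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air, as used in the text


def wavelength_m(freq_hz: float) -> float:
    """Wavelength in metres: lambda = c / f."""
    return SPEED_OF_SOUND_M_S / freq_hz


def transition_frequency_hz(dimension_m: float) -> float:
    """Rule-of-thumb frequency above which a radiator of the given size
    starts to beam: f ~ c / (2 * D), i.e. where D = lambda / 2."""
    return SPEED_OF_SOUND_M_S / (2.0 * dimension_m)


for f in (100.0, 1000.0, 10000.0):
    print(f"{f:>7.0f} Hz: wavelength = {wavelength_m(f):.3f} m")

print(f"0.3 m cone: transition near {transition_frequency_hz(0.3):.0f} Hz")
```

The rule is only a rough boundary; the pattern changes gradually around that frequency rather than switching at it.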

Worked example 2: beam width of a circular piston. For a flat circular radiator of diameter $D$, the first off-axis null of the radiation pattern occurs at the angle $\theta$ where $\sin(\theta) \approx 1.22\,\lambda / D$. Take a $0.30$ m guitar cone at 5 kHz: $\lambda = 343/5000 = 0.0686$ m, so $\sin(\theta) = 1.22 \cdot 0.0686 / 0.30 = 0.279$, giving $\theta \approx 16$ degrees. The main lobe is only about $32$ degrees wide from null to null, which is why a guitar cab at 5 kHz throws a narrow, bright beam and why moving a microphone a few centimetres across the cone changes the tone so dramatically. Drop to 1 kHz and $\sin(\theta) = 1.22 \cdot 0.343 / 0.30 = 1.39$, which exceeds $1$, so there is no null at all: the pattern is a single broad lobe with no side lobes, far wider than the 5 kHz beam. One radiator, two completely different patterns, set entirely by frequency.
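The first-null calculation of Worked example 2 can be sketched as follows. It assumes the ideal baffled circular piston relation $\sin(\theta) \approx 1.22\,\lambda / D$ quoted above and $c = 343$ m/s; the function name is illustrative.

```python
import math
from typing import Optional


def first_null_angle_deg(diameter_m: float, freq_hz: float,
                         speed_of_sound: float = 343.0) -> Optional[float]:
    """Angle of the first off-axis null of an ideal circular piston,
    from sin(theta) = 1.22 * lambda / D.
    Returns None when 1.22 * lambda / D exceeds 1, i.e. no null exists."""
    sin_theta = 1.22 * (speed_of_sound / freq_hz) / diameter_m
    if sin_theta > 1.0:
        return None
    return math.degrees(math.asin(sin_theta))


for f in (1000.0, 5000.0):
    null = first_null_angle_deg(0.30, f)
    if null is None:
        print(f"{f:.0f} Hz: no null, single broad lobe")
    else:
        print(f"{f:.0f} Hz: first null at {null:.1f} deg, "
              f"main lobe about {2.0 * null:.0f} deg wide")
```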

This frequency dependence is not a nuisance to be averaged away — it is the cue. The fact that highs beam and lows spread is exactly what lets a listener hear orientation. A renderer that applies a single frequency-flat directivity gain captures none of it.

Idealised patterns

Before modelling messy real instruments, it helps to have a vocabulary of clean analytic patterns. These come historically from microphone theory (where they are realised by mixing an omnidirectional pressure component with a bidirectional pressure-gradient component) but they apply equally to sources and are the staple primitives offered by spatializers.

The first-order family

A first-order pattern is described by $g(\theta) = a + b\cos(\theta)$, where $\theta$ is measured from the on-axis direction and $a + b = 1$ (normalised on-axis). The balance between the omni term $a$ and the figure-of-eight term $b$ sweeps out the whole family:

  • Omnidirectional ($a = 1,\ b = 0$): equal in all directions. $DI = 0$ dB.
  • Cardioid ($a = 0.5,\ b = 0.5$): $g = 0.5(1 + \cos\theta)$. Maximum forward, a single deep null directly to the rear ($180$ degrees), heart-shaped.
  • Supercardioid ($a \approx 0.37,\ b \approx 0.63$): narrower front lobe, nulls at about $125$ degrees, a small rear lobe. Maximum ratio of front-hemisphere to rear-hemisphere energy in the family.
  • Hypercardioid ($a = 0.25,\ b = 0.75$): narrower still, nulls near $110$ degrees, larger rear lobe. Maximum directivity factor of the family ($Q = 4$).
  • Figure-of-eight / bidirectional ($a = 0,\ b = 1$): $g = \cos\theta$. Equal front and rear lobes, deep nulls at $90$ and $270$ degrees. $Q = 3$, $DI \approx 4.8$ dB.

A table of idealised patterns

| Pattern | $g(\theta)$ | Null angle(s) | Front/back | $Q$ | DI (dB) |
|---|---|---|---|---|---|
| Omni | $1$ | none | $0$ dB | $1$ | $0.0$ |
| Subcardioid | $0.7 + 0.3\cos\theta$ | none (min at $180$ deg) | $\sim 8$ dB | $\sim 1.9$ | $\sim 2.8$ |
| Cardioid | $0.5 + 0.5\cos\theta$ | $180$ deg | infinite (null) | $3$ | $4.8$ |
| Supercardioid | $0.37 + 0.63\cos\theta$ | $\sim 125$ deg | $\sim 11.7$ dB | $\sim 3.7$ | $\sim 5.7$ |
| Hypercardioid | $0.25 + 0.75\cos\theta$ | $\sim 110$ deg | $\sim 6$ dB | $4$ | $6.0$ |
| Figure-of-eight | $\cos\theta$ | $90,\ 270$ deg | $0$ dB | $3$ | $4.8$ |

Two subtleties worth noting. First, the cardioid has an infinite front-to-back ratio (a true null at the rear) but a lower $Q$ than the hypercardioid: front-to-back ratio and directivity index are genuinely different measures, and the cardioid optimises the former while the hypercardioid optimises the latter. Second, the figure-of-eight and the cardioid happen to share $Q = 3$; the figure-of-eight achieves it by rejecting the sides, the cardioid by rejecting the rear. For a renderer these primitives are cheap, frequency-flat building blocks. Their great limitation is exactly that flatness: a real cardioid source would have its pattern tighten with frequency, whereas the idealised $0.5(1 + \cos\theta)$ does not. We return to this in the modelling and limits sections.
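The $Q$ and front-to-back values quoted for the first-order family can be reproduced numerically. The sketch below assumes an axisymmetric pattern $g(\theta) = a + (1 - a)\cos\theta$ normalised on-axis, and evaluates $Q = 2 / \int_0^{\pi} g^2(\theta)\sin\theta\,d\theta$ with a simple midpoint rule; the function names are illustrative.

```python
import math


def first_order_gain(a: float, theta_rad: float) -> float:
    """g(theta) = a + b*cos(theta) with b = 1 - a (on-axis gain of 1)."""
    return a + (1.0 - a) * math.cos(theta_rad)


def directivity_factor(a: float, steps: int = 20000) -> float:
    """Q = on-axis intensity / sphere-averaged intensity for an axisymmetric
    pattern, integrated numerically with the midpoint rule."""
    d_theta = math.pi / steps
    integral = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        integral += first_order_gain(a, theta) ** 2 * math.sin(theta) * d_theta
    return 2.0 / integral


def front_to_back_db(a: float) -> float:
    """Level on-axis minus level at 180 degrees, where |g(180)| = |2a - 1|."""
    rear = abs(2.0 * a - 1.0)
    return math.inf if rear == 0.0 else 20.0 * math.log10(1.0 / rear)


for name, a in [("omni", 1.0), ("cardioid", 0.5), ("supercardioid", 0.37),
                ("hypercardioid", 0.25), ("figure-of-eight", 0.0)]:
    q = directivity_factor(a)
    print(f"{name:16s} Q = {q:.2f}  DI = {10.0 * math.log10(q):.1f} dB  "
          f"front/back = {front_to_back_db(a):.1f} dB")
```

For this family the integral also has a closed form, $Q = 1 / (a^2 + b^2/3)$, which the numerical result should match.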

Higher-order patterns

Microphone and source design can go beyond first order to $g(\theta) = a + b\cos(\theta) + c\cos^2(\theta) + \ldots$. Each added order narrows the main lobe and adds lobes and nulls, raising $Q$. A second-order pattern can reach a $DI$ of 7–9 dB. Real horns and waveguides are effectively high-order, frequency-varying patterns. In ambisonic terms (see Ambisonics) the order of the directivity pattern maps onto the spherical-harmonic order needed to represent it, which is why faithfully encoding a directional source into a soundfield representation requires more than first-order components.

Frequency-dependent directivity of real instruments and the voice

The voice as a moving, frequency-dependent radiator

The human voice is the case a renderer must get right, because dialogue and vocals dominate most productions and listeners are exquisitely sensitive to voice. The mouth is a small aperture (about 5 cm) set in the larger baffle of the head and torso. Consequences:

  • Below ~500 Hz the voice is nearly omnidirectional; you can hear someone humming a low note about as well from behind as in front.
  • From ~1 kHz upward the head shadows the rear and the pattern becomes forward-biased. By 4 kHz the front-to-back difference is typically 8–12 dB; by 8 kHz it can exceed $15$ dB.
  • The diffraction of the head adds a broad mid-frequency forward bias and side lobes, so the voice is not a clean cardioid; it is an irregular, frequency-dependent balloon.
Note

The practical upshot: when a virtual talker turns away, the renderer should not merely attenuate broadband, it should low-pass the direct sound progressively, because that is what nature does. A talker facing away sounds muffled and roomy, not merely quieter. The muffling (HF loss) and the increased room contribution are two independent consequences of the same rotation, and both are perceptual cues the auditory system reads as orientation. Tie this directly to Psychoacoustics: the brain's spectral expectations for a familiar source like the voice are strong, so violations (a back-turned voice that is merely quieter but still bright) read as "wrong" even to untrained listeners.

Instruments and the front/back spectral signature

Most acoustic instruments share the qualitative pattern — broadly omnidirectional at low frequencies, increasingly directional and forward-beaming at high frequencies — but the details differ and carry identity:

  • Trumpet: nearly omni below 500 Hz, a forward beam tightening steadily above 1.5 kHz, very directional by 4–5 kHz. The "brightness" of a trumpet is an on-axis phenomenon; off-axis it is mellow.
  • Violin: strongly lobed. The body radiates broadly at low frequencies; from 1 kHz upward there are pronounced directional peaks and nulls that rotate with frequency and with the instrument's tilt. There is significant upward and forward radiation that the room then diffuses.
  • Cello and double bass: large bodies, so directivity sets in lower; significant floor coupling.
  • Grand piano: the soundboard radiates downward and the raised lid reflects a frequency-dependent beam outward, which is why the lid angle and the piano's orientation on a stage matter to both players and recordists.

These signatures are why a single anechoic recording, panned to an angle, never quite convinces: the recording captured one direction of a complex balloon, and re-pointing it requires knowledge of the whole balloon, not just the captured slice.

How directivity shapes perceived distance, presence, and the direct-to-reverberant balance

Directivity feeds the room

This is the conceptual heart of the chapter and the strongest link to the rest of Part III. A source in a room produces two things at the listener: direct sound (the slice of the balloon aimed at the listener, attenuated by $1/r$ and air absorption per Distance & Air) and reverberant sound (the room's response to the total power radiated in all directions, per Reverberation).

Crucially, these two depend on different parts of the directivity pattern. The direct sound depends on $g(\theta_{\text{listener}})$, that is, only on the direction toward the listener. The reverberation depends on the total radiated power, which is the integral of the pattern over the whole sphere, and that total is, to first order, independent of where the source points. Turning a source does not change how much total power it radiates; it changes only how that power is distributed in direction. So when a source turns away from the listener:

  • The direct sound drops (the listener is now off-axis, $g(\theta)$ is smaller, and at high frequencies dramatically so).
  • The reverberant level stays essentially the same (total power unchanged).
  • Therefore the direct-to-reverberant ratio (DRR) falls — the source sounds farther away and more enveloped.
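The size of that DRR drop can be estimated with a short sketch. It assumes a frequency-flat cardioid source and a reverberant level that does not change with rotation, so the DRR change equals the direct-path change $20\log_{10} g(\theta)$; the function names are illustrative.

```python
import math


def cardioid_gain(theta_deg: float) -> float:
    """Idealised cardioid pressure gain, g = 0.5 * (1 + cos(theta))."""
    return 0.5 * (1.0 + math.cos(math.radians(theta_deg)))


def drr_change_db(theta_deg: float) -> float:
    """Change in direct-to-reverberant ratio (dB) when the listener moves
    from on-axis to theta degrees off-axis, reverberant level held constant."""
    g = cardioid_gain(theta_deg)
    if g <= 1e-12:  # at the rear null the direct sound vanishes
        return -math.inf
    return 20.0 * math.log10(g)


for angle in (0, 45, 90, 135, 180):
    print(f"{angle:3d} deg off-axis: DRR change = {drr_change_db(angle):.1f} dB")
```

A real source never reaches the ideal null, and its pattern varies with frequency, so this is a bound on the behaviour rather than a prediction for any instrument.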

The reverberation radius and how directivity moves it

There is a distance, the reverberation radius (or critical distance) $r_c$, at which direct and reverberant energy are equal. Inside it the direct sound dominates and the source sounds close and present; beyond it the reverberation dominates and the source sounds distant and diffuse. Its value depends on room absorption and on source directivity:

$$r_c \approx 0.057\sqrt{\frac{Q \cdot V}{RT_{60} \cdot \ldots}}$$

in one common form, or more transparently $r_c = 0.141\sqrt{Q \cdot A}$ where $A$ is the room's total absorption in metric sabins. The key dependence for us is $r_c \propto \sqrt{Q}$: a more directional source pushes its critical distance outward. A source aimed at you behaves as if the room were less reverberant, because more of its power reaches you directly relative to the room's diffuse field, even though the room and the total power are unchanged.

Worked example 3: directivity moves the critical distance. Suppose a room has a critical distance of $r_c = 2.0$ m for an omnidirectional source ($Q = 1$). Replace the source with a cardioid ($Q = 3$) aimed at the listener. Because $r_c \propto \sqrt{Q}$, the new critical distance is $2.0\sqrt{3/1} = 2.0 \cdot 1.73 = 3.46$ m. A listener who was sitting exactly at the old critical distance ($2.0$ m, equal direct and reverberant) now sits well inside the new critical distance: the source sounds markedly closer and more present, the reverberation has receded behind it, and not a single thing about the room changed. Now rotate that cardioid $180$ degrees so its null faces the listener: the listener sees $Q$ effectively far below $1$ in their direction, the direct sound collapses, and they are suddenly far outside the critical distance, so the source sounds distant and swallowed by the room. This single example contains the whole reason orientation belongs in a scene description.
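Worked example 3 can be reproduced in a few lines. The sketch assumes the $r_c = 0.141\sqrt{Q \cdot A}$ form given above and an ideal diffuse field in which the DRR at distance $r$ is $20\log_{10}(r_c / r)$ dB; the function names are hypothetical.

```python
import math


def critical_distance_m(q: float, absorption_sabins: float) -> float:
    """r_c = 0.141 * sqrt(Q * A), with A in metric sabins."""
    return 0.141 * math.sqrt(q * absorption_sabins)


def scaled_critical_distance_m(rc_omni_m: float, q: float) -> float:
    """Critical distance for directivity factor Q, given the omni (Q = 1)
    value in the same room: r_c scales with sqrt(Q)."""
    return rc_omni_m * math.sqrt(q)


def drr_db(distance_m: float, rc_m: float) -> float:
    """Direct-to-reverberant ratio in dB under the diffuse-field assumption:
    0 dB at r = r_c, falling 6 dB per doubling of distance."""
    return 20.0 * math.log10(rc_m / distance_m)


rc_omni = 2.0
rc_cardioid = scaled_critical_distance_m(rc_omni, 3.0)
print(f"omni r_c = {rc_omni:.2f} m, DRR at 2 m = {drr_db(2.0, rc_omni):.1f} dB")
print(f"cardioid r_c = {rc_cardioid:.2f} m, "
      f"DRR at 2 m = {drr_db(2.0, rc_cardioid):.1f} dB")
```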

Presence, intimacy, and "in the room with you"

Presence — the sense that a source is close, immediate, and bodily — is largely a high-frequency, high-DRR phenomenon. A source on-axis delivers its full high-frequency content directly, with a high DRR; the result reads as intimate and present. The same source off-axis delivers rolled-off highs and a lower DRR; it reads as distant, ambient, "across the room." A spatializer that wants to author intimacy versus distance has, in directivity and orientation, a far more natural control than a simple wet/dry knob — because turning a source affects highs and DRR together, exactly as the ear expects, whereas a wet/dry knob moving alone is a cue the ear can catch as artificial. The relationship to envelopment is developed in Direct, Diffuse & Envelopment.

Source orientation as an authorable scene parameter

What "facing" means

In an object-based scene (see Object-Based Audio) a source has not only a position but an orientation: a facing direction, ideally a full set of three angles (yaw, pitch, roll) defining how the source's intrinsic pattern is rotated in the world. The directivity balloon is defined in the source's own coordinate frame; orientation is the rotation that maps that frame into the world frame. To render, the engine computes the direction from source to listener in the source's frame and looks up the pattern there.

The minimal useful parameterisation is a single facing vector (where is the front pointing) plus the assumption of rotational symmetry about that axis. That is enough for a voice or a trumpet to first order. Asymmetric sources (a violin, a piano with a lid, a French horn) need the full three-angle orientation because their patterns are not symmetric about the facing axis.
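For the rotationally symmetric case, the look-up reduces to the angle between the facing vector and the source-to-listener direction. A minimal sketch, assuming 3D Cartesian positions in a shared world frame and hypothetical names throughout:

```python
import math
from typing import Sequence


def off_axis_angle_deg(source_pos: Sequence[float],
                       facing: Sequence[float],
                       listener_pos: Sequence[float]) -> float:
    """Angle (degrees) between the source's facing vector and the direction
    from source to listener. 0 = listener on-axis, 180 = directly behind."""
    to_listener = [l - s for l, s in zip(listener_pos, source_pos)]
    dist = math.sqrt(sum(c * c for c in to_listener))
    facing_len = math.sqrt(sum(c * c for c in facing))
    if dist == 0.0 or facing_len == 0.0:
        raise ValueError("listener coincides with source, or facing is zero")
    cos_theta = sum(f * t for f, t in zip(facing, to_listener)) / (dist * facing_len)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# A talker at the origin facing +x, with the listener straight ahead,
# to the side, and behind.
for listener in [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (-1.0, 0.0, 0.0)]:
    angle = off_axis_angle_deg((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), listener)
    print(f"listener at {listener}: {angle:.0f} deg off-axis")
```

An asymmetric source would instead need the full rotation (for example a matrix or quaternion) to express the listener direction as both $\theta$ and $\phi$ in the source frame; that is not shown here.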

How rotation changes the render

When a source rotates by some angle, three things change in the render, and a correct engine updates all three:

  1. The direct-sound gain and EQ change, because the listener now sits at a different $\theta$ on the balloon; because the balloon is frequency-dependent, this is a filter change, not just a gain change.
  2. The early reflections change, because each reflection leaves the source toward a wall, not toward the listener, so each early reflection samples the balloon at its own angle. A source turned toward a side wall can send a brighter first reflection off that wall than the (now off-axis) direct sound — a strong and realistic cue.
  3. The reverb send is essentially unchanged in level (total power conserved) but may change slightly in spectral content if the engine models the frequency-dependent power radiation.

The early-reflection coupling (point 2) is subtle and powerful. In a real room, a singer who turns toward the side wall produces a direct sound that is dull but a wall reflection that is bright — and listeners use that mismatch to nail the orientation. Engines that model early reflections individually (image-source or ray-based) can reproduce this; engines that use a single diffuse reverb cannot, and lose the cue. See Reverberation for the reflection models this depends on.

Dynamic orientation: turning in real time

Orientation is not just static authoring; it can be animated. A talker who turns their head, a guitarist who swings the neck, a performer who walks a circle while facing center — these are dynamic orientation changes, and the render must update the direct-sound filter, the reflection weighting, and the DRR continuously. Because directivity changes are spectral, they must be smoothed (interpolated) over time to avoid zipper artefacts, exactly as gain changes are smoothed for moving sources. Combined with motion, orientation animation is what makes a virtual performer feel embodied rather than placed. Note the interaction with Doppler: a source can move and rotate independently, and a complete engine treats translation (which drives delay and Doppler) and rotation (which drives the directivity filter) as separate degrees of freedom.
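One simple way to smooth the spectral change is a one-pole (exponential) smoother applied to per-band gains once per audio block. The sketch below is one possible choice, not a prescribed method; the band layout, time constant, and block size are assumptions for illustration.

```python
import math
from typing import List, Sequence


def smoothing_coefficient(time_constant_s: float, update_rate_hz: float) -> float:
    """Per-update coefficient of a one-pole smoother with the given time
    constant, when the smoother is stepped update_rate_hz times per second."""
    return 1.0 - math.exp(-1.0 / (time_constant_s * update_rate_hz))


def smooth_step(current: Sequence[float], target: Sequence[float],
                coeff: float) -> List[float]:
    """Move each band gain a fraction `coeff` of the way toward its target."""
    return [c + coeff * (t - c) for c, t in zip(current, target)]


# Assumed setup: 48 kHz audio, 256-sample blocks, 50 ms time constant.
coeff = smoothing_coefficient(0.05, 48000.0 / 256.0)
gains = [1.0, 1.0, 1.0]    # low, mid, high band gains while facing the listener
target = [0.9, 0.6, 0.3]   # hypothetical gains after the source turns away
for _ in range(40):        # 40 blocks is roughly 213 ms
    gains = smooth_step(gains, target, coeff)
print([round(g, 3) for g in gains])
```

Within a block, an engine would typically also ramp between the previous and the new gain so the update itself is not audible.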

Modelling directivity in a spatializer

The architecture: two filters, not one

The clean way to implement directivity is to insert a direction-dependent filter at two points in the per-source signal chain:

  • On the direct path, a filter $H_{\text{direct}}(\theta_{\text{listener}}, \phi_{\text{listener}}, f)$ that applies the balloon's response in the listener's direction. This filter is updated whenever the source moves, the source rotates, or the listener moves.
  • On the reverb send, a filter $H_{\text{send}}(f)$ representing the source's total radiated power spectrum, i.e. the power balloon integrated over the sphere. This filter is essentially orientation-independent (turning the source does not change its total power) and so updates rarely, only with the source's intrinsic spectrum.
Common mistake

Keeping these separate is what makes the DRR behave correctly when a source turns: the direct path loses high frequencies, the send does not, and the DRR falls in the highs exactly as in a real room. Engines that drive the reverb send from the direct (post-directivity) signal get this wrong — they make a back-turned source quieter in the reverb too, which the ear reads as the source having become smaller rather than turned away.
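The difference between the two routings can be shown for a single frequency band. This is a toy sketch with linear gains standing in for the two filters; all names and the example values are hypothetical, and levels are relative to the on-axis case.

```python
import math
from typing import Tuple


def to_db(linear_gain: float) -> float:
    return 20.0 * math.log10(linear_gain)


def band_levels_db(direct_gain: float, send_gain: float,
                   send_is_post_directivity: bool) -> Tuple[float, float, float]:
    """Return (direct dB, reverb-send dB, relative DRR dB) for one band.

    direct_gain: balloon gain toward the listener (changes with rotation).
    send_gain:   gain from the total radiated power (orientation-independent).
    send_is_post_directivity: True models the mistaken routing in which the
    reverb send is tapped after the directivity filter."""
    reverb_input = direct_gain if send_is_post_directivity else 1.0
    direct_db = to_db(direct_gain)
    reverb_db = to_db(reverb_input * send_gain)
    return direct_db, reverb_db, direct_db - reverb_db


# A high band of a source turned away: direct gain 0.25 (about -12 dB).
separate = band_levels_db(0.25, 1.0, send_is_post_directivity=False)
coupled = band_levels_db(0.25, 1.0, send_is_post_directivity=True)
print("separate filters: direct %.1f dB, send %.1f dB, DRR %.1f dB" % separate)
print("post-directivity: direct %.1f dB, send %.1f dB, DRR %.1f dB" % coupled)
```

With separate filters the DRR falls by the full direct-path loss; with the post-directivity send both paths drop together and the orientation cue in the DRR disappears.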

Idealised versus measured directivity data

There are two ways to obtain the pattern:

Idealised analytic patterns. Use a first- or higher-order function $g(\theta, f)$ whose coefficients vary with frequency to mimic the "tightens with frequency" behaviour. Cheap, controllable, and adequate for stylised work. A common practical model is a frequency-dependent cardioid: a low-frequency near-omni that morphs toward a cardioid or hypercardioid above a corner frequency, plus a first-order low-pass on the rear directions. With a handful of parameters this captures the essential voice-turning behaviour convincingly.
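A frequency-dependent cardioid of this kind might be sketched as below. The morphing law, the corner frequency, and the parameter names are assumptions chosen for illustration, not fitted to any measured source, and the additional rear low-pass mentioned above is left out.

```python
import math


def omni_weight(freq_hz: float, corner_hz: float = 1000.0,
                a_high: float = 0.5) -> float:
    """Omni coefficient a of g = a + (1 - a) * cos(theta).
    Close to 1 (omni) well below the corner frequency, approaching a_high
    well above it (a_high = 0.5 is a cardioid, 0.25 a hypercardioid)."""
    x = (freq_hz / corner_hz) ** 2
    return 1.0 - (1.0 - a_high) * x / (1.0 + x)


def pattern_gain(theta_deg: float, freq_hz: float,
                 corner_hz: float = 1000.0, a_high: float = 0.5) -> float:
    """Direction- and frequency-dependent pressure gain, on-axis gain 1."""
    a = omni_weight(freq_hz, corner_hz, a_high)
    return a + (1.0 - a) * math.cos(math.radians(theta_deg))


for f in (125.0, 500.0, 2000.0, 8000.0):
    side = 20.0 * math.log10(pattern_gain(90.0, f))
    rear = 20.0 * math.log10(pattern_gain(180.0, f))
    print(f"{f:6.0f} Hz: 90 deg = {side:6.1f} dB, 180 deg = {rear:6.1f} dB")
```

Because the on-axis gain stays at 1 for every frequency, turning the source changes only the off-axis spectrum, which is the behaviour the paragraph above describes.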

Measured directivity data. Real sources can be measured on a sphere of microphones (or one microphone and a turntable) to produce a directivity balloon sampled at many directions and many frequencies. The loudspeaker industry has standardised file formats for exactly this: GLL (the Generic Loudspeaker Library used by AFMG's EASE, see the GLL reverse-engineering notes in the broader DAM Audio work) and the open CLF (Common Loudspeaker Format), both of which store frequency-dependent balloon data on a regular angular grid (commonly $5$-degree steps). For instruments, research databases (notably the work out of TU Berlin and others) provide measured directivities of orchestral instruments. Rendering from measured data means interpolating the balloon between grid points in both angle and frequency, then realising the resulting magnitude (and possibly phase) response as a filter.
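The interpolation step can be sketched for a simplified case. The example assumes an axisymmetric balloon stored as a small table of dB values over off-axis angle and frequency, and interpolates bilinearly (linear in angle, linear in log-frequency). The table values are made up for illustration, and no GLL or CLF parsing is shown.

```python
import math
from bisect import bisect_right
from typing import List, Sequence, Tuple


def _bracket(grid: Sequence[float], x: float) -> Tuple[int, int, float]:
    """Clamp x to the grid and return the indices of its two neighbours
    plus the interpolation fraction between them."""
    x = min(max(x, grid[0]), grid[-1])
    hi = min(max(bisect_right(grid, x), 1), len(grid) - 1)
    lo = hi - 1
    return lo, hi, (x - grid[lo]) / (grid[hi] - grid[lo])


def interp_balloon_db(angles_deg: Sequence[float], freqs_hz: Sequence[float],
                      table_db: Sequence[Sequence[float]],
                      angle_deg: float, freq_hz: float) -> float:
    """Bilinear interpolation of table_db[angle_index][freq_index]."""
    i0, i1, ta = _bracket(angles_deg, angle_deg)
    log_freqs: List[float] = [math.log2(f) for f in freqs_hz]
    j0, j1, tf = _bracket(log_freqs, math.log2(freq_hz))
    low = table_db[i0][j0] * (1.0 - tf) + table_db[i0][j1] * tf
    high = table_db[i1][j0] * (1.0 - tf) + table_db[i1][j1] * tf
    return low * (1.0 - ta) + high * ta


# Hypothetical coarse balloon: rows are 0, 90, 180 degrees off-axis,
# columns are 1 kHz and 4 kHz, values are dB relative to on-axis.
ANGLES = [0.0, 90.0, 180.0]
FREQS = [1000.0, 4000.0]
TABLE = [[0.0, 0.0], [-2.0, -6.0], [-4.0, -12.0]]
print(interp_balloon_db(ANGLES, FREQS, TABLE, 45.0, 2000.0))
```

Real balloons are sampled in two angular dimensions and often carry phase, so a production interpolator would work on the sphere rather than on a single off-axis angle.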

| Approach | Pros | Cons | Use when |
|---|---|---|---|
| Idealised analytic pattern | Cheap, smooth, fully parametric, easy to animate | Not physically accurate, generic | Stylised mixing, games, low CPU budgets |
| Frequency-dependent parametric (e.g. morphing cardioid) | Captures HF beaming and front/back cheaply | Still an approximation | Voice and simple instruments, real-time |
| Measured balloon (GLL/CLF/research data) | Physically faithful, instrument-specific identity | Data scarce for instruments, heavy interpolation, CPU | High-fidelity reproduction, research, archival |

Realising the filter efficiently

The expensive part is that $H_{\text{direct}}$ is both direction-dependent and frequency-dependent and may need to update every audio block as things move. Strategies the field uses:

  • Frequency-band gains. Split the direct path into a few bands (e.g. octave bands) and apply a per-band gain interpolated from the balloon. A handful of bands captures most of the perceptual effect at a fraction of the cost of a full filter.
  • Low-order IIR shelving/peaking. Model the pattern's smooth magnitude with a couple of biquads whose parameters track the look-up angle. Very cheap, smooth to interpolate, ideal for the voice (one tilting high-shelf does most of the work).
  • Spherical-harmonic (ambisonic) source encoding. Represent the directivity as a set of spherical-harmonic coefficients; rotating the source is then a rotation in the SH domain, and the listener-direction sample is an SH evaluation. This is elegant for scenes with many rotating sources and connects naturally to Ambisonics.
  • Partitioned convolution with a measured directional impulse response, when full fidelity is required and CPU allows.
Tip

For most real-time spatialisers the sweet spot is a small number of bands or a couple of tracking biquads on the direct path, plus a fixed power-spectrum filter on the send — enough to make orientation audible and believable without the cost of full balloon convolution per source.

The playback-side analogue: loudspeaker directivity and coverage

Directivity is not only a property of the virtual source; it is a property of the real loudspeaker that ultimately reproduces the render, and the two interact. A spatializer that has lovingly modelled a source's directivity is still played back through transducers that have directivities of their own, and over a real coverage area those transducer patterns determine who actually hears the intended balance.

Why coverage is a directivity problem

A loudspeaker, like any source, beams its high frequencies. Off-axis listeners get less high-frequency energy directly, which is the same physics as the guitar cone in Worked Example 2. For a single listener at the sweet spot this is fine; for an audience spread across a coverage area it is a design problem: how do you give every seat a similar spectral balance when the speaker is more directional in the highs than in the lows? This is the central question of sound-system design, addressed at length in the Installation part of this guide (consult that part for arrays, splay angles, and coverage mapping). The directivity index of a loudspeaker as a function of frequency, sometimes summarised as its DI curve, is precisely the data (often stored in the GLL/CLF formats above) that designers use to predict coverage and even out the response across an audience using line arrays, constant-directivity horns, and electronic beam steering.

There is a clean conceptual symmetry worth stating: the same mathematics and the same file formats describe a virtual instrument's radiation and a physical loudspeaker's radiation. A directivity balloon is a directivity balloon. A renderer that consumes GLL/CLF data to model a virtual source is using the loudspeaker industry's tooling for source modelling; a system designer using GLL data to predict coverage is doing the inverse problem on the playback side. For immersive systems calibrated to a real room (covered in the Calibration part of this guide), the loudspeaker directivities also shape the room's reverberant field, so the playback-side directivity feeds the very reverberation the source-side directivity was trying to control. The two analogues are not merely parallel; in a real installation they are coupled.

The capture-side analogue: microphone polar patterns and placement

Microphones are directivity in reverse

The idealised patterns of the earlier table (omni, cardioid, supercardioid, figure-of-eight) originated as microphone patterns, and the reason is reciprocity: a microphone's directional sensitivity follows the same first-order mathematics as a source's directional radiation. This is why microphone choice and placement are, at bottom, directivity decisions, and why they are inseparable from the spatial result. The Recording part of this guide covers stereo and surround microphone techniques in depth; here we make only the structural point.

Placement is a directivity negotiation

When an engineer places a microphone in front of an instrument, two directivities meet: the instrument's radiation balloon and the microphone's pickup pattern. The captured sound is their product, integrated over the directions that reach the capsule directly plus the room arriving from all around. Consequences the renderer should appreciate:

  • A close microphone on a directional source captures one slice of the source balloon — the on-axis slice if aimed at the bell, an off-axis slice if angled away. This is why "mic position changes the tone": you are choosing which point on the frequency-dependent balloon to record.
  • The microphone's own pattern sets the direct-to-reverberant ratio at capture. A cardioid ($Q = 3$) at a given distance captures the same DRR as an omni placed about $\sqrt{3} \approx 1.73$ times closer: the same $r_c \propto \sqrt{Q}$ relationship as in Worked Example 3, now on the capture side. The directivity of the mic and the directivity of the source jointly set how "roomy" the recording is.
  • Spaced and coincident arrays exploit pattern directivity to encode spatial information (level and time differences) that later becomes the stereo or surround image — see Stereo and Surround & Matrix.
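The capture-side $\sqrt{Q}$ relationship in the list above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming idealised first-order patterns aimed straight at the source and a diffuse reverberant field; the function names and the small table of $Q$ values are illustrative, not part of any particular tool.

```python
import math

# Directivity factors of idealised first-order patterns (textbook values).
PATTERN_Q = {
    "omni": 1.0,
    "cardioid": 3.0,
    "figure_of_eight": 3.0,
    "hypercardioid": 4.0,
}


def distance_factor(q: float) -> float:
    """Factor by which a mic of directivity factor q can sit farther
    away than an omni while capturing the same DRR (on-axis source)."""
    return math.sqrt(q)


def equivalent_omni_distance(mic_distance_m: float, q: float) -> float:
    """Distance at which an omni would capture the same DRR as a
    directional mic of factor q placed at mic_distance_m."""
    return mic_distance_m / distance_factor(q)


for name, q in PATTERN_Q.items():
    print(f"{name:16s} Q={q:.1f}  distance factor={distance_factor(q):.2f}")
```

For example, a cardioid at 1 m matches an omni at roughly 0.58 m, which is the "1.73 times closer" figure quoted above.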

The lesson that closes the loop with the source-modelling sections: a recording is a sample of a source's directivity through a microphone's directivity in a particular room. When that recording is later re-spatialised, the engine inherits whatever slice of the balloon the microphone captured. This is the deep reason that anechoic, on-axis, multi-direction source measurements are so valuable for high-fidelity rendering — they capture the whole balloon rather than one negotiated slice of it.

Worked examples: rotating a talker, and omni versus cardioid

Worked example 4 — rotating a virtual talker from facing to back-turned

Take a virtual talker 3 m from the listener in a room whose omnidirectional critical distance is $r_c = 2.0$ m. Model the voice with a simple but realistic frequency-dependent pattern: near-omni at 250 Hz, and a front-to-back attenuation that grows with frequency, say 4 dB at 1 kHz, 9 dB at 2 kHz, and 14 dB at 4 kHz when fully reversed (180 degrees). These figures are typical of measured human-voice directivity.

Facing the listener ($\theta = 0$). The direct sound delivers the full spectrum on-axis. Because the source is at 3 m and $r_c = 2.0$ m, the listener is somewhat beyond critical distance, so reverberation already slightly dominates; still, the on-axis highs give the voice presence. Compute the DRR. Direct energy goes as $g(0)^2 / r^2$; reverberant energy is fixed by total power. At the critical distance DRR is 0 dB by definition, and the DRR energy ratio scales as $(r_c / r)^2$ for an omni-equivalent direct field, so at 3 m: $DRR_{\text{broadband}} = 10\log_{10}\big((r_c/r)^2\big) = 20\log_{10}(2.0/3.0) = 20\log_{10}(0.667) = -3.5$ dB. The voice is a touch distant but clearly intelligible, highs intact.

Turned 180 degrees (back to the listener). The direct sound is now attenuated by the front-to-back figures: roughly 4 dB at 1 kHz, 9 dB at 2 kHz, 14 dB at 4 kHz, and only a fraction of a dB at 250 Hz. The reverberant field is unchanged (total power conserved). So the DRR in each band falls by the front-to-back attenuation of that band:

| Band | Front-to-back loss | DRR facing | DRR back-turned | Change |
|---|---|---|---|---|
| 250 Hz | ~0.5 dB | -3.5 dB | -4.0 dB | -0.5 dB |
| 1 kHz | ~4 dB | -3.5 dB | -7.5 dB | -4 dB |
| 2 kHz | ~9 dB | -3.5 dB | -12.5 dB | -9 dB |
| 4 kHz | ~14 dB | -3.5 dB | -17.5 dB | -14 dB |

Read down the "back-turned" column: the highs have collapsed relative to the lows, and every band's DRR has dropped, with the drop strongly frequency-weighted. Perceptually this is unmistakable: the voice has become muffled (the 4 kHz band is now 13.5 dB more reverberant-dominated than the 250 Hz band) and generally more distant and roomy. The listener hears "turned away," not "walked away," precisely because the low frequencies barely changed while the highs fell off a cliff. Had the engine applied a single broadband attenuation, it might have matched the loudness change but would have kept the highs bright and the spectral DRR flat, and the result would read as a smaller or quieter talker, not a turned one. This is the cue, quantified.
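The per-band arithmetic of this worked example is easy to reproduce. The following is a minimal sketch using only the figures quoted above (3 m, $r_c = 2.0$ m, and the illustrative front-to-back losses); the function and constant names are hypothetical.

```python
import math

R_C_OMNI_M = 2.0   # omnidirectional critical distance from the example
DISTANCE_M = 3.0   # talker-to-listener distance from the example

# Front-to-back loss per band (Hz -> dB), as quoted in the example.
FRONT_TO_BACK_LOSS_DB = {250: 0.5, 1000: 4.0, 2000: 9.0, 4000: 14.0}


def drr_facing_db(r_c: float, r: float) -> float:
    """Broadband DRR for an omni-equivalent direct field: 20*log10(r_c / r)."""
    return 20.0 * math.log10(r_c / r)


def drr_table(r_c: float, r: float, loss_db: dict) -> dict:
    """Return {band: (DRR facing, DRR back-turned)} in dB.

    The reverberant level is held constant (total power conserved), so the
    back-turned DRR is the facing DRR minus that band's front-to-back loss.
    """
    facing = drr_facing_db(r_c, r)
    return {band: (facing, facing - loss) for band, loss in loss_db.items()}


for band, (facing, back) in drr_table(
    R_C_OMNI_M, DISTANCE_M, FRONT_TO_BACK_LOSS_DB
).items():
    print(f"{band:>5} Hz  facing {facing:+.1f} dB  back-turned {back:+.1f} dB")
```

The spread between the 250 Hz and 4 kHz back-turned values (13.5 dB) is the spectral tilt that a single broadband gain cannot produce.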

A subtle bonus: if the engine models early reflections, a side wall reflection now samples the balloon at, say, 100 degrees rather than 180, so it loses less high-frequency energy than the direct path; the back-turned voice may actually arrive brightest via the wall, exactly as in reality. That mismatch between a dull direct path and a brighter reflected path is among the most convincing orientation cues a renderer can produce.
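To see where a reflection samples the balloon, one option is the image-source construction: mirror the listener across the wall and measure the angle between the source's facing and the direction to that image. The sketch below is a 2D illustration under assumed geometry (a wall along the line x = wall_x, positions in metres); the helper names are hypothetical and a real engine would work in 3D.

```python
import math


def launch_angle_deg(source_xy, facing_deg, target_xy):
    """Angle in [0, 180] between the source's facing direction and the
    direction from the source to target_xy (2D, degrees)."""
    dx = target_xy[0] - source_xy[0]
    dy = target_xy[1] - source_xy[1]
    bearing = math.degrees(math.atan2(dy, dx))
    # Wrap the difference into [-180, 180) before taking the magnitude.
    diff = (bearing - facing_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def mirror_across_wall_x(point_xy, wall_x):
    """Image of a point reflected in a wall lying along x = wall_x."""
    return (2.0 * wall_x - point_xy[0], point_xy[1])


source = (0.0, 0.0)
listener = (0.0, 3.0)   # 3 m away along +y
facing = -90.0          # talker faces -y, i.e. back to the listener
wall_x = 2.0            # assumed side wall 2 m to the talker's side

direct = launch_angle_deg(source, facing, listener)
reflected = launch_angle_deg(source, facing, mirror_across_wall_x(listener, wall_x))
print(f"direct path leaves at {direct:.1f} deg, wall reflection at {reflected:.1f} deg")
```

In this assumed layout the direct path leaves at 180 degrees while the wall reflection leaves at about 127 degrees, so a frequency-dependent balloon looked up at those two angles would give the reflection the brighter spectrum.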

Worked example 5 — omni versus cardioid source in the same room

Now compare two sources of equal total power in the same room, listener at 3 m, $r_c(\text{omni}) = 2.0$ m:

  • Omni source ($Q = 1$). Critical distance 2.0 m; listener at 3 m is beyond it; broadband DRR $= 20\log_{10}(2.0/3.0) = -3.5$ dB. Sounds distant, diffuse, equally so in all directions: the room dominates.
  • Cardioid source aimed at the listener ($Q = 3$). Critical distance rises to $2.0\sqrt{3} = 3.46$ m; listener at 3 m is now inside it; broadband DRR $= 20\log_{10}(3.46/3.0) = +1.2$ dB. Same room, same total power, same distance, but the direct sound now slightly dominates the reverberation. The cardioid source sounds closer and more present than the omni by about $1.2 - (-3.5) = 4.7$ dB of DRR (exactly $10\log_{10} 3 \approx 4.8$ dB before rounding), even though both radiate the same total acoustic power and excite the room equally.

The interpretation is the crux of the whole chapter. Two sources can pour identical total energy into a room, exciting identical reverberation, yet sound at completely different distances, purely because of how they aim their energy. Distance perception is not just about $1/r$ attenuation and air absorption (the subjects of Distance & Air); it is also about directivity and orientation setting the DRR. A spatializer that models only distance attenuation can never reproduce the difference between these two sources. One that models directivity can author it with a single parameter: the pattern, or equivalently the facing.
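The comparison in this worked example reduces to one formula, $DRR = 20\log_{10}(r_c\sqrt{Q}/r)$, which can be sketched as follows. The function names are illustrative, and the sketch assumes an on-axis listener and a diffuse reverberant field, as the example does.

```python
import math


def critical_distance_m(r_c_omni_m: float, q: float) -> float:
    """Critical distance for a source of on-axis directivity factor q,
    given the omnidirectional critical distance of the room."""
    return r_c_omni_m * math.sqrt(q)


def drr_db(r_c_omni_m: float, q: float, distance_m: float) -> float:
    """On-axis broadband DRR in dB: 20*log10(r_c * sqrt(q) / r)."""
    return 20.0 * math.log10(critical_distance_m(r_c_omni_m, q) / distance_m)


omni = drr_db(2.0, 1.0, 3.0)
cardioid = drr_db(2.0, 3.0, 3.0)
print(f"omni     DRR = {omni:+.1f} dB")
print(f"cardioid DRR = {cardioid:+.1f} dB")
print(f"difference   = {cardioid - omni:.2f} dB")  # equals 10*log10(Q)
```

Note that the difference between the two sources is $10\log_{10} Q$ regardless of room or distance; only the absolute DRR values depend on $r_c$ and $r$.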

Limits

Data availability

The honest limit is data. The loudspeaker industry has mature, standardised balloon data (GLL, CLF); the instrument world does not. Measured directivity databases for orchestral instruments exist but are limited in coverage, in the number of articulations and dynamics measured, and in angular and frequency resolution. The voice is well characterised on average but varies with vowel, effort, and individual anatomy. For most instruments a renderer must fall back on idealised or parametric patterns, accepting that it captures the behaviour (highs beam, lows spread, turning dulls and adds room) without the instrument-specific identity that measured data would give.

CPU cost of frequency-dependent filtering

A truly faithful directivity render needs a frequency-dependent filter per source on the direct path, updated as the source and listener move, with smooth interpolation to avoid artefacts — and a separate send filter. Multiply that across dozens or hundreds of sources in an interactive scene and the cost is real. The mitigations from the modelling section (a few bands, a couple of tracking biquads, SH encoding) are approximations chosen precisely to fit a CPU budget. There is a genuine fidelity/cost trade-off here, and it is why many shipping spatialisers expose directivity as an optional or simplified feature rather than a full balloon convolution.

Interpolation and the moving listener

Measured balloons are sampled on a grid (often 5 degrees, in octave or third-octave bands). As a source or listener moves, the look-up direction sweeps continuously across grid points, and naive interpolation can produce audible spectral wobble or comb artefacts, especially where the real pattern has sharp nulls. Smooth, perceptually weighted interpolation (and sometimes deliberate smoothing of deep nulls, which are rarely perceptually critical) is required. The deep nulls of idealised patterns are themselves a limit: a true cardioid null at 180 degrees gives infinite attenuation, which is unphysical and unpleasant; real renderers floor the attenuation at a finite value.
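As a rough illustration of flooring and grid interpolation, the sketch below samples an idealised cardioid on a 5-degree grid, clamps it at a floor, and interpolates linearly in dB between grid points. The -30 dB floor and the linear-in-dB interpolation are assumptions chosen for simplicity; they stand in for, and are not equivalent to, the perceptually weighted schemes a production renderer might use.

```python
import math

FLOOR_DB = -30.0   # assumed attenuation floor; tune by ear
STEP_DEG = 5.0     # grid spacing, matching the typical figure in the text


def cardioid_gain_db(theta_deg: float, floor_db: float = FLOOR_DB) -> float:
    """Idealised cardioid gain in dB, clamped so the rear null stays finite."""
    g = 0.5 + 0.5 * math.cos(math.radians(theta_deg))
    if g <= 1e-12:          # exact null: log10 would blow up
        return floor_db
    return max(20.0 * math.log10(g), floor_db)


def interp_gain_db(grid_db, theta_deg: float, step_deg: float = STEP_DEG) -> float:
    """Linear interpolation in dB between neighbouring grid points.

    grid_db holds gains at 0, step, 2*step, ... 180 degrees.
    """
    theta = min(max(theta_deg, 0.0), 180.0)
    pos = theta / step_deg
    i = min(int(pos), len(grid_db) - 2)
    frac = pos - i
    return (1.0 - frac) * grid_db[i] + frac * grid_db[i + 1]


GRID_DB = [cardioid_gain_db(a) for a in range(0, 181, 5)]  # 37 samples

for angle in (0.0, 92.5, 180.0):
    print(f"{angle:6.1f} deg -> {interp_gain_db(GRID_DB, angle):7.2f} dB")
```

Interpolating in dB rather than in linear gain keeps the swept response from dipping between grid points near the floored null, which is one simple way to reduce audible wobble.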

What directivity does not capture

Directivity as discussed here is a far-field, single-point abstraction. It does not capture near-field behaviour (where an extended source does not present a single direction at all — think of standing right next to a grand piano, where different frequencies clearly come from different parts of the body), nor source extent (a choir or a piano is spatially large, not a point), nor the radiation phase structure that matters for very precise transaural and wavefield reconstruction (see Transaural and WFS). These are separate modelling problems layered on top of directivity, and a complete spatial scene may need all of them.

Common mistakes and pitfalls

warning
  • Treating directivity as a single broadband gain. The whole perceptual payload is in the frequency dependence. A flat directivity gain changes loudness but not the spectral cue, and the ear reads it as a level change, not an orientation change. Always model directivity as a filter.
  • Sending the post-directivity direct signal to the reverb. This couples the reverb level to orientation and destroys the conserved-power behaviour. A back-turned source should keep filling the room; only its direct path should dull. Drive the send from a separate, orientation-independent power-spectrum filter.
  • Ignoring orientation entirely while animating position. A talker who walks across the scene but never changes facing is a half-rendered talker; real talkers turn, and the absence of orientation cues makes motion feel like a sliding loudspeaker rather than a person.
  • Using infinitely deep idealised nulls. A perfect cardioid rear null or figure-of-eight side null gives unphysical total silence and nasty interpolation artefacts. Floor the attenuation.
  • Forgetting the early-reflection coupling. Applying directivity only to the direct path and feeding a diffuse reverb misses the bright-reflection-from-a-dull-source cue that sells orientation. If the engine has discrete early reflections, weight each by the balloon at its launch angle.
  • Confusing front-to-back ratio with directivity index. They are different measures (one rear direction versus the whole-sphere integral); a cardioid maximises the former, a hypercardioid the latter. Quote the one your application needs.
  • Mismatching source and playback directivity assumptions. A render authored for a directional listening situation, reproduced over loudspeakers with very different directivity in a live room, can have its carefully built DRR undone by the room's response to the speakers' directivity. Keep the playback-side analogue (Installation and Calibration parts) in view.
  • Over-trusting scarce instrument data. A single measured balloon for "violin" hides enormous variation across instruments, players, dynamics, and articulations. Treat measured instrument directivity as representative, not definitive.
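The distinction between front-to-back ratio and directivity index in the list above can be made concrete for the idealised first-order family $g(\theta) = a + (1 - a)\cos\theta$. This is a minimal sketch: the closed form for $Q$ follows from integrating $g^2$ over the sphere, the parameter values ($a = 0.5$ for cardioid, $a = 0.25$ for hypercardioid) are the standard textbook ones, and the function names are illustrative.

```python
import math


def first_order_gain(a: float, theta_rad: float) -> float:
    """First-order pattern g(theta) = a + (1 - a) * cos(theta), with g(0) = 1."""
    return a + (1.0 - a) * math.cos(theta_rad)


def directivity_factor(a: float) -> float:
    """Q = 4*pi / (integral of g^2 over the sphere) = 1 / (a^2 + (1-a)^2 / 3)."""
    return 1.0 / (a * a + (1.0 - a) ** 2 / 3.0)


def directivity_index_db(a: float) -> float:
    """DI = 10*log10(Q): on-axis level relative to the whole-sphere average."""
    return 10.0 * math.log10(directivity_factor(a))


def front_to_back_db(a: float) -> float:
    """Level at 0 degrees relative to the single direction at 180 degrees."""
    rear = abs(first_order_gain(a, math.pi))
    return math.inf if rear < 1e-12 else 20.0 * math.log10(1.0 / rear)


for name, a in (("omni", 1.0), ("cardioid", 0.5), ("hypercardioid", 0.25)):
    print(f"{name:14s} DI = {directivity_index_db(a):4.2f} dB   "
          f"front-to-back = {front_to_back_db(a):.2f} dB")
```

The cardioid shows an unbounded front-to-back ratio but a DI of about 4.8 dB, while the hypercardioid trades a modest 6 dB front-to-back ratio for the higher DI of about 6.0 dB, which is why the two measures should not be quoted interchangeably.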

References

  • Heinrich Kuttruff, Room Acoustics, 6th ed., CRC Press, 2016 — critical distance, reverberant field, and the relationship between source power and steady-state room level.
  • Jürgen Meyer, Acoustics and the Performance of Music, 5th ed., Springer, 2009 — the definitive reference on the frequency-dependent directivity of orchestral instruments and the singing/speaking voice.
  • Jens Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization, revised ed., MIT Press, 1997 — perception of distance, presence, and the cues the auditory system extracts.
  • Leo L. Beranek, Acoustics: Sound Fields and Transducers (with Tim Mellow), Academic Press, 2012 — piston radiation, directivity factor $Q$ and index DI, and the wavelength-versus-size mechanism.
  • David Griesinger, "The Psychoacoustics of Apparent Source Width, Spaciousness and Envelopment in Performance Spaces," Acta Acustica united with Acustica, 1997 — how direct/reverberant balance and reflections shape perceived distance and envelopment.
  • Francis Rumsey, Spatial Audio, Focal Press, 2001 — microphone polar patterns, capture-side directivity, and spatial recording technique.
  • Jean-Marc Jot, "Efficient Models for Reverberation and Distance Rendering in Computer Music and Virtual Audio Reality," Proc. ICMC, 1997 — separating direct and reverberant paths and rendering distance/DRR in a spatializer.
  • AES56-2008 (r2019), AES standard on acoustics — Sound source modeling — Loudspeaker polar radiation measurements — standardised measurement and representation of loudspeaker directivity balloons.
  • ISO 3382 (Parts 1–2), Acoustics — Measurement of room acoustic parameters — reverberation time and the room measures underlying the critical-distance relationships used here.
