Equalization & Room Correction

Equalization is where the abstractions of the earlier parts of this guide meet the stubborn physics of an actual room. The panning laws of amplitude panning, the timbral neutrality assumed by object-based rendering, the precedence-effect cues that make localization work (see psychoacoustics) — all of them quietly assume that each loudspeaker delivers a clean, even, predictable response at the listening position. Real rooms violate that assumption. Walls reflect, air absorbs, modes resonate, and two nominally identical speakers in two corners measure nothing alike. Equalization and room correction are the tools we use to claw back the neutrality the system design promised.

This chapter builds the subject from first principles: what a room does to a frequency response, which of those effects EQ can legitimately fix and which it physically cannot, what we should be aiming at (the target curve), how the correction is actually realised in IIR or FIR filters, and — crucially for spatial audio — how to make every speaker share a timbre so that a panned source does not change colour as it travels across the array. We close with a step-by-step procedure and an honest accounting of the limits.

This chapter assumes you have already laid out and time-aligned the system; see Speaker Layouts & Topologies and Time Alignment & Phase. EQ comes after geometry and timing are correct, never before.

The goal: a neutral, even response without fighting physics

The objective of room EQ is deceptively simple to state: at the listening area, each loudspeaker should reproduce the intended spectral balance, with a smooth response free of gross peaks and dips, and all loudspeakers should match one another. Achieving this is not about chasing a ruler-flat line on a measurement. It is about distinguishing the problems that are electrical and minimum-phase (correctable) from those that are acoustic and spatial (not correctable by filtering at all).

A loudspeaker in a room produces a sound field that is the sum of the direct sound and a dense cloud of reflections. Our hearing separates these in time: the direct sound and early-arriving energy dominate timbre and localization, while the later reverberant field sets the sense of space and the direct-to-diffuse balance. A single microphone at one point captures their complex sum — including interference notches that exist only at that point. If you EQ to flatten that single measured curve, you "correct" artifacts that another listener never hears, and you damage the direct sound that everybody hears.

What EQ can fix

The on-axis or anechoic response errors of the loudspeaker itself (driver roll-off, crossover ripple, baffle effects) — these are minimum-phase and consistent everywhere.
Broad tonal tilts caused by the room's average absorption (a "dark" or "bright" room), correctable because they are smooth and spatially consistent.
Low-frequency room modes, partially and carefully, because below the Schroeder frequency the field is dominated by a few resonances that are themselves minimum-phase at the peaks.
Speaker-to-speaker level and timbre mismatches, so the array is coherent.

What EQ cannot fix

Deep interference nulls from a single reflection (e.g. floor bounce, the acoustic comb filter) — these are non-minimum-phase and position-dependent.
Reverberation decay time and the late energy that smears transients — that is an absorption problem, not a gain problem.
Poor direct-to-reverberant ratio from a speaker that is too far or aimed wrong — geometry, not filtering.
Localization or imaging errors from bad placement.

Key takeaway

EQ changes the magnitude (and possibly phase) of the signal sent to a speaker. It cannot change where energy goes in space or how long the room rings. If a problem is spatial (varies as you move) or temporal (a decay), reach for absorption, diffusion, placement, or subwoofer optimization — not a filter.

Below a certain frequency, a room does not behave like a diffuse acoustic space; it behaves like a resonator. Standing waves form between parallel surfaces, reinforcing some frequencies and cancelling others, and these room modes are the single largest source of low-frequency unevenness in any small room. Understanding them is prerequisite to any sane bass EQ.

Standing waves and mode types

A standing wave forms when a wavelength fits an integer number of half-wavelengths between two boundaries. The pressure is maximal (an antinode) at the walls and zero (a node) at certain interior planes. At an antinode the response peaks; at a node it disappears entirely. Modes come in three families, in order of strength:

Axial modes involve two opposite surfaces (front–back, left–right, floor–ceiling). They carry the most energy and dominate.
Tangential modes involve four surfaces and are roughly 3 dB weaker.
Oblique modes involve all six surfaces and are weaker still (~6 dB down).

The full modal frequencies of a rigid rectangular room follow:

f_{n_x, n_y, n_z} = \frac{c}{2}\sqrt{\left(\frac{n_x}{L_x}\right)^2 + \left(\frac{n_y}{L_y}\right)^2 + \left(\frac{n_z}{L_z}\right)^2}

where $c \approx 343\ \text{m/s}$ is the speed of sound, $L_x, L_y, L_z$ are the room dimensions, and $n_x, n_y, n_z$ are non-negative integers (not all zero). An axial mode has two of the indices equal to zero, so for the length axis the expression collapses to the familiar:

f = \frac{c}{2 L}

Worked example: axial modes of a control room

Take a room $L_x = 6.5\ \text{m}$ long, $L_y = 4.8\ \text{m}$ wide, $L_z = 2.9\ \text{m}$ high.

The fundamental length-axis axial mode:

f_{1,0,0} = \frac{343}{2 \times 6.5} \approx 26.4\ \text{Hz}

Its harmonics fall at 52.8, 79.2, 105.6 Hz, etc. The width axis:

f_{0,1,0} = \frac{343}{2 \times 4.8} \approx 35.7\ \text{Hz}

with harmonics at 71.5, 107.2 Hz. The height axis:

f_{0,0,1} = \frac{343}{2 \times 2.9} \approx 59.1\ \text{Hz}

A tangential mode combining length and width, $f_{1,1,0}$ :

f_{1,1,0} = \frac{343}{2}\sqrt{\left(\frac{1}{6.5}\right)^2 + \left(\frac{1}{4.8}\right)^2} \approx \frac{343}{2}\sqrt{0.0237 + 0.0434} \approx 44.4\ \text{Hz}

The table below lists the lowest modes; notice how, even in a reasonably proportioned room, the modes are not evenly spaced — there is a cluster near 105–107 Hz and gaps elsewhere. Gaps mean weak coupling (a dip); clusters mean reinforcement (a peak).

Mode $(n_x,n_y,n_z)$	Type	Frequency (Hz)
(1,0,0)	Axial L	26.4
(0,1,0)	Axial W	35.7
(1,1,0)	Tangential	44.4
(2,0,0)	Axial L	52.8
(0,0,1)	Axial H	59.1
(0,1,1)	Tangential	69.0
(2,1,0)	Tangential	64.2
(0,2,0)	Axial W	71.5
(3,0,0)	Axial L	79.2
(2,0,1)	Tangential	79.1
(4,0,0)	Axial L	105.6
(0,3,0)	Axial W	107.2

The Schroeder frequency

Above some frequency the modes become so dense and overlapping that they merge into a statistically diffuse field, and the room stops behaving as a set of discrete resonators. That boundary is the Schroeder frequency:

f_S \approx 2000\sqrt{\frac{RT_{60}}{V}}

with $RT_{60}$ in seconds and volume $V$ in cubic metres. For our example room, $V = 6.5 \times 4.8 \times 2.9 \approx 90.5\ \text{m}^3$ . With a typical control-room $RT_{60}$ of 0.3 s:

f_S \approx 2000\sqrt{\frac{0.3}{90.5}} \approx 2000 \times 0.0576 \approx 115\ \text{Hz}

Below ~115 Hz this room is modal; above it, it is statistical. This boundary matters enormously for EQ strategy: below $f_S$ , EQ (and especially multi-subwoofer optimization) can help because a few resonances dominate; above $f_S$ , the response is governed by reflections that vary point-to-point, and broadband EQ should be gentle and based on spatial averages. This ties directly to the treatment of decay and the diffuse field in Reverberation.

Modal density grows fast

The number of modes below a frequency $f$ scales roughly as $N(f) \approx \frac{4\pi V}{3}\left(\frac{f}{c}\right)^3$ . At 100 Hz in our room there are only a handful of modes; by 1 kHz there are tens of thousands. That is why room "correction" is really two different problems on either side of $f_S$ .

Minimum-phase vs non-minimum-phase: what is even correctable

This is the most important conceptual distinction in room correction, and the one most often ignored by automatic systems and amateurs alike.

A minimum-phase system has a unique relationship between its magnitude and phase response: knowing one fully determines the other. If a problem is minimum-phase, an inverse filter (a reciprocal EQ) can cancel both its magnitude error and its phase error simultaneously, restoring a flat response. Loudspeaker driver roll-offs, crossover dips, and the peaks of room resonances are essentially minimum-phase.

A non-minimum-phase problem has phase behaviour that is not tied to its magnitude — it contains pure delay or all-pass-like behaviour. The canonical example is an interference null caused by a delayed reflection summing with the direct sound. At the cancellation frequency, the two contributions are equal in magnitude and opposite in phase, so they subtract toward zero. You cannot fill that null with EQ: boosting the electrical signal boosts both the direct and the reflected copy, they continue to cancel, and you simply pump more energy into the room (heating the amplifier and exciting other positions) without raising the level at the null.

Consider a floor-bounce comb filter. The direct path is $d_1$ and the reflected path is $d_2 = d_1 + \Delta d$ . The first cancellation occurs when the path difference equals a half wavelength:

\Delta d = \frac{\lambda}{2} \quad\Rightarrow\quad f_{\text{null}} = \frac{c}{2\,\Delta d}

If $\Delta d = 0.6\ \text{m}$ , the first null sits at $f = 343/(2 \times 0.6) \approx 286\ \text{Hz}$ , with further nulls at odd multiples. Move the microphone 30 cm and $\Delta d$ changes, so the null moves. A filter tuned to 286 Hz is wrong everywhere except the one mic position — and even there it does not actually fill the null, it just makes the measurement look filled while wrecking the response a seat away.

The practical test

In measurement software (REW, Smaart) you can compute the excess-phase or excess-group-delay response, or compare the measured impulse response against its minimum-phase reconstruction. Where they agree, the deviation is minimum-phase and EQ-able. Where the measured phase departs (especially at deep, narrow dips), it is non-minimum-phase: leave it alone.

Never EQ a deep dip

Boosting a deep, narrow null is the single most destructive mistake in room correction. The null does not fill; instead the boost pours energy into the room at that frequency, raising the level elsewhere, eating amplifier headroom, and risking driver damage. As a rule: cut peaks, do not boost nulls. Treat narrow dips (under roughly a one-third-octave wide) as off-limits unless you have proven they are minimum-phase.

Target curves: what "correct" looks like

If we are not aiming at flat, what are we aiming at? The answer is the target curve — the desired in-room magnitude response, measured with a defined averaging method.

Why not flat in-room?

A loudspeaker that measures flat anechoically (on axis, free field) will measure with a gently falling high end in a room, because the room's reverberant field rolls off at high frequencies (air and surface absorption rise with frequency, see distance and air absorption) and the speaker's directivity narrows, sending less HF energy into the reverberant field. Crucially, listeners prefer this gentle downtilt — a flat in-room measurement sounds too bright. Toole's research at Harman established that the perceptually neutral in-room target is a smooth curve falling roughly 0.5–1 dB per octave from the low-mids to the top.

Common target curves

Target curve	Domain	Shape	Typical tilt
Anechoic flat	Speaker design / on-axis	Flat free-field	0 dB
Harman / "preferred" room curve	Music monitoring, hi-fi	Flat to ~200 Hz, gentle downtilt above	~ −0.7 dB/oct
X-curve (SMPTE ST 202)	Cinema dubbing/playback	Flat to 2 kHz, then steep rolloff	−3 dB/oct above 2 kHz (large rooms)
Small-room / near-field music	Project studios	Slight LF lift, mild HF downtilt	~ −0.5 to −1 dB/oct
Dolby Atmos music (near-field)	Immersive music mix	Flat-ish, gentle tilt, matched channels	~ −0.5 dB/oct
Bass-managed LFE	Sub channel	+10 dB in-band gain re: mains	n/a

The X-curve deserves special mention. Codified in SMPTE ST 202, it specifies the electroacoustic response of a dubbing stage or cinema measured with pink noise: flat from roughly 63 Hz to 2 kHz, then rolling off at about 3 dB/octave above 2 kHz in a large room (gentler, ~1.5 dB/oct, in small rooms). The X-curve was designed so that mixes translate between rooms of similar reverberant character; it is steeper than a music room curve because large cinemas have long, HF-absorbing reverberant fields. The X-curve is increasingly debated — its measurement is steady-state spatially-averaged pink noise, which over-weights the reverberant field — but it remains the reference for theatrical alignment.

Setting a target for spatial audio

For an immersive music or post room, a sensible target is: flat (±2 dB) from the modal region up to ~300 Hz, then a smooth downtilt reaching about −3 to −5 dB at 16–20 kHz, with the same target applied to every channel. The absolute tilt matters less than its consistency across speakers — which is the spatial-audio theme we return to below.

Choose the target before you measure

Decide on the target curve and the averaging method first, then measure against it. Reverse-engineering a target from one measurement is how you end up "correcting" the room's interference pattern instead of the loudspeaker. Document the target so the whole array is aligned to the same reference.

FIR vs IIR EQ: how the correction is realised

Once you know what to change, you implement it with filters. There are two fundamental families.

IIR (parametric and graphic) EQ

Infinite Impulse Response filters are recursive: the output depends on past outputs. A parametric EQ band (a biquad) is defined by centre frequency $f_0$ , gain $G$ , and quality factor $Q$ , where $Q$ relates to bandwidth in octaves $BW$ by:

Q = \frac{\sqrt{2^{BW}}}{2^{BW} - 1}

A one-octave band ( $BW = 1$ ) has $Q \approx 1.41$ ; a one-third-octave band has $Q \approx 4.3$ . IIR filters are cheap (a few multiply-accumulates per sample), introduce negligible latency (a sample or two), and are inherently minimum-phase — making them the correct tool for minimum-phase problems, where they fix magnitude and phase together. Their downside: they cannot impose an arbitrary phase response independent of magnitude, and very high- $Q$ filters ring.

A graphic EQ is a bank of fixed-frequency IIR bands (typically 31 one-third-octave bands). It is convenient but coarse, and adjacent bands interact, so it is poorly suited to surgical modal work.

FIR (convolution) correction

Finite Impulse Response filters convolve the signal with a stored impulse response of length $N$ taps. Because the coefficients are arbitrary, an FIR filter can realise any magnitude response and, independently, any phase response — including the linear-phase correction that fixes magnitude while leaving phase untouched, or a mixed-phase correction that attempts to invert non-minimum-phase room behaviour over a small region.

The price is latency and frequency resolution. A linear-phase FIR introduces a constant delay of half its length:

t_{\text{lat}} = \frac{N/2}{f_s}

and its frequency resolution (the narrowest feature it can shape) is:

\Delta f = \frac{f_s}{N}

Worked example: FIR resolution vs latency

To correct a modal peak at 40 Hz you need resolution finer than, say, 5 Hz. At $f_s = 48\ \text{kHz}$ :

N = \frac{f_s}{\Delta f} = \frac{48000}{5} = 9600\ \text{taps}

A linear-phase filter of 9600 taps adds:

t_{\text{lat}} = \frac{4800}{48000} = 0.1\ \text{s} = 100\ \text{ms}

That 100 ms is fine for music playback but ruinous for live monitoring, broadcast, or any system that must stay in lip-sync — see the latency budget discussion in Networking & Integration. This is the central FIR trade-off: low-frequency resolution costs latency. Many practical systems use a hybrid approach — minimum-phase FIR or IIR at low frequencies (no added latency penalty from linearity), reserving linear-phase behaviour for a band where it matters.

Property	IIR (parametric)	FIR linear-phase	FIR minimum-phase
Latency	~0 (a few samples)	High ( $N/2$ taps)	~0
Phase control	Tied to magnitude	Arbitrary / flat	Tied to magnitude
LF resolution	Excellent, cheap	Costly (long filter)	Costly (long filter)
Can fix non-min-phase?	No	Partially, position-bound	No
CPU cost	Very low	Moderate–high	Moderate–high
Pre-ring risk	None	Yes (symmetric ringing)	None

Linear-phase pre-ringing

A linear-phase FIR that cuts a sharp peak produces symmetric ringing — energy before the main impulse (pre-ring). Pre-ring is audible and unnatural (nothing in nature precedes its own transient), particularly on percussive material. For corrective EQ, mixed- or minimum-phase FIR usually sounds more natural than aggressive linear-phase. Reserve linear-phase for gentle, broad tonal shaping.

Per-speaker vs global EQ

In a spatial system you face a choice: correct each loudspeaker individually, or apply one correction to the whole array. The right answer is both, in order.

Correct each speaker to a common target first

Every speaker sits in a different acoustic situation — a corner, a wall, a ceiling mount, behind a screen. Even identical loudspeaker models will measure differently because of boundary loading (a speaker near a corner gains low-frequency output from boundary reinforcement, up to +6 dB per adjacent boundary at low frequencies). The first step is to bring each speaker individually to the same target curve and the same sensitivity, so that a 1 kHz tone panned from front-left to front-right does not change level or timbre.

The boundary effect deserves a number. A speaker against a single large boundary sees a low-frequency lift; near a trihedral corner (three boundaries) the theoretical low-frequency gain approaches:

\Delta L = 20\log_{10}(n_{\text{boundaries}} \text{ image sources})

In practice a corner-loaded sub can be 9–12 dB hotter in the bottom octaves than a free-standing one. Per-speaker EQ tames this so the array matches.

Then a light global pass

After individual matching, a gentle global correction can address an overall tonal tilt shared by the whole system (e.g. a room that is uniformly a little dark). Keep it broad and shallow. The reason to do global after per-speaker is that any per-speaker idiosyncrasy left uncorrected will not be fixed by a global filter — it will only smear the array's coherence.

The matching criterion

The goal is not that each speaker measures identically at one mic (impossible — each has different reflections) but that their direct sound and early response match. This is why per-speaker EQ should be derived from a window that emphasises the first arrival (see windowing in Measurement & Calibration), not the full steady-state curve dominated by each speaker's particular room interaction.

The limits of room correction

Room correction is powerful below the Schroeder frequency and for tonal matching; it is weak or counterproductive elsewhere. Knowing the boundary keeps you from chasing impossibilities.

The single-point trap

A measurement at one microphone position captures that point's unique interference pattern — its private constellation of peaks and nulls. Optimising a filter to flatten that curve guarantees a great measurement at exactly one point in space and, frequently, a worse result everywhere else. Above $f_S$ , where the response varies by many dB over a head-width, correcting to a single point is not just useless but harmful.

Spatial averaging is mandatory

The cure is spatial averaging: measure at several positions spanning the listening area (a grid of 5–9 points, or a moving-mic average), then derive the correction from the average (typically an RMS or power average of the magnitude responses). The average reveals what is common to all positions — the genuine, correctable, speaker-and-tonal problems — and washes out the position-specific interference that should not be EQ'd. We expand the method in the procedure below.

Why absorption beats EQ

Two problems lie permanently outside EQ's reach:

Reflections / comb filtering — non-minimum-phase, position-dependent. The fix is a broadband absorber or a redirecting diffuser at the reflection point, which removes the offending energy rather than the impossible task of cancelling it electronically. A 100 mm porous absorber with an air gap gives meaningful absorption down to a few hundred Hz; that physically reduces the early reflection that causes the comb.
Excessive decay time — the room rings. EQ changes steady-state magnitude but not the rate of energy decay. Only absorption shortens $RT_{60}$ . If transients smear and the direct-to-diffuse ratio is poor, no filter will help; you need treatment.

Even in the modal region, bass trapping (membrane and pressure-based absorbers, porous traps in pressure-maximum corners) outperforms EQ because it lowers the resonance $Q$ — shortening the modal decay and flattening the peak — whereas EQ only lowers the peak height while the mode still rings for the same duration.

EQ flattens magnitude, treatment shortens time

A modal peak has both a height (magnitude) and a ringing time (decay). A cut filter lowers the height but the mode still decays at the same slow rate; a bass trap lowers both. For low frequencies, the ideal is treatment first (to reduce $Q$ and decay), then a small EQ cut on what remains. Multi-subwoofer placement (next chapter) is the other powerful, non-EQ modal tool.

Multi-sub optimization as a "spatial EQ"

The Welti & Devantier research at Harman showed that placing multiple subwoofers and optimising their relative levels, delays, and positions can flatten the low-frequency response across many seats simultaneously — something single-point EQ can never do, because it manipulates the modal excitation pattern in space rather than the signal spectrum at a point. This is covered in Subwoofers & Bass Management; for now, note that the most effective low-frequency "correction" is often spatial (where and how many sources) rather than spectral (filtering).

Spatial-audio-specific EQ: matching timbre across the array

Everything above applies to stereo too. What makes spatial audio special is that sources move across speakers, and any timbral mismatch between speakers becomes an audible artifact of motion.

Why mismatch is audible in motion

In amplitude panning, a phantom source is created by feeding correlated signal to two (or more) speakers with level differences. If speaker A has a 3 dB bump at 4 kHz and speaker B has a 3 dB dip there, then as a source pans from A to B, its spectrum at 4 kHz changes — the image audibly brightens then dulls as it moves. The listener perceives this as the object changing material or timbre mid-flight, which destroys the illusion of a single coherent moving source. Worse, the phantom centre between two mismatched speakers has a comb-filtered timbre that neither speaker exhibits alone, because the summed response of two non-identical signals creates interference.

The matching requirement

For convincing pans, every speaker in the array must share:

the same on-axis / direct-sound timbre (so the object's colour is constant), and
ideally similar directivity (so the reflected energy, which also contributes to timbre, is consistent).

Directivity matching is why immersive systems prefer identical loudspeaker models throughout the array, or at least a matched family. You cannot fully EQ a wide-dispersion speaker to sound like a narrow-dispersion one, because EQ corrects the on-axis response but not the off-axis energy that fills the reverberant field. Two speakers can be EQ'd to identical on-axis curves and still sound different because their power responses differ.

Worked example: panning level consistency

Suppose front-left and front-right are matched within ±0.5 dB across the band, but a height speaker is 2 dB hot at 6–10 kHz due to a close ceiling reflection. An object panning from ear-level to overhead will gain presence as it rises — an unintended cue. Calibrating the height speaker's direct-sound response to match the bed speakers (gating out the ceiling reflection in the measurement, then applying a −2 dB shelf or band cut at 6–10 kHz) removes the artifact. The pan now changes only in elevation, as intended, with constant timbre — exactly the behaviour the object-based renderer assumes.

Level calibration ties in

Timbre matching is incomplete without level matching: each speaker should produce the same reference SPL (commonly 79 dBC for film at −20 dBFS pink noise per channel, or 85 dBC for the older standard) at the reference position. Level and timbre matching together are what make a panned object behave predictably. This calibration step is detailed in Measurement & Calibration.

Match the array, then correct the room

Sequence: (1) level-calibrate every channel to the same reference SPL; (2) EQ each speaker's direct sound to a common target so timbre matches; (3) apply gentle spatially-averaged room correction; (4) verify with a pan that the object holds constant timbre as it moves. Skipping step 2 means your panning laws are lying.

Step-by-step EQ procedure with measurement

Here is a complete, defensible workflow. It assumes geometry, polarity, and time alignment are already correct (those come first; see the preceding chapters).

1. Define the target and method

Choose the target curve (e.g. flat to 300 Hz, −0.7 dB/oct downtilt above) and the spatial-averaging scheme (e.g. a 7-point grid: reference seat plus six surrounding it within ±0.5 m). Fix the measurement gear: a calibrated measurement microphone (with its calibration file loaded), a known sound card, and software (REW or Smaart). Set the SPL so the system runs at realistic levels.

2. Measure each speaker, anechoically-windowed and steady-state

For each loudspeaker, capture a swept-sine impulse response at every grid position. From the impulse response derive two views:

A gated / windowed response (e.g. a 4–7 ms window) that isolates the direct sound for the mid/high band — minimum-phase, EQ-able, and the basis for timbre matching.
The full / steady-state response for the low band, where windowing would throw away the modal information you actually need.

Splice them around 200–300 Hz to form a full-range "psychoacoustically weighted" curve.

3. Spatially average

Compute the RMS (power) average of the magnitude responses across the grid. Inspect the variance between positions too: where positions disagree wildly (deep, position-dependent nulls), mark those regions as off-limits for EQ.

4. Per-speaker correction to target

Working on the averaged curve for each speaker:

Identify broad peaks (wider than ~1/6 octave) that are consistent across positions. Apply cuts with parametric IIR bands. A peak of +6 dB at 63 Hz with $Q \approx 3$ gets a −6 dB band at 63 Hz, $Q = 3$ .
Address the overall tonal tilt with one or two low- $Q$ shelves to land on the target downtilt.
Do not boost dips. Leave deep narrow nulls untouched.
Use FIR only where you need linear-phase tonal shaping or sub-5 Hz LF resolution, and accept the latency.

5. Worked correction example

Averaged front-left measurement shows: +7 dB modal peak at 42 Hz ( $Q \approx 4$ ), a smooth +3 dB region centred 120 Hz ( $Q \approx 1$ ), a −9 dB null at 95 Hz (narrow, varies across grid), and a 2 dB excess above 8 kHz.

Apply: −7 dB / 42 Hz / $Q=4$ ; −3 dB / 120 Hz / $Q=1$ ; nothing at 95 Hz (it is a non-minimum-phase null — verify with excess-phase, then treat with a bass trap or sub repositioning instead); −2 dB high shelf at 8 kHz. Re-measure; the modal peak should drop into target, and the null — correctly — remains, because it is a physical interference you must solve acoustically.

6. Match all speakers

Apply the same target to every channel, then verify pairwise: the front-left and front-right direct-sound curves should overlay within ±1 dB; each surround and height likewise. Trim level so each channel hits the reference SPL with the calibration noise.

7. Light global pass and validation

Add a gentle global filter only for a shared tilt. Validate by:

Re-measuring the spatial average — it should sit within your tolerance window (e.g. ±3 dB) around the target across the modal-and-up range.
Listening to a moving pink-noise object across the array — it must hold constant timbre.
Listening to reference programme material you know well.

8. Document and lock

Save the filter set, the measurement set, the target, and the SPL calibration. Note the date and gear. Room correction drifts as treatment, furniture, or speakers change, so the documentation is your baseline for the next verification.

Common mistakes and pitfalls

EQ-ing a modal null (boosting a dip). The cardinal sin. Wastes headroom, can damage drivers, and does not fill the null. Cut peaks, never boost narrow nulls.
Single-microphone correction. Optimises one point and ruins the rest, especially above $f_S$ . Always spatially average.
Correcting the steady-state curve at high frequencies. Above $f_S$ the steady-state response is dominated by position-specific reflections. Use windowed/gated data and gentle, broad correction up there.
Over-correction (chasing flat). Aggressive high- $Q$ cuts and an enforced ruler-flat target sound sterile and ring. Use the fewest, broadest filters that achieve the target. A handful of well-chosen bands beats thirty fighting each other.
EQ before time alignment and polarity. A polarity error or misaligned driver creates a magnitude dip at the crossover that looks like an EQ problem but is a timing problem. Fix timing first (see Time Alignment & Phase), or you will "correct" a cancellation with a boost and make it worse.
Linear-phase pre-ring on transients. Audible, unnatural. Prefer minimum-/mixed-phase for corrective work.
Ignoring directivity / power response. EQ-ing two different speakers to the same on-axis curve does not make them sound the same; their off-axis energy still differs. Use matched speakers in the array.
Treating absorption problems with EQ. Long decay and strong early reflections are acoustic; no filter shortens a decay. Treat the room.
Trusting automatic room-correction blindly. Many auto systems boost nulls, over-correct above $f_S$ , and ignore the minimum-phase test. Inspect and override their decisions.

Limits

EQ cannot move energy in space. It changes the spectrum at a point, not the field distribution. Spatial problems need spatial solutions (placement, multi-sub, treatment).
EQ cannot change decay time. Only absorption does. Steady-state magnitude and time-domain ringing are different quantities.
Non-minimum-phase errors are uninvertible (globally). A reflection null can be partially flattened only at one position with a non-causal mixed-phase FIR, at the cost of error elsewhere. There is no global cure but treatment.
Correction is valid only within the averaged area. Outside the measured zone, all bets are off — particularly above $f_S$ .
Above the Schroeder frequency, broadband EQ is tonal only. It can set a target tilt; it cannot and should not flatten individual reflection ripples.
Directivity is not EQ-able. Power response and off-axis behaviour are properties of the transducer and cabinet, not the signal.
Correction drifts. Any change to the room, speakers, or listening geometry invalidates the filters. Re-measure after changes.

The throughline is the recurring theme of Part V: calibration exists to make the physical system deliver the cues the earlier parts assume — coherent arrival for precedence and summing localization (recall psychoacoustics), a controlled direct-to-reverberant balance, and an even, matched response so that panned objects keep their identity as they move. EQ is one instrument in that effort, powerful within its domain and useless — even harmful — outside it. Knowing the boundary between the correctable and the uncorrectable is the whole craft.

References

Toole, F. E. Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms, 3rd ed. Routledge/Focal Press, 2017. (In-room target curves, the spatial-averaging argument, minimum-phase limits, why flat-in-room sounds bright.)
Welti, T. and Devantier, A. "Low-Frequency Optimization Using Multiple Subwoofers." Journal of the Audio Engineering Society, vol. 54, no. 5, 2006. (Spatial low-frequency optimization across seats.)
Davis, D. and Patronis, E. Sound System Engineering, 4th ed. Focal Press, 2013. (Modal analysis, alignment, measurement practice.)
Ahnert, W. and Steffen, F. Sound Reinforcement Engineering: Fundamentals and Practice. E & FN Spon, 1999. (Room acoustics, modal density, Schroeder frequency.)
ISO 3382-1:2009, Acoustics — Measurement of room acoustic parameters — Part 1: Performance spaces. (Reverberation and spatial-average measurement methodology.)
SMPTE ST 202:2010, Motion-Pictures — Dubbing Theaters, Review Rooms and Indoor Theaters — B-Chain Electroacoustic Response (X-curve). (Cinema target curve.)
Schroeder, M. R. "The Statistical Parameters of the Frequency Response Curves of Large Rooms." Journal of the Audio Engineering Society, vol. 35, 1987. (Origin of the Schroeder frequency.)
Genelec, Trinnov, and Dirac Live technical documentation; REW (Room EQ Wizard) Help and Smaart user guides. (Practical correction filters, FIR/IIR implementation, measurement and windowing procedure.)

← Back to Systems & Calibration

The goal: a neutral, even response without fighting physics​

What EQ can fix​

What EQ cannot fix​

Room modes and the modal region​

Standing waves and mode types​

Worked example: axial modes of a control room​

The Schroeder frequency​

Minimum-phase vs non-minimum-phase: what is even correctable​

Why modal nulls resist EQ​

The practical test​

Target curves: what "correct" looks like​

Why not flat in-room?​

Common target curves​

Setting a target for spatial audio​

FIR vs IIR EQ: how the correction is realised​

IIR (parametric and graphic) EQ​

FIR (convolution) correction​

Worked example: FIR resolution vs latency​

Per-speaker vs global EQ​

Correct each speaker to a common target first​

Then a light global pass​

The matching criterion​

The limits of room correction​

The single-point trap​

Spatial averaging is mandatory​

Why absorption beats EQ​

Multi-sub optimization as a "spatial EQ"​

Spatial-audio-specific EQ: matching timbre across the array​

Why mismatch is audible in motion​

The matching requirement​

Worked example: panning level consistency​

Level calibration ties in​

Step-by-step EQ procedure with measurement​

1. Define the target and method​

2. Measure each speaker, anechoically-windowed and steady-state​

3. Spatially average​

4. Per-speaker correction to target​

5. Worked correction example​

6. Match all speakers​

7. Light global pass and validation​

8. Document and lock​

Common mistakes and pitfalls​

Limits​

References​