Object-Based Audio

Object-based audio treats each sound source as an "object" carrying its own position metadata, rather than as part of a signal pre-mixed for a specific speaker configuration.

Concept

Audio Object

An audio object consists of:

  • Audio signal (mono or stereo)
  • Position metadata (x, y, z)
  • Other attributes (size, behavior, snap)
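As an illustration of the list above, an audio object could be modeled as a small record that pairs the signal with its metadata. This is a minimal sketch: the class name `AudioObject` and its field names are hypothetical and do not follow the schema of any particular format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AudioObject:
    """Hypothetical container for one audio object (not a real format's schema)."""
    samples: List[float]                   # mono audio signal
    position: Tuple[float, float, float]   # (x, y, z) position metadata
    size: float = 0.0                      # 0 = point source, > 0 = extended source
    snap: bool = False                     # True = snap to the nearest speaker


# A dialogue object placed front and center, with default size and snap.
dialogue = AudioObject(samples=[0.0, 0.5, -0.5], position=(0.0, 1.0, 0.0))
```

The signal and the metadata travel together, so a renderer can decide later how to map the object onto whatever speakers are available.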

Renderer

The renderer converts objects to signals for a given speaker configuration:

Objects + Metadata ──► Renderer ──► Target Configuration (5.1, 7.1.4, etc.)
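The pipeline above can be sketched as a function that takes objects plus a target layout and returns one signal per speaker. The panner here is a deliberately naive cosine weighting chosen for brevity, not the algorithm of any commercial renderer. The layout table, the azimuth convention (0 = front, positive = right), and all function names are assumptions.

```python
import math

# Hypothetical layouts: speaker azimuths in degrees (0 = front, positive = right).
LAYOUTS = {
    "stereo": [-30.0, 30.0],
    "5.0": [-30.0, 30.0, 0.0, -110.0, 110.0],
}


def naive_gains(azimuth_deg, speaker_azimuths):
    """Cosine-weighted gains normalized to unit power (placeholder panner)."""
    raw = []
    for spk in speaker_azimuths:
        diff = math.radians(azimuth_deg - spk)
        raw.append(max(0.0, math.cos(diff)) ** 2)
    # If no speaker faces the source, all gains stay at zero.
    norm = math.sqrt(sum(g * g for g in raw)) or 1.0
    return [g / norm for g in raw]


def render(objects, layout_name):
    """objects: list of (samples, azimuth_deg). Returns one sample list per speaker."""
    speakers = LAYOUTS[layout_name]
    length = max(len(samples) for samples, _ in objects)
    out = [[0.0] * length for _ in speakers]
    for samples, azimuth in objects:
        gains = naive_gains(azimuth, speakers)
        for ch, gain in enumerate(gains):
            for i, sample in enumerate(samples):
                out[ch][i] += gain * sample
    return out


# The same two objects rendered to two different target configurations.
objects = [([1.0, 0.5], 0.0), ([0.2, 0.2], 90.0)]
stereo = render(objects, "stereo")
surround = render(objects, "5.0")
```

The object list is identical in both calls. Only the layout name changes.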

Advantages

Adaptability

The same content can be reproduced on different systems:

  • Cinema (Atmos dome)
  • Home theater (soundbar)
  • Headphones (binaural)

Future-Proofing

Content isn't tied to a specific playback technology. Renderers can improve over time without the content having to be remixed.

Interactivity

Options can be offered to users:

  • Choose dialogue language
  • Adjust relative levels
  • Personalize experience

Object-Based Formats

Format       | Max Objects              | Usage
Dolby Atmos  | 128 (cinema), 16 (home)  | Cinema, streaming, music
DTS:X        | 32                       | Cinema, home
MPEG-H       | Variable                 | Broadcast
Sony 360 RA  | 24                       | Music streaming

Bed vs Objects

Bed

Traditional channel-based content (e.g., 7.1) integrated into the mix:

  • Ambiences, music
  • Predictable behavior
  • Less resource-intensive

Objects

Individual sources with metadata:

  • Dialog, spot effects
  • Precise positioning
  • Adapts to any configuration

Typical Mix

7.1.2 Bed: Music, ambiences
+ Objects: Dialog, effects, moving elements
= Complete Atmos Mix
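Once the objects have been rendered to the same channel layout as the bed, the sum above is a per-channel, per-sample addition. The sketch below assumes that all channels have equal length and uses two channels instead of a full 7.1.2 layout to keep the example short. The function name is hypothetical.

```python
def mix_bed_and_objects(bed, rendered_objects):
    """Sum channel-aligned signals sample by sample.

    bed: list of channels, each a list of samples.
    rendered_objects: list of renders, each using the same channel layout as the bed.
    """
    out = [list(channel) for channel in bed]  # copy so the bed is not modified
    for render in rendered_objects:
        for ch_index, channel in enumerate(render):
            for i, sample in enumerate(channel):
                out[ch_index][i] += sample
    return out


bed = [[0.1, 0.1], [0.2, 0.2]]         # two bed channels, two samples each
dialogue = [[0.5, 0.0], [0.5, 0.0]]    # an object already rendered to those channels
mix = mix_bed_and_objects(bed, [dialogue])
```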

Production Workflow

  1. Traditional mix for the bed
  2. Create objects for specific elements
  3. Automate positions
  4. Render and verify on different configs
  5. Quality-check (QC) the downmix
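For the last step, one simple check is to fold a 5.1 render down to stereo and look for clipping. The sketch below assumes the commonly used -3 dB coefficients for center and surrounds, drops the LFE channel, and assumes the channel order L, R, C, LFE, Ls, Rs. Real deliverables may specify different coefficients, so treat the values as placeholders.

```python
import math

ATTEN = 1 / math.sqrt(2)  # about -3 dB, assumed for center and surround channels


def downmix_frame(frame):
    """Fold one 5.1 frame (L, R, C, LFE, Ls, Rs) down to (Lo, Ro). LFE is dropped."""
    l, r, c, _lfe, ls, rs = frame
    return (l + ATTEN * c + ATTEN * ls, r + ATTEN * c + ATTEN * rs)


frames = [
    (0.5, 0.5, 0.5, 0.0, 0.2, 0.2),
    (0.1, -0.1, 0.0, 0.0, 0.0, 0.0),
]
stereo = [downmix_frame(frame) for frame in frames]

# QC: flag the downmix if any sample exceeds full scale.
peak = max(abs(sample) for pair in stereo for sample in pair)
clips = peak > 1.0
```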

Spacelite takes a hybrid approach:

  • Input: Stereo signal (simple channel-based)
  • Processing: HSR distributes signal like an "object" with configurable position
  • Output: Flexible configuration (1-16 channels per bus)

This makes it possible to turn existing stereo content into quasi-object-based content, where each bus can be configured for a different destination.

Object Metadata

Position

Typically normalized coordinates:

  • X: -1 (left) to +1 (right)
  • Y: -1 (back) to +1 (front)
  • Z: -1 (below) to +1 (above)
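Renderers often work with angles rather than Cartesian coordinates. The sketch below converts the normalized (x, y, z) values above into azimuth, elevation, and distance. The angle convention (azimuth 0 = front, positive = right) is an assumption chosen to match the axes listed above, and formats differ on this point.

```python
import math


def to_spherical(x, y, z):
    """Convert normalized (x, y, z) to (azimuth_deg, elevation_deg, distance).

    Assumed convention: +x = right, +y = front, +z = above;
    azimuth 0 = front, positive azimuth = toward the right.
    """
    azimuth = math.degrees(math.atan2(x, y))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    distance = math.sqrt(x * x + y * y + z * z)
    return azimuth, elevation, distance


front = to_spherical(0.0, 1.0, 0.0)        # straight ahead
front_right = to_spherical(1.0, 1.0, 0.0)  # 45 degrees to the right
overhead = to_spherical(0.0, 0.0, 1.0)     # directly above
```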

Size

Object spread/size:

  • Point source (0)
  • Extended source (>0)

Snap

Behavior at speaker boundaries:

  • Snap to nearest speaker
  • Smooth panning between speakers
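The snap option can be pictured as replacing the object's position with that of the closest speaker before rendering. This is a minimal sketch: the speaker coordinates and function names are hypothetical, and real renderers may apply additional rules such as a snap tolerance.

```python
import math

# Hypothetical speaker positions in the same normalized (x, y, z) space.
SPEAKERS = {
    "L": (-1.0, 1.0, 0.0),
    "C": (0.0, 1.0, 0.0),
    "R": (1.0, 1.0, 0.0),
}


def nearest_speaker(position, speakers):
    """Return the name of the speaker closest to position (Euclidean distance)."""
    return min(speakers, key=lambda name: math.dist(position, speakers[name]))


def apply_snap(position, speakers, snap):
    """With snap on, move the object onto the nearest speaker; otherwise leave it."""
    if snap:
        return speakers[nearest_speaker(position, speakers)]
    return position


snapped = apply_snap((0.3, 0.9, 0.0), SPEAKERS, snap=True)
panned = apply_snap((0.3, 0.9, 0.0), SPEAKERS, snap=False)
```

With snap on, the object plays from a single speaker. With snap off, the position is left for the panner to reproduce as a phantom image between speakers.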

Rendering Approaches

Point-Source Rendering

Objects are rendered as points using vector-base amplitude panning (VBAP) or distance-based amplitude panning (DBAP):

  • Precise localization
  • Efficient
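To make the VBAP idea concrete, here is a minimal two-dimensional sketch for a single speaker pair. It solves for the two gains whose weighted speaker directions add up to the source direction, then normalizes them to constant power. The function names are hypothetical, and a full implementation would also select the active pair (or triplet in 3D) for each source.

```python
import math


def unit(azimuth_deg):
    """Unit vector for an angle given in degrees."""
    a = math.radians(azimuth_deg)
    return (math.cos(a), math.sin(a))


def vbap_pair(source_deg, spk1_deg, spk2_deg):
    """Gains (g1, g2) for a source panned between two speakers (2D VBAP)."""
    p = unit(source_deg)
    l1, l2 = unit(spk1_deg), unit(spk2_deg)
    det = l1[0] * l2[1] - l1[1] * l2[0]
    if abs(det) < 1e-12:
        raise ValueError("speakers are collinear; gains are undefined")
    # Solve g1 * l1 + g2 * l2 = p with Cramer's rule.
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
    norm = math.hypot(g1, g2)  # normalize to unit power
    return g1 / norm, g2 / norm


centered = vbap_pair(0.0, 30.0, -30.0)    # source midway between the speakers
on_speaker = vbap_pair(30.0, 30.0, -30.0) # source exactly on the first speaker
```

A negative gain in the result means the source lies outside the arc spanned by the pair, which is the usual cue to pick a different pair.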

Extent Rendering

Objects with a nonzero size are rendered as several virtual point sources spread over the object's extent:

  • More natural for large sources
  • Higher CPU usage
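One way to picture extent rendering is to pan several virtual point sources spread across the object's width and then combine their power. The sketch below does this for a stereo pair with a constant-power pan law. Expressing size in degrees, the default of five virtual sources, and the function names are all assumptions made for illustration.

```python
import math


def pan_gains(azimuth_deg, half_width_deg=30.0):
    """Constant-power stereo pan. Returns (left, right).

    -half_width_deg = fully left, +half_width_deg = fully right.
    """
    clamped = max(-half_width_deg, min(half_width_deg, azimuth_deg))
    theta = (clamped + half_width_deg) / (2 * half_width_deg) * (math.pi / 2)
    return (math.cos(theta), math.sin(theta))


def extent_gains(center_deg, size_deg, n_virtual=5):
    """Spread an object over n_virtual point sources and average their power."""
    if size_deg <= 0 or n_virtual < 2:
        return pan_gains(center_deg)  # point source
    power = [0.0, 0.0]
    for k in range(n_virtual):
        azimuth = center_deg - size_deg / 2 + size_deg * k / (n_virtual - 1)
        left, right = pan_gains(azimuth)
        power[0] += left * left
        power[1] += right * right
    return tuple(math.sqrt(p / n_virtual) for p in power)


point = extent_gains(-30.0, 0.0)   # point source on the left speaker
wide = extent_gains(-30.0, 40.0)   # same center, spread over 40 degrees
```

The loop runs the panner once per virtual source, which is where the extra CPU cost comes from.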

Distance Rendering

Distance affects:

  • Level (inverse-square law: about 6 dB quieter per doubling of distance)
  • Reverb amount
  • Spectral content (high frequencies roll off with distance)
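The level component can be sketched directly. Under the inverse-square law, intensity falls with the square of the distance, so amplitude falls as 1/distance. The reference distance and the minimum-distance clamp below are assumptions, and reverb and spectral changes are not modeled here.

```python
import math


def distance_gain(distance, reference=1.0, min_distance=0.1):
    """Amplitude gain relative to the level at the reference distance.

    The clamp keeps the gain finite for sources very close to the listener.
    """
    d = max(distance, min_distance)
    return reference / d


def gain_db(gain):
    """Convert an amplitude gain to decibels."""
    return 20 * math.log10(gain)


at_reference = distance_gain(1.0)   # 1.0, i.e. 0 dB
doubled = distance_gain(2.0)        # 0.5, i.e. about -6 dB
doubled_db = gain_db(doubled)
```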

Comparison: Channel vs Object

Aspect           | Channel-Based    | Object-Based
Flexibility      | Fixed to format  | Any format
Production       | Simpler          | More complex
File size        | Fixed            | Variable
Metadata         | None             | Position, size, etc.
Personalization  | None             | Possible
Legacy content   | Native           | Requires conversion

Object-Based Tools

Production

  • Dolby Atmos Production Suite
  • DTS:X Creator Suite
  • Nuendo with Atmos integration
  • Logic Pro with Atmos

Monitoring

  • Dolby Atmos Renderer
  • DTS:X Encoder Suite
  • Apple Spatial Audio