Meta's AI-Powered Audio Codec EnCodec Promises 10x Better Compression Than MP3


November 20, 2024 at 04:49 AM

Meta has developed 'EnCodec', an AI-powered audio codec that the company says compresses audio roughly 10x more than MP3 at 64 kbps without a loss of quality.


EnCodec uses a three-part system to compress audio:

An encoder that converts uncompressed audio into a lower frame rate representation
A quantizer that compresses the data while preserving essential information
A decoder that uses neural networks to reconstruct the audio in real time
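To make the data flow through the three stages concrete, here is a minimal toy sketch in Python. It is not Meta's implementation: block averaging stands in for the neural encoder, a uniform scalar quantizer stands in for the learned quantizer, and sample-and-hold stands in for the neural decoder. The function names and the `hop` and `levels` parameters are hypothetical and chosen only for illustration.

```python
import math

def encode(samples, hop):
    """Toy stand-in for the encoder: average each block of `hop` samples
    into one frame value, giving a lower frame rate than the input."""
    frames = []
    for i in range(0, len(samples), hop):
        block = samples[i:i + hop]
        frames.append(sum(block) / len(block))
    return frames

def quantize(frames, levels):
    """Toy stand-in for the quantizer: map each value in [-1, 1]
    to an integer code in the range 0 .. levels - 1."""
    step = 2.0 / (levels - 1)
    return [min(levels - 1, max(0, round((f + 1.0) / step))) for f in frames]

def decode(codes, hop, levels):
    """Toy stand-in for the decoder: convert codes back to values
    and hold each value for `hop` output samples."""
    step = 2.0 / (levels - 1)
    out = []
    for c in codes:
        out.extend([c * step - 1.0] * hop)
    return out

# 10 ms of a 440 Hz tone sampled at 48 kHz (480 samples)
sample_rate = 48_000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(480)]

codes = quantize(encode(signal, hop=4), levels=256)
restored = decode(codes, hop=4, levels=256)

print(len(signal), "samples ->", len(codes), "codes ->", len(restored), "samples")
```

Only the integer codes would need to be stored or transmitted. In the real system the encoder and decoder are trained networks, so the reconstruction is far better than this sample-and-hold approximation at comparable code counts.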


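The system is trained against discriminators in a cat-and-mouse game: the discriminators try to tell reconstructed audio from the original, and the codec is trained to fool them, which pushes the reconstruction to remain perceptually similar to the original. According to Meta, EnCodec is the first neural network-based compression system to handle 48 kHz stereo audio, a sampling rate slightly above the 44.1 kHz used for CDs.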
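For a rough sense of scale, the arithmetic below compares uncompressed bitrates at the two sampling rates with a bitrate one tenth that of MP3. Two inputs are assumptions rather than figures from this article: 16-bit samples for the uncompressed audio, and a 64 kbps MP3 baseline. The helper name `pcm_kbps` is hypothetical.

```python
def pcm_kbps(sample_rate_hz, channels, bits_per_sample=16):
    """Bitrate of uncompressed PCM audio in kilobits per second.
    The 16-bit default is an assumption, not a figure from the article."""
    return sample_rate_hz * channels * bits_per_sample / 1000

cd_kbps = pcm_kbps(44_100, 2)   # CD-rate stereo
hi_kbps = pcm_kbps(48_000, 2)   # 48 kHz stereo

mp3_kbps = 64                   # assumed MP3 baseline
target_kbps = mp3_kbps / 10     # one tenth of the MP3 bitrate
ratio = hi_kbps / target_kbps   # reduction relative to raw 48 kHz stereo

print(f"CD stereo PCM:     {cd_kbps:.1f} kbps")
print(f"48 kHz stereo PCM: {hi_kbps:.1f} kbps")
print(f"10x below MP3:     {target_kbps:.1f} kbps (about {ratio:.0f}x below raw PCM)")
```

Under these assumptions, the move from 44.1 kHz to 48 kHz raises the raw bitrate by under 10 percent, while a codec operating at a tenth of the MP3 rate would sit more than two orders of magnitude below the uncompressed signal.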

Primary applications include:

Improving voice call quality over poor network connections
Enabling high-quality audio in metaverse experiences
Delivering superior audio quality with minimal bandwidth requirements

While still in the research phase, EnCodec is a notable advance in audio compression that could improve digital audio delivery across platforms and network conditions.