Neural codecs and compression algorithms

Neural bandwidth reduction

April 23, 2020 — February 26, 2025

This is not about compressing neural networks themselves, but about using them to compress other things: using neural nets to reconstruct signals (images, audio, video) with low error from a summary that is small in bits, especially alongside existing noisy data-transmission pipelines. Maybe we could even try both at once and think in terms of minimum description length, where the size of the model counts toward the length of the code. That might be interesting.

Descript Audio Codec / descriptinc/descript-audio-codec: “State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.” (Kumar et al. 2023)
Image Compression with Neural Networks – Google AI Blog
mlomnitz/DiffJPEG
rshin/differentiable-jpeg: Code for “JPEG-resistant Adversarial Images”
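
To get a feel for what the "90x compression factor" quoted above would mean in bitrate terms, here is some back-of-envelope arithmetic. The baseline is an assumption on my part (uncompressed 16-bit PCM, mono, 44.1kHz); the quote does not say what the factor is measured against, and the function names are made up for illustration.

```python
def pcm_bitrate_bps(sample_rate_hz, bit_depth=16, channels=1):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def compressed_bitrate_bps(raw_bps, compression_factor):
    """Bitrate after dividing the raw bitrate by a compression factor."""
    return raw_bps / compression_factor


# Assumed baseline: 16-bit mono PCM at 44.1 kHz (not stated in the quote).
raw = pcm_bitrate_bps(44_100)
coded = compressed_bitrate_bps(raw, 90)
print(f"raw: {raw / 1000:.1f} kbps, coded: {coded / 1000:.2f} kbps")
# prints "raw: 705.6 kbps, coded: 7.84 kbps"
```

Under that assumed baseline, a 90x factor lands somewhere around 8 kbps; a stereo or higher bit-depth baseline would change the figure proportionally.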

1 With vector embeddings

Thanks to the success of vector embeddings, we can frequently represent data in a weird vector space which looks suggestively like a classic encoding. In NLP, though, the embeddings are usually larger than the original text, so can we actually compress this way?
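
A quick sanity check on the "embeddings are larger than the text" claim, as a sketch. The embedding dimension (768) and the float32 storage are illustrative assumptions, not properties of any particular model, and the helper names are hypothetical.

```python
def text_bits(text):
    """Size of the raw text in bits, assuming UTF-8 encoding."""
    return 8 * len(text.encode("utf-8"))


def embedding_bits(dim, bits_per_component=32):
    """Size of a dense embedding stored at a fixed precision per component."""
    return dim * bits_per_component


sentence = "Neural nets can compress images, audio and video."

# Assumed: one 768-dimensional float32 vector for the whole sentence.
raw = text_bits(sentence)
emb = embedding_bits(768)
print(f"text: {raw} bits, embedding: {emb} bits, ratio: {emb / raw:.1f}x")
```

Even a single sentence-level vector at this size dwarfs a short sentence, and per-token embeddings would be far worse, so any compression has to come from quantizing the vectors or from using very few of them.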

Maybe. There has been a bunch of work in that domain recently (Duggal et al. 2024; Miwa et al. 2025; Yan et al. 2025). The most viral one is Bachmann et al. (2025), which has an elegant demonstration.

FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
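
If an image is represented as a variable-length sequence of discrete tokens, the bit budget is easy to estimate: each token costs at most log2(codebook size) bits, so the sequence length acts as a rate knob. The sketch below does that arithmetic. The codebook size (4096) and image size (256×256) are placeholder values of mine, not figures taken from any of the papers cited here, and it ignores entropy coding, which could only lower the count.

```python
import math


def token_bits(n_tokens, codebook_size):
    """Upper bound on bits for n_tokens indices into a codebook (fixed-length coding)."""
    return n_tokens * math.log2(codebook_size)


def bits_per_pixel(n_tokens, codebook_size, width, height):
    """Token bit budget spread over the pixels of the image."""
    return token_bits(n_tokens, codebook_size) / (width * height)


# Hypothetical settings: 4096-entry codebook, 256x256 image.
for n in (1, 16, 256):
    bpp = bits_per_pixel(n, 4096, 256, 256)
    print(f"{n:>3} tokens -> {token_bits(n, 4096):.0f} bits, {bpp:.6f} bpp")
```

Under these assumptions even the longest sequence is a few thousand bits, well below typical JPEG rates, which is part of why these tokenizers look like codecs if you squint.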

2 References

Ananthabhotla, Ewert, and Paradiso. 2019. “Towards a Perceptual Loss: Using a Neural Network Codec Approximation as a Loss for Generative Audio Models.” In Proceedings of the 27th ACM International Conference on Multimedia.

Bachmann, Allardice, Mizrahi, et al. 2025. “FlexTok: Resampling Images into 1D Token Sequences of Flexible Length.”

Collobert, Hannun, and Synnaeve. 2019. “A Fully Differentiable Beam Search Decoder.” In Proceedings of the 36th International Conference on Machine Learning.

Duggal, Isola, Torralba, et al. 2024. “Adaptive Length Image Tokenization via Recurrent Allocation.”

Eusebio, Ascenso, and Pereira. 2021. “Optimizing an Image Coding Framework with Deep Learning-Based Pre- and Post-Processing.” In 2020 28th European Signal Processing Conference (EUSIPCO).

Guleryuz, Chou, Hoppe, et al. 2021. “Sandwiched Image Compression: Wrapping Neural Networks Around A Standard Codec.” In 2021 IEEE International Conference on Image Processing (ICIP).

Klopp, Liu, Chen, et al. 2021. “How to Exploit the Transferability of Learned Image Compression to Conventional Codecs.” In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Kumar, Seetharaman, Luebs, et al. 2023. “High-Fidelity Audio Compression with Improved RVQGAN.”

Miwa, Sasaki, Arai, et al. 2025. “One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression.”

Qiu, Yu, and Li. 2021. “Codec-Simulation Network for Joint Optimization of Video Coding With Pre- and Post-Processing.” IEEE Open Journal of Circuits and Systems.

Shin and Song. 2017. “JPEG-Resistant Adversarial Images.” In NIPS 2017 Workshop on Machine Learning and Computer Security.

Shwartz-Ziv and LeCun. 2023. “To Compress or Not to Compress - Self-Supervised Learning and Information Theory: A Review.”

Toderici, O’Malley, Hwang, et al. 2016. “Variable Rate Image Compression with Recurrent Neural Networks.”

Xu and Raginsky. 2017. “Information-Theoretic Analysis of Generalization Capability of Learning Algorithms.” In Advances in Neural Information Processing Systems.

Yang, Mandt, and Theis. 2023. “An Introduction to Neural Data Compression.”

Yan, Mnih, Faust, et al. 2025. “ElasticTok: Adaptive Tokenization for Image and Video.”