HALAC ~ High Availability Lossless Audio Compression


HALAC aims for a reasonable compression ratio at high processing speed. Lossless compression ratios for audio data are usually limited, so I wanted a solution that works faster in exchange for a few percent of compression.

github.com/Hakan-Abbas/HALAC-High-Availability-Lossless-Audio-Compression
github.com/Hakan-Abbas/HALAC-Audio-Player
hydrogenaudio.org/index.php/topic,125248.0

Cavern ~ Object-based Audio Engine & CODEC


Cavern is a fully adaptive object-based audio rendering engine and (up)mixer without limitations for home, cinema, and stage use. Audio transcoding and self-calibration libraries built on the Cavern engine are also available. This repository also features a Unity plugin and a standalone converter called Cavernize.

Cavern goes beyond fixed-channel audio systems by rendering any number of audio “objects” in three-dimensional space, tailored to the listener’s speaker arrangement or headphone output. It is also supported by a standalone conversion tool, Cavernize, which allows users to convert spatial mixes into conventional channel-based PCM formats while maintaining positional accuracy.
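As a rough illustration of the object-based idea (this is not Cavern's renderer, whose algorithm is not described here), the sketch below maps one object position to per-speaker gains using inverse-distance weighting, scaled so that total power stays constant. The function name, the speaker layout, and the weighting rule are all assumptions made for the example.

```python
import math

def object_gains(obj_pos, speakers, rolloff=2.0):
    """Return one gain per speaker for a point source at obj_pos.

    Hypothetical inverse-distance panner: closer speakers get more level,
    and the gains are scaled so that their squares sum to 1.
    """
    weights = []
    for spk in speakers:
        dist = math.dist(obj_pos, spk)
        weights.append(1.0 / max(dist, 1e-6) ** rolloff)
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]

# Assumed square layout of four speakers, (x, y, z) in metres.
layout = [(-1, 1, 0), (1, 1, 0), (-1, -1, 0), (1, -1, 0)]
gains = object_gains((-0.8, 0.9, 0.0), layout)
```

Because the gains depend only on the object position and the layout passed in, the same object can be re-rendered for a different speaker arrangement without touching the source material.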

Key Features and Capabilities:

Object-Based Rendering
Cavern supports an unrestricted number of audio objects and output channels. This allows precise spatial placement and movement of sounds in 3D space, independent of specific channel layouts.

Codec and Container Support
The engine and its companion tools support a wide range of codecs and containers, including those commonly used for immersive audio delivery. Traditional formats such as WAV and common multimedia containers are also supported.

Calibration and Room Correction
Cavern includes tools for self-calibration and room equalization. These can flatten frequency response, compensate for acoustic irregularities, and help unify tonal characteristics across speakers.

Headphone Virtualization
Through HRTF-based processing, Cavern enables spatial rendering over stereo headphones. This simulates direction, distance, and spatial cues to reproduce the effect of multichannel speaker setups in a binaural listening environment.

Real-Time Up-Mixing
Legacy stereo or multichannel content can be up-mixed into fully rendered 3D scenes. This provides an immersive experience even when the source was not originally produced as object-based audio.

Integration with Game Engines
Cavern offers integration with Unity, enabling developers to incorporate real-time positional audio into games, simulations, and interactive media.


Use Cases

Home Cinema and Media Playback
Cavern can render object-based audio tracks for users who do not have commercial hardware processors. It allows accurate spatial playback through both speakers and headphones.

Headphone-Focused Listening
The binaural virtualization system benefits users who rely on headphones for movies, music, gaming, or general media consumption.

Game and VR Development
Developers can use Cavern inside Unity to produce dynamic, spatially accurate audio scenes in interactive applications.

Archiving and Conversion
Cavernize converts object-based audio into standard PCM or channel-based formats, preserving positional intent while enabling playback on conventional systems.

Speaker Optimization
Its calibration tools provide a software-based approach to room correction and multi-speaker alignment without requiring dedicated hardware processors.


Limitations and Considerations

  • Some supporting utilities are not fully open-source and may be distributed under separate licensing terms.
  • Spatial rendering benefits depend on input quality; poor-quality stereo sources will not yield true immersive results.
  • Speaker hardware, room acoustics, and HRTF compatibility affect the perceived accuracy of spatialization.
  • Integrating Cavern into custom software projects requires familiarity with its API and spatial-audio concepts.

Why Cavern Matters

Cavern stands out by making advanced spatial-audio technology accessible without requiring specialized hardware or proprietary processors. By combining open-source rendering, a flexible object-based architecture, codec support, calibration tools, and developer integration, it provides a versatile platform for enthusiasts, researchers, and media creators.

For users interested in experimenting with immersive audio workflows, whether for home cinema, headphone listening, archiving, or game development, Cavern offers a free, comprehensive and adaptable approach.


References:

  • VoidXH / Cavern – GitHub repository
  • Cavern documentation website
  • Cavern package listing on NuGet

cavern.sbence.hu/cavern
github.com/VoidXH/Cavern
github.com/VoidXH/HRTF
cavern.sbence.hu/cavern/doc
cavern.sbence.hu/cavern/downloads
www.nuget.org/packages/Cavern
en.wikipedia.org/wiki/Digital_room_correction#Cavern_QuickEQ

SAC ~ State-Of-The-Art Lossless Audio Compression


Sac is a state-of-the-art lossless audio compression model.

Lossless audio compression is a complex problem, because PCM data is highly non-stationary and uses high sample resolution (typically >= 16 bit). That’s why classic context modelling suffers from context dilution problems. Sac employs a simple OLS-NLMS predictor per frame, including bias correction. Prediction residuals are encoded using a sophisticated bitplane coder including SSE and various forms of probability estimation. Meta-parameters of the predictor are optimized with DDS on a per-frame basis. This results in a highly asymmetric codec design.
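To make the predictor part more concrete, here is a minimal sketch of a single NLMS stage producing integer residuals, with a matching decoder. It is not Sac's code: the OLS stage, bias correction, bitplane coder, and DDS tuning are all left out, and the order, step size, and function name are assumptions.

```python
import math

def nlms_residuals(samples, order=8, mu=0.5, decode=False):
    """Run an NLMS predictor over integer samples.

    decode=False: `samples` are PCM values, residuals are returned.
    decode=True:  `samples` are residuals, PCM values are returned.
    Both directions update the weights identically, so the round trip is
    exact (on the same machine, since float arithmetic is used).
    """
    weights = [0.0] * order
    history = [0.0] * order              # most recent sample first
    out = []
    for value in samples:
        pred = round(sum(w * h for w, h in zip(weights, history)))
        if decode:
            sample, err = value + pred, value
        else:
            sample, err = value, value - pred
        out.append(sample if decode else err)
        # Normalised LMS update: step size divided by the input energy.
        energy = sum(h * h for h in history) + 1.0
        step = mu * err / energy
        weights = [w + step * h for w, h in zip(weights, history)]
        history = [float(sample)] + history[:-1]
    return out

# Test signal: a 16-bit-range sine wave.
signal = [round(10000 * math.sin(2 * math.pi * k / 50)) for k in range(2000)]
residuals = nlms_residuals(signal)
restored = nlms_residuals(residuals, decode=True)
```

The decoder has to repeat the same adaptation as the encoder, which is one reason adaptive-predictor codecs decode more slowly than codecs that store fixed per-frame coefficients.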

Technical features:

  • Input: wav file with 1-16 bit sample size, mono/stereo, pcm
  • Output: sac file including all input metadata
  • Decoded wav file is bit for bit identical to input wav file
  • MD5 of raw pcm values
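The idea behind hashing the raw PCM values rather than the whole file can be sketched with Python's standard wave and hashlib modules. How Sac stores or computes its checksum internally is not stated here, so the helper names and the choice of what gets hashed are assumptions.

```python
import hashlib
import io
import struct
import wave

def make_wav(samples, rate=44100):
    """Build a small 16-bit mono WAV in memory (test data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()

def pcm_md5(wav_bytes):
    """MD5 over the PCM frames only, ignoring the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.readframes(wav.getnframes())
    return hashlib.md5(frames).hexdigest()

pcm = [0, 1000, -1000, 32767, -32768]
original = make_wav(pcm)
other_header = make_wav(pcm, rate=48000)      # same samples, different header
altered = make_wav(pcm[:-1] + [-32767])       # one sample changed by 1
```

Here pcm_md5(original) equals pcm_md5(other_header) but differs from pcm_md5(altered), so the hash identifies the audio content independently of container details.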

github.com/slmdev/sac

FSLAC ~ Free Semi-Lossless Audio Codec


FSLAC is a constrained VBR (CVBR) version of the publicly available open-source lossless audio coder FLAC.

FLAC, being a mathematically lossless audio codec, inevitably creates VBR streams as compressed files. Depending on the «difficulty» of coding each segment of the audio signal, the instantaneous coding bit-rate can be quite high. However, one can observe that, during passages of high FLAC bit-rate, the coded audio also exhibits the greatest ability of psychoacoustic masking. FSLAC exploits this property to limit the maximum instantaneous bit-rate of the compressed file. It does so by detecting the difficult audio blocks (by measuring their predictability via linear-prediction error energy calculations) and requantizing each of the detected blocks to a lower bit-depth, thereby reducing the bit-rate needed for lossless coding of that block. To prevent the quantization error from becoming audible (or visible in a spectrogram), simple adaptive noise shaping is used.
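The detect-and-requantize step described above can be sketched as follows. This is an illustration only, not FSLAC's implementation: it uses a fixed second-order predictor instead of a fitted one, omits noise shaping entirely, assumes 16-bit input, and the threshold, block size, and function names are invented for the example.

```python
import math
import random

def prediction_error_energy(block):
    """Mean squared error of a fixed second-order linear predictor."""
    errs = [block[n] - 2 * block[n - 1] + block[n - 2]
            for n in range(2, len(block))]
    return sum(e * e for e in errs) / max(len(errs), 1)

def requantize(block, drop_bits, max_value=32767):
    """Round samples to multiples of 2**drop_bits (drop_bits >= 1, no noise shaping)."""
    half = 1 << (drop_bits - 1)
    limit = (max_value >> drop_bits) << drop_bits   # stay inside the 16-bit range
    return [min(((s + half) >> drop_bits) << drop_bits, limit) for s in block]

def limit_difficult_blocks(samples, block_size=1024, threshold=1e5, drop_bits=4):
    """Requantize only the blocks that the predictor finds hard to predict."""
    out, touched = [], []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        hard = prediction_error_energy(block) > threshold
        out.extend(requantize(block, drop_bits) if hard else block)
        touched.append(hard)
    return out, touched

random.seed(1)
smooth = [round(1000 * math.sin(2 * math.pi * n / 512)) for n in range(1024)]
noisy = [random.randint(-20000, 20000) for _ in range(1024)]
processed, touched = limit_difficult_blocks(smooth + noisy)
```

The predictable block passes through untouched, while the noise-like block loses its four least significant bits, which a lossless coder can then skip.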

This approach is similar to the one used by LossyWAV, but differs in two important aspects. First, FSLAC is not a stand-alone pre-processor but instead is coupled with a FLAC encoder and, hence, directly creates FLAC compatible compressed files. Second, FSLAC only alters the high-bit-rate audio segments, not (almost) all parts of the audio input as LossyWAV does. The coded audio, therefore, remains perceptually lossless. In addition, it is worth noting that, due to its simplicity, FSLAC encoding is very fast. All of these features make FSLAC attractive for audio production and archival applications.

www.ecodis.de/audio.htm#fslac
hydrogenaud.io/index.php?topic=122390

fdkaac ~ Command Line Frontend For libfdk-aac Encoder


fdkaac reads linear PCM audio in WAV, raw PCM, or CAF format,
and encodes it into an M4A or AAC file.

If the input file is "-", data is read from stdin. Likewise, if the
output file is "-", data is written to stdout, provided one of the
streamable AAC transport formats is selected with -f.

When CAF input and M4A output are used, tags in the CAF file are copied
into the resulting M4A.
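A small helper that assembles an fdkaac command line shows how the "-" convention and the transport-format requirement fit together. The helper itself is hypothetical. The options used (-b for bitrate, -f 2 for ADTS, -o for the output file) should be checked against the help output of the installed fdkaac version.

```python
def fdkaac_command(input_path, output_path, bitrate_kbps=128, adts=False):
    """Build an fdkaac argument list; "-" means stdin or stdout."""
    if output_path == "-" and not adts:
        # Writing to stdout only makes sense for a streamable transport format.
        raise ValueError("stdout output needs a streamable format (-f)")
    cmd = ["fdkaac", "-b", str(bitrate_kbps)]
    if adts:
        cmd += ["-f", "2"]               # transport format 2 = ADTS
    cmd += ["-o", output_path, input_path]
    return cmd

to_file = fdkaac_command("input.wav", "output.m4a")
piped = fdkaac_command("-", "-", adts=True)
```

The resulting list can be handed to subprocess.run, with stdin and stdout connected to pipes when "-" is used on either side.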

github.com/nu774/fdkaac
github.com/mstorsjo/fdk-aac
launchpad.net/ubuntu/bionic/+package/fdkaac
packages.debian.org/stretch/fdkaac

TAK ~ Tom’s Lossless Audio Kompressor


TAK stands for (T)om’s lossless (A)udio (K)ompressor. Besides, it’s a throwback to a (not very philanthropic) character from Stephen King’s “Regulators”. Early semi-public evaluation versions operated under the working title YALAC.

Characteristics:

  • High compression. The strongest mode is on a par with Monkey’s Audio High and OptimFrog’s Normal; for specific files such as classical music or voice recordings, it often outperforms both. This classification is based on the evaluation of hundreds of files of various styles; it definitely does not apply to every single file.
  • High compression speed. I am currently not aware of any other compressor that works faster than TAK’s Turbo or Fast mode and achieves similar compression rates.
  • Multi-core compressor. The compressor optionally generates up to four threads in order to take advantage of multi-core CPUs.
  • Very high decompression speed. It is at the level of FLAC and therefore significantly higher than most symmetrical compressors.
  • Support for every popular audio format (not yet fully implemented).
  • Streaming support. An info frame, which contains all the information required for decoding, is inserted into the compressed audio data every 2 seconds.
  • Fault tolerance. A single bit error never damages more than 250 ms of audio data, since the data is stored in completely independent frames of at most this duration. The decoder processes even extremely damaged files, optionally removing the affected data or replacing it with silence.
  • Error detection. Every single frame is protected by a 24-bit checksum (CRC).
  • MD5 checksums for quick identification of audio material (e.g. for searching for duplicates).
  • Fast, sample-accurate access to any playback position. The file header contains a look-up table with index positions every second. Even without this table, efficient random access is possible; for this purpose, the synchronization codes of the frame headers and/or the offset values optionally recorded in the frame header, which refer to the beginning of the previous and next frame, can be used.
  • Metadata. A flexible and expandable structure allows the recording of non-audio data such as images or cuesheets.
  • Playback plugins for Winamp and Foobar are currently available.
  • An SDK provides other developers with decoding functions for integration into their applications. An extension to include coding functions is planned.
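The combination of independent frames, a per-frame 24-bit CRC, and silence replacement can be sketched in a few lines. TAK's actual frame layout and CRC parameters are not given here, so the OpenPGP CRC-24 polynomial and initial value below are stand-ins, and the function names are invented for the example.

```python
def crc24(data, poly=0x864CFB, init=0xB704CE):
    """Bitwise CRC-24 (OpenPGP parameters, chosen only for illustration)."""
    crc = init
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1000000 | poly
    return crc & 0xFFFFFF

def pack_frames(pcm, frame_bytes):
    """Split PCM bytes into independent frames, each paired with its CRC."""
    return [(pcm[i:i + frame_bytes], crc24(pcm[i:i + frame_bytes]))
            for i in range(0, len(pcm), frame_bytes)]

def decode_frames(frames):
    """Return PCM bytes, replacing frames that fail the CRC with silence."""
    out, bad = bytearray(), []
    for index, (payload, checksum) in enumerate(frames):
        if crc24(payload) == checksum:
            out += payload
        else:
            out += bytes(len(payload))   # zero bytes = silence for signed PCM
            bad.append(index)
    return bytes(out), bad

pcm = bytes(range(256)) * 4
frames = pack_frames(pcm, 256)
payload, checksum = frames[2]
frames[2] = (bytes([payload[0] ^ 0x01]) + payload[1:], checksum)  # flip one bit
decoded, bad = decode_frames(frames)
```

Only frame 2 is reported as damaged and silenced; the other frames decode unchanged because no frame depends on its neighbours.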

www.thbeck.de/Tak/Tak
wiki.hydrogenaud.io/index.php?title=TAK
hydrogenaud.io/index.php?topic=120760
www.foobar2000.org/components/view/foo_input_tak
foobar.hyv.fi/?view=foo_input_tak

dsfTAKSource ~ TAK DirectShow Source Filter
  • Plays back TAK audio files in any DirectShow player (Windows Media Player, MediaPlayerClassic, …)
  • Now supports TAK 2.2.0 (multi-channel audio, …)
  • Supports Unicode filenames and sample rates > 44.1 kHz
  • Upgrade: now works correctly on Windows 7 64-bit (and other 64-bit Windows versions)

liviocavallo.altervista.org

Monkey’s Audio ~ Free Lossless CODEC


Monkey’s Audio is a fast and easy way to compress digital music. Unlike traditional methods such as mp3, ogg, or wma that permanently discard quality to save space, Monkey’s Audio only makes perfect, bit-for-bit copies of your music. That means it always sounds perfect – exactly the same as the original. Even though the sound is perfect, it still saves a lot of space (think of it as a beefed-up Winzip™ for your music). The other great thing is that you can always decompress your Monkey’s Audio files back to the exact, original files. That way, you’ll never have to recopy your CD collection to switch formats, and you’ll always be able to perfectly recreate the original music CD.

monkeysaudio.com

References:

en.wikipedia.org/wiki/Monkey%27s_Audio

en.wikipedia.org/wiki/Category:Lossless_audio_codecs

en.wikipedia.org/wiki/Comparison_of_audio_coding_formats

wiki.hydrogenaud.io/index.php?title=Lossless_comparison#Monkey.27s_Audio_.28APE.29

exhale ~ MPEG-4 Audio Encoder


exhale, which is an acronym for “Ecodis eXtended High-efficiency And Low-complexity Encoder”, is a lightweight library and application to encode uncompressed WAVE-format audio files into MPEG-4 format files complying with the ISO/IEC 23003-3 (MPEG-D) Unified Speech and Audio Coding (USAC, also known as Extended High-Efficiency AAC) standard. In addition, exhale writes program peak-level and loudness data into the generated MPEG-4 files according to the ISO/IEC 23003-4, Dynamic Range Control (DRC) specification for use by decoders providing DRC. exhale currently makes use of all frequency-domain (FD) coding tools in the scale factor based MDCT processing path, except for predictive joint stereo, which is still being integrated. Its objective is high quality mono, stereo, and multichannel coding at medium and high bit rates, so the lower-rate USAC coding tools (ACELP, TCX, Enhanced SBR and MPEG Surround with Unified Stereo coding) won’t be integrated.

gitlab.com/ecodis/exhale
hydrogenaud.io/index.php?topic=118888

opencore-amr ~ Android Audio CODECS


Library of OpenCORE Framework implementation of Adaptive Multi Rate Narrowband and Wideband (AMR-NB and AMR-WB) speech codec. Library of VisualOn implementation of Adaptive Multi Rate Wideband (AMR-WB) encoder and Advanced Audio Coding (AAC) encoder. Modified library of Fraunhofer AAC decoder and encoder.

sourceforge.net/projects/opencore-amr

FAAD2 ~ Freeware AAC Decoder


FAAD2 is an open-source MPEG-4 and MPEG-2 AAC decoder, licensed under the GPLv2.

www.audiocoding.com/faad2.html
faac.sourceforge.net
rarewares.org/aac-decoders
en.wikipedia.org/wiki/FAAC#FAAD2_decoder

Free CODECS ~ Online Repository


http://www.free-codecs.com
http://www.free-codecs.com/audio_codecs

lossyWAV ~ Lossy PCM In WAV File Format


lossyWAV is a free lossy pre-processor for PCM audio contained in the WAV file format. Proposed by David Robinson, it reduces the bit depth of the input signal, which, when used in conjunction with certain lossless codecs, reduces the bitrate of the encoded file significantly compared to compression without pre-processing. lossyWAV’s primary goal is to maintain transparency with a high degree of confidence when processing any audio data. ~ wiki.hydrogenaud.io/index.php?title=LossyWAV

lossyWAV is a near-lossless audio processor which dynamically reduces the bit depth of the signal on a block-by-block basis. Bit-depth reduction adds noise to the processed output. The added noise is adaptively shaped by default; depending on command-line parameters, it can instead be shaped with a fixed filter or left as white noise. When lossyWAV-processed output is compressed with certain lossless codecs (FLAC, WavPack, TAK, LPAC, MPEG-4 ALS and WMA Lossless), the bitrate of the output file is significantly[1] reduced compared to the lossless original.

[1]: on average, depending on content.
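Block-by-block bit-depth reduction can be sketched as below. lossyWAV decides how many bits to remove from a spectral analysis of each block; this example swaps in a much cruder rule (keep rounding noise below a fixed fraction of the block's power) and applies no noise shaping, so the rule, its parameters, and the function names are all assumptions. 16-bit input is assumed.

```python
import math

def bits_to_remove(block, snr_ratio=4096.0, max_bits=8):
    """Largest bit count whose rounding noise stays below power / snr_ratio."""
    power = sum(s * s for s in block) / max(len(block), 1)
    bits = 0
    while bits < max_bits:
        step = 1 << (bits + 1)
        # A uniform quantiser with this step adds noise power of about step^2 / 12.
        if step * step / 12.0 > power / snr_ratio:
            break
        bits += 1
    return bits

def reduce_block(block, bits, max_value=32767):
    """Round samples to multiples of 2**bits, staying inside the 16-bit range."""
    if bits == 0:
        return list(block)
    half = 1 << (bits - 1)
    limit = (max_value >> bits) << bits
    return [min(((s + half) >> bits) << bits, limit) for s in block]

def process(samples, block_size=512):
    out, removed = [], []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        bits = bits_to_remove(block)
        out.extend(reduce_block(block, bits))
        removed.append(bits)
    return out, removed

quiet = [round(50 * math.sin(2 * math.pi * n / 64)) for n in range(512)]
loud = [round(20000 * math.sin(2 * math.pi * n / 64)) for n in range(512)]
processed, removed = process(quiet + loud)
```

The quiet block keeps its full resolution, while the loud block has its low bits zeroed. Codecs that detect such unused low bits per block can then spend fewer bits on it.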

hydrogenaud.io/index.php/topic,112649
wiki.hydrogenaud.io/index.php?title=LossyWAV#lossyWAV_and_foobar2000
github.com/corrideat/lossywav
github.com/MoSal/lossywav-for-posix