I broke down and installed SoX so I could apply some primitive form of companding (a compressor-expander pair), just so that my databent audio won't sound like the quantized screams of the damned (the damned being the bits shaved off during bit depth reduction). For reference, this is for that project of mine that interprets audio as video and back.
After bashing my head against the frankly obtuse manual, I came up with this primitive construction. It's definitely caveman-era (a transfer function that's literally a single line segment? Hah!), but it's gotten the dynamic range of the output file up to an acceptable level. It seems to work by sheer coincidence.
sox foo.flac -b 8 foo-compressed.wav compand 0.3,0.8 6:-60,-10 -5 gain -n -1
sox foo-compressed.wav -b 24 foo-expanded.flac compand 0.3,0.8 6:-10,-60 -5 gain -n -1
I'll probably experiment with the attack, release, and output gain parameters to find the best settings for my stuff, but I think I've gotten the basic idea down.
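Here's a rough sketch of what that experimenting could look like as a script. The helper names (`run`, `sweep_compand`), the output file naming, and the grid of values are all made up for illustration; only the SoX invocations themselves mirror the two commands above. It prints the commands by default and only runs them when `RUN=1` is set.

```shell
#!/bin/sh
# Hypothetical parameter sweep over attack, decay (release), and gain.
# ASSUMPTION: the value grids below are arbitrary starting points, not
# recommendations. The transfer function is the same one used above.

# Print each command by default; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

sweep_compand() {
    src=$1
    for attack in 0.1 0.3 1.0; do
        for decay in 0.4 0.8 1.6; do
            for gain in -8 -5 -2; do
                tag="a${attack}-d${decay}-g${gain}"
                # Compress down to 8-bit...
                run sox "$src" -b 8 "sweep-$tag.wav" \
                    compand "$attack,$decay" 6:-60,-10 "$gain" gain -n -1
                # ...then expand back to 24-bit with the mirrored curve.
                run sox "sweep-$tag.wav" -b 24 "sweep-$tag.flac" \
                    compand "$attack,$decay" 6:-10,-60 "$gain" gain -n -1
            done
        done
    done
}
```

Calling `sweep_compand foo.flac` lists 54 commands (27 compress/expand pairs); `RUN=1 sweep_compand foo.flac` would actually produce the files for listening tests.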
I love having my cake and eating it too: retaining an 8-bit PCM input (since video codecs work in 8-bit color-component chunks by default), yet still having enough dynamic range to work with at the end. I'm ready to rewrite my script to include SoX in the chain. The processing chain in my head looks like:
Original FLAC -> Use SoX to compress, reduce bit depth to 8-bit -> Use FFmpeg to interpret as raw video, encode with codec -> Decode encoded video back to raw data -> Use SoX to expand, increase bit depth to 24-bit for output file
Which I think is the pinnacle of working around your problems. It's so much of a hack. It's beautiful!
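For the curious, the chain above might be sketched roughly like this. This is not my actual script: the function names, the intermediate file names (`stage1.raw` and friends), the mono downmix, the sample rate, the frame size, the grayscale pixel format, and the codec are all placeholder assumptions. It also prints the commands by default rather than running them (set `RUN=1` to execute).

```shell
#!/bin/sh
# Sketch of the audio -> video -> audio chain. Everything marked
# ASSUMPTION is an illustrative placeholder, not a tested setting.

RATE=44100      # ASSUMPTION: sample rate used for the raw 8-bit stream
WIDTH=320       # ASSUMPTION: arbitrary frame width
HEIGHT=240      # ASSUMPTION: arbitrary frame height
CODEC=mpeg4     # ASSUMPTION: stand-in for whichever codec gets abused

# Print each command by default; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

audio_to_video_and_back() {
    src=$1
    dst=$2
    # 1. Compress, then write headerless unsigned 8-bit PCM.
    #    ASSUMPTION: downmixed to mono (-c 1) to keep the sketch simple.
    run sox "$src" -t raw -e unsigned-integer -b 8 -c 1 -r "$RATE" stage1.raw \
        compand 0.3,0.8 6:-60,-10 -5 gain -n -1
    # 2. Read those bytes as grayscale frames and encode them.
    run ffmpeg -f rawvideo -pix_fmt gray -s "${WIDTH}x${HEIGHT}" -i stage1.raw \
        -c:v "$CODEC" stage2.mkv
    # 3. Decode the video back to raw bytes.
    run ffmpeg -i stage2.mkv -f rawvideo -pix_fmt gray stage3.raw
    # 4. Read the bytes as 8-bit PCM again, expand, and save as 24-bit.
    run sox -t raw -e unsigned-integer -b 8 -c 1 -r "$RATE" stage3.raw -b 24 "$dst" \
        compand 0.3,0.8 6:-10,-60 -5 gain -n -1
}
```

`audio_to_video_and_back foo.flac foo-bent.flac` prints the four commands in order. One caveat I'd expect with a setup like this: the raw stream has to fill whole frames, so any leftover bytes at the end that don't make up a full frame will likely get dropped.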
I can already imagine the infomercial: "Tired of having too little dynamic range in your audio, but still need to work with low bit depths? Use companding! With companding, your audio goes from this: [digital screams of the damned], to this: [perfectly-normal classical music]! Act now! (...)"