In this quarantine week, let’s answer a (not that) simple question: how many bits do you need to encode sound and images with a satisfying dynamic range?
Let’s see what hypotheses are useful, and how we can use them to get a good idea of the number of bits needed.
I have shown in earlier posts how dB and bits are related. Basically, adding one bit to a code adds about 6 dB to the resulting signal. Now, by definition, the threshold of hearing is set at 0 dB. This corresponds to the weakest sound you can distinguish from true silence. The threshold of pain (the point where you kind of expect your ears to start bleeding) is somewhere above 120 dB. Much louder sounds (explosions, rocket launches, etc.) lead to actual hearing damage. If we assume that we stay in the 0 to 120 dB range, the useful range for safe sound reproduction, then at about 6 dB per bit we need
$\frac{120\text{ dB}}{6\text{ dB/bit}} = 20\text{ bits}.$
So about 20 bits would be enough. If you consider 0 dB as the threshold of hearing, you might want to use 1 or 2 more bits to account for people with much finer hearing (as the loudness contour chart would suggest). Round up to the next byte and you get 24 bits, which is what the pros suggest you use.
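The arithmetic above is easy to check with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it uses the exact value 20·log10(2) ≈ 6.02 dB per bit rather than the rounded 6 dB.

```python
import math

# Each extra bit doubles the number of amplitude levels,
# which is 20*log10(2) dB, or about 6.02 dB.
DB_PER_BIT = 20 * math.log10(2)

def bits_for_db_range(db_range, extra_bits=0):
    """Smallest whole number of bits covering db_range, plus optional headroom bits."""
    return math.ceil(db_range / DB_PER_BIT) + extra_bits

def round_up_to_bytes(bits):
    """Round a bit count up to a whole number of bytes, expressed in bits."""
    return 8 * math.ceil(bits / 8)

base = bits_for_db_range(120)                   # 0 to 120 dB
padded = bits_for_db_range(120, extra_bits=2)   # headroom for finer hearing
print(base, padded, round_up_to_bytes(padded))  # prints: 20 22 24
```

With or without the one or two extra bits, rounding up to a whole byte lands on 24 bits.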
For images, it took me a bit more research to find good references. Some report that the total visual dynamic range is about 10 orders of magnitude (in appropriate luminosity units); others, like Fein and Szuts, report 16. Depending on the range, that’d yield
$\log_2 10^{10} \approx 33.2$ bits or $\log_2 10^{16} \approx 53.2$ bits.
However, while the human eye can see luminosity over that range, it can’t do so simultaneously. A figure in Gonzalez & Woods shows that around a base value (the average scene luminosity), only a certain range can be perceived. That range seems to be only 4 or 5 orders of magnitude, so only
$\log_2 10^{4} \approx 13.3$ to $\log_2 10^{5} \approx 16.6$ bits.
So if we consider the simultaneously perceivable range around some standard average-but-bright-enough luminosity, we might get away with 16 bits per color component (maybe less?).
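To go from orders of magnitude to bits, one simple approach is to take the base-2 logarithm of the luminosity ratio. The sketch below assumes a plain linear encoding of luminosity (no gamma or logarithmic coding, which would change the count), and the function name is hypothetical.

```python
import math

def bits_for_orders_of_magnitude(orders):
    """Bits needed to span a linear intensity ratio of 10**orders."""
    return orders * math.log2(10)

# 4 and 5: simultaneously perceivable range; 10 and 16: total range.
for orders in (4, 5, 10, 16):
    bits = bits_for_orders_of_magnitude(orders)
    print(f"{orders:2d} orders of magnitude -> {bits:.1f} bits")
```

Each order of magnitude costs about 3.32 bits, so 4 to 5 orders fit in roughly 13 to 17 bits, which is where the 16 bits per component comes from.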
The numbers we get are pretty much in line with what we find in audio and video. 24 bits is considered “professional” for audio (but not necessarily useful, depending on the quantity of noise in the original source). HDMI supports up to 48 bits per pixel (16 bits per component), while digital cameras often sport 10, 12 or 14 bits per component.
Alan Fein, Ete Zoltan Szuts
— Photoreceptors: Their Role in Vision —
Cambridge University Press (1982)
 Rafael C. Gonzalez, Richard E. Woods
— Digital Image Processing — 2nd ed, Prentice