PLing and dynamics

LibriVox Admin Team
Posts: 3990
Joined: April 24th, 2019, 12:06 pm

Post by Kazbek » September 20th, 2021, 6:34 am

I think it's worth distinguishing between two different issues that are being discussed here. One is artistic expression: e.g., whether a passage is read emphatically or gently. These choices are left up to our readers and fall under the category of reading style, which PLs shouldn't comment on. The other issue is physical volume: whether the passage is loud or quiet in the recording. If there's a volume jump such that it hurts the listener's ears, or a volume drop such that a passage can't be understood without turning up the volume, that should be flagged in PL notes and fixed. It might seem that the two issues are connected, but it's not necessarily so. Whenever I PL a recording that can't be comfortably listened to at a constant volume, I ask the reader to apply dynamic range compression. In Audacity this can be done as follows:

Select the whole track (Ctrl+A), go to Effect > Compressor, and use the following settings:
- lower the Threshold sliding bar to -20 dB (or -30 dB)
- check the box "Compress based on peaks"
- uncheck the box "Make up for gain..."
- keep defaults for the other settings.

This eliminates large jumps in volume, yielding more even dynamics. Audacity will remember these compressor settings for future tracks. After that, with the entire track still selected, use the Amplify effect with -4 in the Amplification box to reduce the volume by 4 dB (or whatever is needed to bring the average volume within our target range).
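For anyone curious what these two steps do to the numbers, here is a minimal sketch in Python. It is not Audacity's actual algorithm: it only models the static compression curve (no attack, release, or noise floor), and the 2:1 ratio and the example input levels are assumptions chosen for illustration. The function names are hypothetical.

```python
def compressed_level_db(level_db, threshold_db=-20.0, ratio=2.0):
    """Static compression curve (assumed 2:1 ratio, illustration only).

    Levels at or below the threshold pass through unchanged; the part of
    the level that exceeds the threshold is divided by the ratio.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def db_to_gain(change_db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (change_db / 20.0)


# Hypothetical peak levels of a quiet passage and a loud passage, in dB.
quiet_in, loud_in = -30.0, -2.0

quiet_out = compressed_level_db(quiet_in)
loud_out = compressed_level_db(loud_in)

range_before = loud_in - quiet_in    # gap between loud and quiet before
range_after = loud_out - quiet_out   # gap after compression

# Amplify by -4 dB: every sample is multiplied by this factor.
amplify_factor = db_to_gain(-4.0)

print(f"quiet: {quiet_in} dB -> {quiet_out} dB")
print(f"loud:  {loud_in} dB -> {loud_out} dB")
print(f"range: {range_before} dB -> {range_after} dB")
print(f"-4 dB amplification multiplies samples by {amplify_factor:.3f}")
```

With these assumed values the quiet passage is left alone and only the loud passage is pulled down, so the gap between them narrows. The -4 dB Amplify step then scales everything by the same factor, which shifts the overall level without changing that gap.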

I use this effect for all my recordings. Other folks here have different Compressor settings they like to use. I believe all professional audiobooks are passed through dynamic range compression these days. I just finished listening to an old commercial audiobook, which sounded like the dynamics weren't compressed, and it was a challenge to listen to while walking out on the street. There was a lot of voice acting, which created a wide dynamic range, and I had to pause the recording whenever there was even minor ambient noise.

Dynamic range compression doesn't impinge on your artistic expression. The passages read emphatically still sound emphatic, and so forth. It simply makes the recording easier to listen to.


Posts: 327
Joined: June 22nd, 2020, 11:30 am
Location: S-H, Germany

Post by habasud » September 23rd, 2021, 10:30 am

First of all "artistic expression" sounds a bit sophisticated. But I think it's clear what is meant :-)

Indeed, I do a "mastering" pass on every recording I make here. That includes filtering, de-essing (if needed) and usually a soft RMS compression. I do the latter because I'm aware that podcasts shouldn't have too much dynamics, or "loudness range" in the terms of EBU R 128 / ITU-R BS.1770.
But normally I'm not reading such dramatic texts. So the RMS compression is nice, because it doesn't squash the peaks too much, as one can see now :mrgreen:
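To illustrate the difference between peak and RMS detection, here is a rough Python sketch. It is only an assumption about how the two measurements differ in principle, not how any particular plugin works; the function names and the test signal are made up, and levels are taken relative to a full-scale amplitude of 1.0.

```python
import math


def peak_dbfs(samples):
    """Level of the single largest sample, in dB relative to 1.0."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    """Level of the average signal power, in dB relative to 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


# Hypothetical signal: a steady, moderate level with one short spike.
signal = [0.1] * 999 + [1.0]

peak_level = peak_dbfs(signal)
rms_level = rms_dbfs(signal)

print(f"peak level: {peak_level:.1f} dBFS")
print(f"RMS level:  {rms_level:.1f} dBFS")
```

A peak-based detector reacts to the single spike, while an RMS-based one responds to the average energy and barely notices it. That is why a soft RMS compressor tends to leave short peaks alone.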

As for your proposal (peak compression with a low threshold and probably a low ratio), I would say that it is much better than just reducing the gain on the loud parts. Nevertheless, peak compression does something similar, I think. Also, at a low threshold things like breathing noise get compressed too. That might not be desired.
At home I can use a loudness leveler plugin. In this particular case it gave me good results: it evened out the differing loudness at the start and end of the reading, and its short-term limiter, with look-ahead, reduced the perceived loudness of the part where the "ghost" talked, without changing the dynamics of the rest of the recording too much and without affecting breath noise.

Some thoughts about dynamics in general:
I started as a tape editor and sound tech at a public broadcaster in the mid-1980s. At that time no compressors were used at all. All we had was a limiter, but using it was considered bad craftsmanship.

In the 1990s commercial radio was allowed, and it brought with it the sound of American FM stations. They were using voice processors and multiband compressors like the Orban Optimod. Soon our techs were advised to level everything to the max. One result was that the spoken word couldn't be understood at all, because it sounded much softer than music at the same level.

Jingles and layout were produced using tools named "L1 Ultramaximizer" or "MaxxBass". We had our fun making them as loud as possible, which caused annoying loudness jumps.

Then the station started using a stereo sum processor. But it was soon at its limits.
At the end of the 1990s we got new digital mixing desks that allowed different buses for voice and music. So voice got pre-compressed. After some time they decided to switch over from stereo to multiband compression, just as our competitors had done years before.

This all meant a dramatic loss of dynamics over time. That's what is known as the "loudness war". Everybody tried to be the loudest in the place, and whole generations grew up without ever having heard a recording of an uncompressed voice.

As a result people invented loudness tools like ReplayGain and others with the intentions I mentioned before. So for audiophile people, being able to use dynamics in their recordings should, in my opinion, generally be considered something of a gift :D

Life is What Happens To You While You're Busy Making Other Plans (John Lennon)

LibriVox Admin Team
Posts: 3990
Joined: April 24th, 2019, 12:06 pm

Post by Kazbek » September 24th, 2021, 5:26 pm

Thank you, that was an interesting read, even if I don't know what half of those terms mean. :) You obviously know much more than me on the subject and are better qualified to reduce the dynamic range of your recording, if the PL finds that it is too wide for comfort. This latter judgement is to some extent subjective, and it's quite possible that you as the reader may disagree with it. In that case the BC would make the call, perhaps in consultation with the MC.


Posts: 1093
Joined: November 10th, 2016, 3:54 am
Location: LONDON UK

Post by lurcherlover » October 8th, 2021, 10:49 am

Music recordings have a much wider dynamic range than voice recordings (narration etc.). This is because a solo flute at ff is much quieter than the full orchestra at fff! I'm sure you know that a very dramatic, full-blooded actor will deliver his/her lines with extreme loudness as well as at a whisper. So the audio editor has to even it out by raising the whisper and lowering the loudest parts.

Of course the "loudness wars" were commercially driven in the pop and heavy metal world, and classical music did not suffer from this as far as I know. Music heavily compressed and pushed up to almost zero on the meter sounded awful and was the most dreadful rubbish one could be afflicted with. Thank heavens now we have some dynamics coming back.

Microphone technique is also important for singers and the spoken word: backing off during emotional outbursts and moving in closer to the mic for a whisper makes a lot of sense.
