I generally normalize everything to -6 dB. That leaves headroom for a loud noise or a spike in the voiceover while still giving good overall volume.
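If it helps to see the arithmetic, here is a rough sketch of what normalizing to -6 dB means. It assumes peak normalization on floating-point samples where 1.0 is full scale (0 dB); the function names and the sample values are made up for illustration, and your editor may normalize differently under the hood.

```python
def db_to_linear(db):
    """Convert a level in dB (relative to full scale) to a linear amplitude ratio."""
    return 10 ** (db / 20)


def peak_normalize(samples, target_db=-6.0):
    """Scale the samples so the loudest one sits at target_db (assumed peak-based)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = db_to_linear(target_db) / peak
    return [s * gain for s in samples]


# Hypothetical quiet clip; values are illustrative only.
clip = [0.05, -0.2, 0.1, -0.08]
normalized = peak_normalize(clip)
```

A peak at -6 dB is roughly half of full-scale amplitude, which is where the headroom for an unexpected spike comes from.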
So I’ll make sure all my audio clips are dancing around the -6 dB mark, one at a time. Then I’ll set keyframes and lower my music to around -15 dB whenever someone is talking, although you may need more or less reduction depending on how compressed your music is.
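The keyframing can be pictured as a small table of (time, level) points with the level interpolated between them. This is only a sketch: it reads "-15" as the level the music is lowered to, assumes the music otherwise sits at -6 dB, and assumes straight-line interpolation in dB. The timeline values and the `level_at` helper are hypothetical.

```python
def level_at(t, keyframes):
    """Return the music level in dB at time t, interpolating linearly between keyframes."""
    keyframes = sorted(keyframes)
    if t <= keyframes[0][0]:
        return keyframes[0][1]  # before the first keyframe: hold its level
    for (t0, db0), (t1, db1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return db1
            return db0 + (db1 - db0) * (t - t0) / (t1 - t0)
    return keyframes[-1][1]  # after the last keyframe: hold its level


# Hypothetical timeline: someone talks from 5.0 s to 12.0 s, with half-second fades.
music_keyframes = [(4.5, -6.0), (5.0, -15.0), (12.0, -15.0), (12.5, -6.0)]

for t in (0.0, 4.75, 8.0, 20.0):
    print(t, level_at(t, music_keyframes))
```

Heavily compressed music tends to need a lower floor than -15 dB, so treat that number as a starting point to adjust by ear.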
Once I’ve done all that, I’ll usually raise the master volume very slightly and get as close to 0 dB as I can without any distortion occurring.
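To estimate how far the master can come up, you can measure the loudest peak in the mix and see how many dB separate it from the ceiling. A minimal sketch, again assuming floating-point samples with 1.0 as full scale; `headroom_db` and the sample values are hypothetical, and it only looks at sample peaks, so your ears and meters still get the final say on distortion.

```python
import math


def headroom_db(samples, ceiling_db=0.0):
    """Return how many dB the mix can be raised before its peak reaches ceiling_db."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return math.inf  # silence never reaches the ceiling
    peak_db = 20 * math.log10(peak)
    return ceiling_db - peak_db


# Hypothetical mix whose loudest sample is at half of full scale.
mix = [0.3, -0.5, 0.45]
print(round(headroom_db(mix), 2))  # 6.02
```

Passing a ceiling slightly below zero, such as `ceiling_db=-0.5`, leaves a small safety margin instead of riding right at 0 dB.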
The key is having the entire piece at an even relative volume. If your video is being broadcast on television, the master control room is usually equipped with a volume compressor/normalizer, so it ends up at roughly the same level as the rest of the broadcast.