Keep voice, remove background noise and music

Adobe Audition

Posted by Ryan Moyer on June 28, 2009 at 10:51 pm

I have a rather long video clip. At several points throughout the clip, there is some background noise and also some background music that occurs at the same time as someone speaking.

I would like to eliminate the background noise and music so I am left with ONLY the person’s voice. When I run it through Adobe Audition 3 I can use favorites->vocal remover which perfectly removes the vocals and leaves the background noise/music. This is the opposite effect of what I’m looking for, but I figure if it can remove the vocals it must also be able to somehow give me just the vocals.

Is this possible?


Emmett Andrews
June 28, 2009 at 11:32 pm

Use the Center Channel Extractor. Using the standard channel mixer, it can’t be done.
Ryan Moyer
June 28, 2009 at 11:58 pm

The center channel extractor gives me the same result. It deletes JUST the voice and leaves everything else in. I want to leave the voice in and delete everything else.

I tried extracting all channels except the center individually (left, right, and surrounds) and it doesn’t change the output at all. It sounds exactly the same as it did before editing.
Steven Talley
June 30, 2009 at 3:05 am

In the CCE window select “Acapella” for the Effect Preset. Or you can try “Lift Vocals 10db”
Ryan Moyer
July 1, 2009 at 3:23 pm

Unfortunately neither of those worked.

It seems pretty crazy to me that there are 5 different ways I can remove the vocals without the background noise being affected at all, but no way to just leave the vocals. Obviously the program can recognize the vocals, since it has no problem deleting them, so it should be able to separate them too. Is there no way for it to just extract them so I can paste them back in?
Ryan Moyer
July 1, 2009 at 5:38 pm

Well, in case anyone comes across this in a Google search a year from now, I guess I’ll post up the half-solution I came up with.

In a few of the clips I was able to find sections where there was the background noise with no voice over it. I highlighted those sections, right-clicked and chose to use them as the pattern for noise reduction, and then chose (non-adaptive) noise reduction on the whole file. This worked perfectly on some of the clips, but on others where the background noise was less constant it would not work at all, obviously.

So at least I got some parts of it cleaned up.
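
A minimal sketch of the idea behind profile-based noise reduction, for readers who want to see why it only works on constant noise. This is plain spectral subtraction in NumPy, not Audition's actual algorithm; the function name `spectral_subtract`, its parameters, and the test tones are all assumptions made for illustration.

```python
import numpy as np

def spectral_subtract(signal, noise_sample, frame=512, hop=256, strength=1.0):
    """Subtract an average noise magnitude spectrum from every frame of signal."""
    # Periodic Hann window: at a hop of frame/2 the overlapping windows sum to 1.
    window = np.hanning(frame + 1)[:-1]

    def frames(x):
        count = 1 + (len(x) - frame) // hop
        return np.stack([x[i * hop:i * hop + frame] * window for i in range(count)])

    # Noise profile: mean magnitude per frequency bin over the voice-free section.
    noise_mag = np.abs(np.fft.rfft(frames(noise_sample), axis=1)).mean(axis=0)

    spec = np.fft.rfft(frames(signal), axis=1)
    # Subtract the profile from each frame's magnitude, never going below zero.
    mag = np.maximum(np.abs(spec) - strength * noise_mag, 0.0)
    cleaned = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=frame, axis=1)

    # Overlap-add the frames; the first and last half-frame stay tapered.
    out = np.zeros(len(signal))
    for i, chunk in enumerate(cleaned):
        out[i * hop:i * hop + frame] += chunk
    return out

# Hypothetical test material: a steady 1000 Hz hum under a 250 Hz "voice".
sr = 8000
t = np.arange(sr) / sr
voice = 0.5 * np.sin(2 * np.pi * 250 * t)
hum = 0.3 * np.sin(2 * np.pi * 1000 * t)

# The profile comes from a stretch that contains only the hum.
denoised = spectral_subtract(voice + hum, hum[:2048])
inner = slice(512, -512)  # ignore the tapered edges
residual_rms = np.sqrt(np.mean((denoised[inner] - voice[inner]) ** 2))
```

The profile is a single fixed spectrum, so anything that changes over time (music, traffic, crowd noise) no longer matches it, which fits the clips where this approach did not work.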
Emmett Andrews
July 4, 2009 at 2:17 pm

Audition doesn’t recognize the voice. It finds and removes correlation between the two stereo channels. To properly isolate the vocals, there must be identical correlation in both channels on the vocals and zero correlation between the noise in each channel. As you can imagine, that scenario isn’t very common.

Emmett
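
A rough sketch of the simplest form of this, the polarity (sum and difference) trick, may help show why removal is easier than isolation. It is not what Audition does internally, and the tones standing in for voice and music are invented for the example.

```python
import numpy as np

sr = 8000
t = np.arange(sr) / sr

# Hypothetical sources: the voice is identical in both channels (centered),
# while each channel carries different music.
voice = 0.5 * np.sin(2 * np.pi * 220 * t)
music_left = 0.3 * np.sin(2 * np.pi * 330 * t)
music_right = 0.3 * np.sin(2 * np.pi * 550 * t)

left = voice + music_left
right = voice + music_right

# "Vocal remover": the difference cancels whatever is identical in both channels.
side = (left - right) / 2

# The sum keeps the voice at full level, but the music is still there at half level.
mid = (left + right) / 2
leftover_music = mid - voice
```

The difference signal contains no voice at all, yet the sum is not "just the voice": half of each channel's music remains in it. Any music or noise that is itself centered is removed along with the voice in the first case and kept along with it in the second.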
Ryan Moyer
July 5, 2009 at 10:47 pm

But that’s the thing Emmett, it can remove the vocals while leaving the background music virtually unaffected, so obviously in THIS case whatever magical stars needed to align have aligned, and Audition can distinguish between the vocals and the background noise. However, I can’t figure out how to take that knowledge and transfer it to being left with only the vocals rather than only the background noise. Obviously Audition can recognize the vocals because it can remove them no problem, but how can I then get to those vocals?

It would be like saying if I had a sentence on my computer screen that read “I like ducks because they look funny” and I wanted to separate the word “ducks”, and the only thing I could do was end up with a sentence that said “I like because they look funny”, but couldn’t make a statement of just “ducks”. Obviously I can recognize the word ducks in that sentence, so I can either remove that word or remove all the OTHER words. In my editing here, Audition can recognize the vocals because it can remove them, so there must be a way to also remove everything EXCEPT them, right?
Emmett Andrews
July 6, 2009 at 12:54 am

As I said, no magic involved and it isn’t recognizing the voice. It is recognizing correlated audio. There is enough of the background noise that is both correlated and uncorrelated that your ears are tricking you into thinking that it has been untouched. The fact is, you’re removing a large amount of the background noise with any of the vocal removal tools. And the “a capella” preset is doing the exact opposite. It’s leaving only the correlated audio. Everything you hear when you use that preset is exactly what’s being removed when you use a vocal removal tool.

What you’re trying to do does not work with your analogy. A much more correct analogy is that of trying to “unbake” a cake. Take a baked cake and get the eggs out of it. I dare you to try. A good scientist could remove and isolate large portions of egg product, but they certainly wouldn’t resemble eggs anymore.

The Center Channel Extractor works better than most tools because it combines traditional polarity tricks with FFT processing. You can read about the basics of phase relationship here: https://en.wikipedia.org/wiki/Phase_cancellation

Emmett
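
As an illustration of what combining polarity tricks with FFT processing can buy, here is a hedged sketch of a per-frequency similarity mask. It is an assumption about the general technique, not Audition's implementation; `extract_center` is a hypothetical name, and the example uses an idealized mix where the music is panned hard left and hard right and shares no frequencies with the voice.

```python
import numpy as np

def extract_center(left, right, eps=1e-12):
    """Keep frequency bins where both channels agree; suppress bins where they differ."""
    L = np.fft.rfft(left)
    R = np.fft.rfft(right)
    # Close to 1 when a bin has equal magnitude in both channels,
    # close to 0 when it is present in only one channel.
    similarity = 2 * np.abs(L * np.conj(R)) / (np.abs(L) ** 2 + np.abs(R) ** 2 + eps)
    center = similarity * (L + R) / 2
    return np.fft.irfft(center, n=len(left))

sr = 8000
t = np.arange(sr) / sr
voice = 0.5 * np.sin(2 * np.pi * 440 * t)         # centered
music_left = 0.3 * np.sin(2 * np.pi * 1000 * t)   # hard left
music_right = 0.3 * np.sin(2 * np.pi * 1500 * t)  # hard right

left = voice + music_left
right = voice + music_right

isolated = extract_center(left, right)
plain_sum = (left + right) / 2  # polarity-only version, for comparison
```

In this idealized case the mask recovers the voice almost exactly, while the plain sum still carries the music. Real recordings rarely cooperate: music that is partly centered, or that overlaps the voice in frequency, passes through the mask together with the voice.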
George Oconnell
August 24, 2009 at 3:47 am

Emmett,

I am unfamiliar with the software; I found your question while searching for related information on Google.

You have at some point a track with voice, music, noise and a track with just noise and music. Add them as Left track and Right track in a sound editing program but add one of them as Invert and Mix. This should leave you with the noise and music reduced or eliminated. Hopefully this might help.

George
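
The invert-and-mix suggestion can be sketched as follows, assuming a separate copy of the backing track exists that is identical to, and sample-aligned with, the background in the full mix. The helper name `invert_and_mix` and the test tones are made up for the example.

```python
import numpy as np

def invert_and_mix(full_mix, backing):
    """Flip the polarity of the backing track and add it to the full mix."""
    return full_mix + (-backing)

sr = 8000
t = np.arange(sr) / sr
voice = 0.5 * np.sin(2 * np.pi * 220 * t)
backing = 0.3 * np.sin(2 * np.pi * 440 * t)
full_mix = voice + backing

# With a perfectly aligned, identical backing track, only the voice remains.
recovered = invert_and_mix(full_mix, backing)

# Shift the backing track by a single sample and the cancellation is incomplete.
misaligned = invert_and_mix(full_mix, np.roll(backing, 1))
leak = np.max(np.abs(misaligned - voice))
```

Even a one-sample offset leaves an audible remainder of the backing track, so this only helps when the voice-free track really is the same recording, at the same level and position, as the background under the voice.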
Stanley Dilley
November 29, 2009 at 5:43 pm

I am 67 and find music and background noise on TV movies/shows drowning out the voices. Therefore, I am also interested in removing or decreasing background sounds from voice. I have the following thought; please comment.

If a karaoke circuit can remove voice, inverting and adding the karaoke output to the original should leave the voice and delete or reduce the background sounds. It would need to be a stereo source with voice reasonably centered, which I believe most TV sources are. I don’t need complete removal of the background; a reduction would be wonderful.
