
The worst case for extracting a vocal is multiple voices mixed in the center. Maybe someone can suggest how to do it in a multiplatform way. I cannot point you to instructions for doing it, but I know for sure that this spectral subtraction is part of Vocal Remover. I have my own Windows program to do it, but it is a piece of (work) and I cannot publish it; it only reads some specific WAV formats. I do it by converting the signal to a mono/side representation (L+R / L-R) and spectrally subtracting the side part from the mono part (M-S), using only the magnitude spectra and setting the phase to zero. And Praat is very good at it.

Fortunately, the reverb and backing vocals are usually spread across the stereo field, so what I do every time to get rid of them is extract only the center part of the stereo signal. The signal doesn't need to be clean; we at least want to see the phonemes, and Praat should be able to extract the pitch. What we need is a vocal with natural human timing and phoneme lengths and natural pitch transitions.
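For anyone who wants to try the center extraction in a multiplatform way, here is a minimal NumPy sketch of the mono/side spectral subtraction idea. It is not my Windows program, just an illustration under some assumptions: the function names (`stft`, `istft`, `extract_center`), the 0.5 scaling of mono and side, the Hann window, and the FFT/hop sizes are all made up for this example. Also note that it reuses the phase of the mono channel for resynthesis, which is a common choice but only one way to handle the phase.

```python
import numpy as np

def stft(x, n_fft=2048, hop=512):
    """Short-time Fourier transform with a Hann window (frames x bins)."""
    window = np.hanning(n_fft)
    # Pad both ends so the edges of the signal are covered by full frames.
    xp = np.pad(np.asarray(x, dtype=float), (n_fft, n_fft))
    n_frames = 1 + (len(xp) - n_fft) // hop
    frames = np.stack([xp[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, length, n_fft=2048, hop=512):
    """Inverse STFT by windowed overlap-add, trimmed to `length` samples."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    # Drop the padding that stft() added at the start.
    return out[n_fft:n_fft + length]

def extract_center(left, right, n_fft=2048, hop=512):
    """Estimate the center-panned part of a stereo signal.

    Assumption: the mono channel's phase is reused for resynthesis.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    mono = 0.5 * (left + right)   # center content plus half of the sides
    side = 0.5 * (left - right)   # center content cancels out here
    M = stft(mono, n_fft, hop)
    S = stft(side, n_fft, hop)
    # Subtract magnitudes only and clamp negative results to zero.
    mag = np.maximum(np.abs(M) - np.abs(S), 0.0)
    center = mag * np.exp(1j * np.angle(M))
    return istft(center, len(mono), n_fft, hop)
```

You would load the two channels of a stereo WAV as float arrays with whatever reader you prefer, call `extract_center(left, right)`, and write the result out as a mono file. Expect artifacts; plain magnitude subtraction is crude, but the result may be enough to see the phonemes and track the pitch.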

It doesn't matter if it's out of tune; we can tune it later with parameters.

Therefore, I first search for a raw solo performance, preferably without effects, but definitely stereo.

Amateur covers are ideal; these are not mixed well into the music. Getting a useful vocal from this is sometimes dirty work. You can also use programs like Spectral Layers and RipX.

BUT with these tools you will get ALL the vocals from the song, along with the associated reverb. Installing it is not very easy; I recommend using an Anaconda environment (it is written in Python). It has a couple of different AI models and is very good.

Although it is focused on making instrumentals, it also separates vocals. I can confirm that Vocal Remover is not Spleeter; I use its offline open-source version from GitHub.

And it doesn't do instrument stems, only vocals. SW: I don't use Spleeter, instead I used - they came later and compared themselves with Spleeter, so I suppose it is a different algorithm. I think it is directly related and it is the worst part of all of it.

A vocal stem separation is now relatively easy; we have many neural-net-based programs and web services.
