

Instructions can be found below under the Inference GUI 2 header.

This fork has some modifications to make it work better on Windows and with smaller multi-speaker datasets. Some modifications have also been made to pitch inference for better performance. Additionally, the vocoder has been changed to NSF HiFiGAN to fix the issue with unwanted staccato.

SoftVC VITS Singing Voice Conversion

Model Overview

A singing voice conversion (SVC) model that uses the SoftVC encoder to extract features from the input audio. These features are sent into VITS along with the F0 to replace the original input and achieve a voice conversion effect.
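The data flow described above can be sketched roughly as follows. This is only an illustration of how frame-aligned content features and F0 might be bundled before being handed to the synthesis model, not the code used in this repository. Every name here (`ConversionInputs`, `extract_content_features`, `extract_f0`, `shift_f0`, `prepare_inputs`), the hop size, and the feature dimension are hypothetical, and the two extractors are placeholders that return dummy values. The semitone transpose step is also an assumption, included to show why keeping F0 as a separate input is convenient.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConversionInputs:
    """Hypothetical container for what the synthesis model would receive."""
    content_features: List[List[float]]  # one feature vector per frame
    f0: List[float]                      # one pitch value in Hz per frame, 0.0 = unvoiced
    speaker_id: int                      # target speaker index


def extract_content_features(audio: List[float], hop: int, dim: int = 4) -> List[List[float]]:
    # Placeholder standing in for the SoftVC encoder: returns zero vectors,
    # one per frame, so that only the shapes are meaningful.
    n_frames = len(audio) // hop
    return [[0.0] * dim for _ in range(n_frames)]


def extract_f0(audio: List[float], hop: int) -> List[float]:
    # Placeholder standing in for a pitch tracker: returns a constant 220 Hz
    # for every frame instead of analysing the audio.
    n_frames = len(audio) // hop
    return [220.0] * n_frames


def shift_f0(f0: List[float], semitones: float) -> List[float]:
    # Transpose voiced frames by a number of semitones; unvoiced frames stay 0.
    factor = 2.0 ** (semitones / 12.0)
    return [f * factor if f > 0 else 0.0 for f in f0]


def prepare_inputs(audio: List[float], speaker_id: int,
                   hop: int = 320, transpose: float = 0.0) -> ConversionInputs:
    # Both streams use the same hop size so that they line up frame by frame.
    features = extract_content_features(audio, hop)
    f0 = shift_f0(extract_f0(audio, hop), transpose)
    if len(features) != len(f0):
        raise ValueError("content features and F0 must have the same number of frames")
    return ConversionInputs(features, f0, speaker_id)
```

For example, `prepare_inputs([0.0] * 3200, speaker_id=0, transpose=12)` yields ten frames of features and ten F0 values, each shifted up by one octave. Because the pitch contour travels alongside the content features rather than inside them, it can be adjusted independently before synthesis.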
