What was the reason you switched from speaker embeddings (Cotatron) to a speaker encoder (this)? Was it because it worked better, or was it to support any-to-any voice conversion? I'm curious because I'm currently trying to deploy my own architecture and can't really decide between the two.
dunky11 changed the title from "Reason to use Speaker Encoder over Speaker Embeddings?" to "Reason to use speaker encoder over speaker embeddings?" on Jul 29, 2021
Hi, we used a speaker encoder instead of a speaker embedding because a speaker embedding, being a single vector per speaker, can't capture the variation in speech within the same speaker. This variation may include the recording environment, the speaker's prosody, etc.
However, we have not run extensive ablation studies comparing the speaker encoder with the speaker embedding, so we're not sure exactly how much performance we gain from using a speaker encoder.
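To make the distinction concrete, here is a minimal toy sketch in plain Python. It is not the code from this repository or from Cotatron: the names `speaker_table`, `embed_by_id`, `encode_utterance`, and `proj`, the sizes `EMB_DIM` and `N_MELS`, and the mean-pool-plus-projection "encoder" are all assumptions made up for illustration. It only shows that a lookup table returns one fixed vector per speaker ID, while an encoder computes the vector from the utterance itself.

```python
import random

random.seed(0)
EMB_DIM = 4  # hypothetical embedding size
N_MELS = 3   # hypothetical number of mel bins per frame

# (a) Speaker embedding: one fixed vector per speaker ID (a lookup table).
speaker_table = {
    spk: [random.gauss(0, 1) for _ in range(EMB_DIM)]
    for spk in ("alice", "bob")
}

def embed_by_id(speaker_id):
    """Return the same vector for every utterance of a given speaker."""
    return speaker_table[speaker_id]

# (b) Speaker encoder: the vector is computed from the utterance itself.
# Toy stand-in for a real network: mean-pool the frames over time,
# then apply a fixed random linear projection.
proj = [[random.gauss(0, 1) for _ in range(N_MELS)] for _ in range(EMB_DIM)]

def encode_utterance(mel_frames):
    """Map a list of mel frames (each of length N_MELS) to an EMB_DIM vector."""
    n = len(mel_frames)
    pooled = [sum(frame[i] for frame in mel_frames) / n for i in range(N_MELS)]
    return [sum(w * x for w, x in zip(row, pooled)) for row in proj]

# Two made-up utterances from the same speaker under different conditions.
utt_quiet = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]]
utt_noisy = [[0.9, 0.8, 0.7], [1.0, 0.7, 0.6]]

# The lookup ignores the audio, so both utterances get the same vector.
same_lookup = embed_by_id("alice") == embed_by_id("alice")
# The encoder sees the audio, so the two vectors differ.
same_encoded = encode_utterance(utt_quiet) == encode_utterance(utt_noisy)
```

In this sketch the lookup table also has no entry for a speaker it was not built with (`embed_by_id("carol")` raises `KeyError`), whereas `encode_utterance` accepts frames from any speaker. Whether that matters for any-to-any conversion in a real model depends on how well the encoder generalizes, which this toy example does not show.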