OpenVoice is a versatile instant voice cloning approach that transfers voice tone color and generates speech in various languages from just a brief audio snippet of the source speaker. OpenVoice has three main features: (i) high-quality tone color replication across multiple languages and accents; (ii) fine-grained control over voice styles, including emotion and accent, as well as other parameters such as rhythm, pauses, and intonation; (iii) zero-shot cross-lingual voice cloning, which eliminates the need for the generated speech and the reference speech to be part of a massive-speaker multilingual training dataset.
More details about the model can be found on the project web page, in the paper, and in the official repository.
This notebook demonstrates voice tone cloning with OpenVoice: we will convert the model to OpenVINO format and run it with OpenVINO.
The tutorial consists of the following steps:
- Install prerequisites
- Load PyTorch model
- Convert the model to OpenVINO Intermediate Representation (IR) format
- Run OpenVINO model inference on a single example
- Launch interactive demo
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the Installation Guide.