bytedance/MegaTTS3
TL;DR: MegaTTS3 by ByteDance is a lightweight, efficient text-to-speech (TTS) model with only 0.45B parameters. It supports high-quality voice cloning, bilingual (Chinese and English) speech synthesis, and accent intensity control. Users can download pre-trained models, run inference with command-line tools, and use a web UI. The project is intended for academic use, applies stringent security measures, and is licensed under Apache-2.0.