A modern PyQt6 desktop GUI for the Hugging Face model nvidia/magpie_tts_multilingual_357m.
- modern dark/light desktop UI
- separate settings window for app language, TTS language, speaker, device and storage paths
- local cache folder for Hugging Face downloads
- first-run or manual model pre-download
- WAV export with timestamp-based filenames
- last-output playback inside the app
- virtual-environment friendly Windows setup scripts
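The local cache folder from the feature list could be set up roughly like this. It is a minimal sketch, not the app's actual code: it assumes the cache is redirected through the `HF_HOME` environment variable that the Hugging Face libraries read, and the helper name `configure_hf_cache` and the folder `app_data/hf_cache` are hypothetical.

```python
import os
from pathlib import Path


def configure_hf_cache(cache_dir):
    """Create cache_dir and point Hugging Face libraries at it via HF_HOME.

    Call this before importing huggingface_hub (or anything that imports it),
    because the cache location is resolved when the library is imported.
    """
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(path)
    return path


# Example call (hypothetical folder name):
# configure_hf_cache("app_data/hf_cache")
```

Keeping downloads in a project-local folder makes it easy to see how much disk space the model uses and to delete it again.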
Direct English output (example):
magpie_20260417_001837_en_Sofia.mp4
German voice output (example, randomly generated sci-fi story):
magpie_20260423_210746_de_Aria.mp4
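The example names above suggest the pattern `magpie_<date>_<time>_<language>_<speaker>`. A minimal sketch of building such a name for a WAV export follows; the pattern is inferred from the examples, and the function `build_output_name` is hypothetical rather than taken from the app.

```python
from datetime import datetime


def build_output_name(lang, speaker, when=None, ext="wav"):
    """Return a name like magpie_20260417_001837_en_Sofia.wav."""
    when = when or datetime.now()
    stamp = when.strftime("%Y%m%d_%H%M%S")  # date, underscore, 24-hour time
    return f"magpie_{stamp}_{lang}_{speaker}.{ext}"


# Reproduces the stem of the first example above.
name = build_output_name("en", "Sofia", datetime(2026, 4, 17, 0, 18, 37))
```

Because the timestamp has one-second resolution, names sort chronologically and do not collide unless two exports finish within the same second.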
- app.py: GUI entry point
- src/main_window.py: main application window
- src/options_dialog.py: settings dialog
- src/tts_backend.py: Magpie/NeMo loading and synthesis backend
- tools/preload_models.py: optional model prefetch tool
- install_windows.bat: creates venv, installs dependencies, optionally pre-downloads model
- run_windows.bat: launches the app inside the venv
- Extract the ZIP.
- Run install_windows.bat.
- The installer tries Python 3.12, 3.11 and 3.10 in that order, then falls back to any available py/python interpreter. If that interpreter is older than 3.10, the installer warns but still continues.
- After setup, run run_windows.bat.
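The interpreter fallback described above can be sketched in Python. The real logic lives in install_windows.bat; this is only an illustration of the selection order. The function `choose_python` is hypothetical, and picking the newest remaining version as the "any available" fallback is an assumption.

```python
PREFERRED = [(3, 12), (3, 11), (3, 10)]  # tried in this order
MINIMUM = (3, 10)


def choose_python(available):
    """Pick a (major, minor) version from those found on the machine.

    Returns (version, warning); warning is None when nothing needs reporting.
    """
    for wanted in PREFERRED:
        if wanted in available:
            return wanted, None
    if not available:
        return None, "no Python interpreter found"
    # Assumption: fall back to the newest interpreter that was found.
    fallback = max(available)
    warning = None
    if fallback < MINIMUM:
        warning = (
            f"Python {fallback[0]}.{fallback[1]} is older than 3.10; "
            "continuing anyway"
        )
    return fallback, warning
```

For example, a machine with 3.9 and 3.11 installed would get 3.11, while a machine with only 3.8 would get 3.8 plus a warning.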
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt
python -m pip install torch torchvision torchaudio
python -m pip install "nemo_toolkit[asr,tts] @ git+https://github.com/NVIDIA/NeMo.git"
python tools/preload_models.py
python app.py
- This model is large, so keep enough free disk space for the .nemo checkpoint plus cache files.
- If the GitHub NeMo install fails, the Windows installer tries a published-package fallback.
- The "Apply text normalization" option is intentionally optional because text-normalization availability can vary by environment.
- The app stores settings in app_data/settings.json.
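Reading and writing a settings file such as app_data/settings.json might look like the sketch below. The key names and default values are hypothetical (the actual schema is not documented here), as are the helpers `load_settings` and `save_settings`.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real key names may differ.
DEFAULTS = {
    "app_language": "en",
    "tts_language": "en",
    "speaker": "Sofia",
    "device": "auto",
}


def load_settings(path="app_data/settings.json"):
    """Return DEFAULTS overlaid with whatever the JSON file contains."""
    settings = dict(DEFAULTS)
    file = Path(path)
    if file.is_file():
        try:
            loaded = json.loads(file.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            loaded = {}  # unreadable file: fall back to defaults
        if isinstance(loaded, dict):
            settings.update(loaded)
    return settings


def save_settings(settings, path="app_data/settings.json"):
    """Write settings as pretty-printed UTF-8 JSON, creating the folder."""
    file = Path(path)
    file.parent.mkdir(parents=True, exist_ok=True)
    file.write_text(
        json.dumps(settings, indent=2, ensure_ascii=False), encoding="utf-8"
    )
```

Overlaying the file on top of defaults means a missing or partial file still yields a complete settings dictionary.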
- English (en)
- German (de)
- Spanish (es)
- French (fr)
- Italian (it)
- Vietnamese (vi)
- Chinese (zh)
- Hindi (hi)
- Japanese (ja)
- Sofia
- Aria
- Jason
- Leo
- John
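The language codes and speaker names listed above can be kept in a small lookup so that a selection is checked before synthesis. This is a sketch under assumptions: the names `LANGUAGES`, `SPEAKERS` and `validate_selection` are hypothetical, and the app may store these values differently.

```python
# Language codes and speakers as listed in this README.
LANGUAGES = {
    "en": "English",
    "de": "German",
    "es": "Spanish",
    "fr": "French",
    "it": "Italian",
    "vi": "Vietnamese",
    "zh": "Chinese",
    "hi": "Hindi",
    "ja": "Japanese",
}
SPEAKERS = ("Sofia", "Aria", "Jason", "Leo", "John")


def validate_selection(lang, speaker):
    """Return (lang, speaker) unchanged, or raise ValueError if unsupported."""
    if lang not in LANGUAGES:
        raise ValueError(f"unsupported language code: {lang!r}")
    if speaker not in SPEAKERS:
        raise ValueError(f"unknown speaker: {speaker!r}")
    return lang, speaker
```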
This project uses a Windows-specific NeMo install path. nemo_text_processing / pynini is intentionally not installed by default because pip-based Windows installs are not officially supported for that dependency chain. The app can still synthesize speech normally, and text normalization remains optional.
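One way to keep text normalization optional is to check whether the package can be imported and pass the text through unchanged otherwise. The sketch below uses only the standard library; `normalization_available` and `prepare_text` are hypothetical helpers, and the `normalizer` argument stands in for whatever callable the backend would actually use.

```python
import importlib.util


def normalization_available(module_name="nemo_text_processing"):
    """Report whether the optional normalization package can be found."""
    return importlib.util.find_spec(module_name) is not None


def prepare_text(text, apply_normalization, normalizer=None):
    """Normalize only when requested and a normalizer callable is supplied."""
    if apply_normalization and normalizer is not None:
        return normalizer(text)
    return text  # synthesis still works on the raw text
```

With this shape, the "Apply text normalization" option can simply be disabled in the UI when `normalization_available()` returns False.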
github.com/zeittresor/Magpie_TTS_Studio