Friday, March 21, 2025

Running Kokoro-82M Text-to-Speech on Fedora with an Nvidia GPU and Podman

I got interested in Kokoro, a new text-to-speech model that uses only 82 million parameters but is one of the top models on the TTS leaderboard.  I wanted to run it locally, and a quick way to try it is Kokoro-FastAPI, which comes in a Docker container.  The README on Kokoro-FastAPI's GitHub has instructions for Docker (with or without GPU), but I'm using Podman, so I needed to do some setup on Fedora to enable Podman to access the GPU.

The instructions on the Podman and Nvidia sites have you set up an Nvidia repository to get the container toolkit that enables Podman to access the GPU, but in my previous post on installing Nvidia and CUDA drivers on Fedora I mentioned that there can be dependency conflicts.  I wasn't sure if the Nvidia container toolkit might also cause problems, but fortunately you can just install the package from Fedora's repo and avoid possible headaches:

sudo dnf install golang-github-nvidia-container-toolkit
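
Podman resolves a device name like `nvidia.com/gpu=all` through a CDI (Container Device Interface) spec file, so if the run command further down complains that the device cannot be resolved, the spec may simply not have been generated yet.  Here is a minimal sketch of generating it, assuming the Fedora package provides the `nvidia-ctk` tool as the upstream toolkit does; the `CDI_SPEC` variable name is only for illustration, and `/etc/cdi/nvidia.yaml` is the location used in Nvidia's documentation:

```shell
# Where the CDI spec will be written; Podman looks in /etc/cdi by default.
CDI_SPEC="/etc/cdi/nvidia.yaml"

if command -v nvidia-ctk >/dev/null 2>&1; then
    # Write a spec describing the GPUs the installed driver can see.
    sudo nvidia-ctk cdi generate --output="$CDI_SPEC"
    # List the device names that can now be passed to --device.
    nvidia-ctk cdi list
else
    echo "nvidia-ctk not found; install the container toolkit package first"
fi
```

If the list includes `nvidia.com/gpu=all`, Podman should be able to resolve the device.  The spec may need to be regenerated after a driver update.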

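Assuming that you have already installed Podman (if not, follow the Fedora doc on installing Docker and/or Podman) and you have an Nvidia GPU, you can download and run the image.  The command is written as `docker run`; with Podman alone, `podman run` takes the same arguments: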

docker run --device nvidia.com/gpu=all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2

If SELinux blocks Podman from accessing the GPU, you can follow Podman's instructions for giving containers permission to access devices:

sudo setsebool -P container_use_devices true

Then run the `docker run ...` command above and it should start.
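
Once the server is listening on port 8880, a quick request can confirm that speech actually comes out.  This is only a sketch, assuming the OpenAI-compatible `/v1/audio/speech` endpoint described in the Kokoro-FastAPI README; the voice name `af_bella`, the variable names, and the output file name are assumptions to adjust for your version:

```shell
# Assumed endpoint and request fields (OpenAI-style speech API); verify against the README.
KOKORO_URL="http://localhost:8880/v1/audio/speech"
PAYLOAD='{"model": "kokoro", "voice": "af_bella", "input": "Hello from Fedora.", "response_format": "mp3"}'

# -s hides the progress meter; -f makes curl fail on HTTP errors
# instead of saving the error body as if it were audio.
if curl -sf "$KOKORO_URL" \
        -H "Content-Type: application/json" \
        -d "$PAYLOAD" \
        -o hello.mp3; then
    echo "Wrote hello.mp3"
else
    echo "Request failed; the container may still be starting"
fi
```

If the request succeeds, `hello.mp3` can be played with any audio player.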

This is a quick way to give Kokoro a try.  Next I'll probably try running it from Go, using onnxruntime to load the Kokoro-Onnx model.
