A simple Python script that sets up a local WhisperX server for processing audio files. It contains the FastAPI-powered server code "main.py" and a demo test script "main_client_test.py" that demonstrates usage. The idea is to avoid setting up WhisperX again for every new project: it is much easier to set it up once and then run WhisperX as an external service when needed. The code supports direct upload of audio files as well as the use of Azure Blob Storage.
First install all requirements (including CUDA). WhisperX has a known issue with missing CUDA-related DLLs. If you encounter this, download the DLL files manually. See https://stackoverflow.com/questions/78320397/runtimeerror-library-cublas64-12-dll-is-not-found-or-cannot-be-loaded-while-us where the solution is: "go to https://github.com/Purfview/whisper-standalone-win/releases/tag/libs download cuBLAS.and.cuDNN_CUDA12_win_v2.7z and add it to your cuda bin".
Then create a .env file with the following keys:
PASSWORD = # you can choose this yourself
HF_TOKEN =
AZURE_API_KEY =
AZURE_ENDPOINT =
AZURE_STORAGE_BLOB_URL =
AZURE_STORAGE_ACCOUNT =
AZURE_STORAGE_KEY =
Only the first key, "PASSWORD", is required; the others can be left empty if the corresponding features are not used. Then start the main.py service. After that, other programs can send audio files to it through the REST API and get transcripts back.
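Before starting the service, it can help to check that the .env file has the keys listed above. The following is only a minimal sketch using the Python standard library, not the way main.py itself reads its configuration. The helper names parse_env_text and check_env are hypothetical, and the parser handles just the simple "KEY = value" layout shown above (it is not a full .env parser).

```python
import os

# Key names taken from the list above; PASSWORD is the only required one.
REQUIRED_KEYS = ["PASSWORD"]
OPTIONAL_KEYS = [
    "HF_TOKEN",
    "AZURE_API_KEY",
    "AZURE_ENDPOINT",
    "AZURE_STORAGE_BLOB_URL",
    "AZURE_STORAGE_ACCOUNT",
    "AZURE_STORAGE_KEY",
]


def parse_env_text(text):
    """Parse simple 'KEY = value' lines (hypothetical helper).

    Blank lines and lines starting with '#' are skipped, and anything
    after ' #' on a line is treated as a trailing comment.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values ending in '==' survive.
        key, _, value = line.partition("=")
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def check_env(values):
    """Return (missing_required, missing_optional) lists of key names."""
    missing_required = [k for k in REQUIRED_KEYS if not values.get(k)]
    missing_optional = [k for k in OPTIONAL_KEYS if not values.get(k)]
    return missing_required, missing_optional


if __name__ == "__main__":
    if not os.path.exists(".env"):
        print("No .env file found in the current directory.")
    else:
        with open(".env", encoding="utf-8") as f:
            required, optional = check_env(parse_env_text(f.read()))
        if required:
            print("Missing required keys:", ", ".join(required))
        if optional:
            print("Optional keys left empty:", ", ".join(optional))
        if not required:
            print("Required configuration looks complete.")
```

Run it from the folder that contains the .env file. An empty optional key is only a problem if you plan to use the feature it belongs to.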
-JanneK