This commit adds support for mlx-whisper as a backend option, optimized for Apple Silicon (M1/M2/M3) Macs. MLX leverages Apple's Neural Engine and GPU for hardware-accelerated inference.

Changes:
- Add MLX_WHISPER to BackendType enum with is_mlx_whisper() method
- Create WhisperMLX transcriber wrapper (transcriber_mlx.py)
  * Maps standard model sizes to MLX community models
  * Implements transcribe() with language detection support
  * Returns segments compatible with WhisperLive base interface
- Create ServeClientMLXWhisper backend class (mlx_whisper_backend.py)
  * Extends ServeClientBase with MLX-specific implementation
  * Supports single model mode for memory efficiency
  * Thread-safe model access with locking
  * Graceful fallback to faster_whisper if MLX unavailable
- Update server.py initialize_client() to instantiate MLX backend
- Update run_server.py CLI to include mlx_whisper in backend options
- Add mlx-whisper dependency to requirements/server.txt

The backend follows the same pluggable architecture as other backends (faster_whisper, tensorrt, openvino) and implements the required transcribe_audio() and handle_transcription_output() methods.

Usage:
python run_server.py --backend mlx_whisper --model small.en
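A minimal sketch of the transcriber wrapper described above, assuming mlx_whisper's transcribe() API. The repo names in the size-to-model map and the wrapper internals are illustrative guesses, not the actual WhisperLive code:

import mlx_whisper

# Hypothetical mapping from standard Whisper sizes to MLX community repos;
# the exact repo names on the Hugging Face Hub may differ.
MLX_MODEL_MAP = {
    "tiny": "mlx-community/whisper-tiny-mlx",
    "small": "mlx-community/whisper-small-mlx",
    "small.en": "mlx-community/whisper-small.en-mlx",
    "large-v3": "mlx-community/whisper-large-v3-mlx",
}

class WhisperMLX:
    def __init__(self, model_size="small.en"):
        # Accept either a standard size or a full HF repo path.
        self.model_path = MLX_MODEL_MAP.get(model_size, model_size)

    def transcribe(self, audio, language=None):
        # mlx_whisper.transcribe() returns a dict with "text", "segments",
        # and "language"; passing language=None enables auto-detection.
        result = mlx_whisper.transcribe(
            audio,
            path_or_hf_repo=self.model_path,
            language=language,
        )
        return result["segments"]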
This commit adds:
1. MLX Model Path CLI Argument (see the sketch after this list):
- Add a --mlx_model_path/-mlx argument to run_server.py
- The server can now specify the MLX model, overriding the client's choice
- Supports both model sizes (e.g. small.en) and HF repos (e.g. mlx-community/whisper-large-v3-turbo)
- Integrated into single_model mode support
2. Server-side MLX Model Configuration (see the sketch after this list):
- Update server.run() to accept an mlx_model_path parameter
- Update recv_audio(), handle_new_connection(), and initialize_client()
- The MLX backend now uses the server-specified model when provided
- Falls back to the client-specified model if not set
3. Microphone Test Script (test_mlx_microphone.py; see the sketch after this list):
- Real-time transcription test with microphone input
- Command-line args for host, port, model, language, and translate
- User-friendly interface with status messages
- Saves output to an SRT file
- Proper error handling and cleanup
4. GPU Verification Tool (verify_mlx_gpu.py; see the sketch after this list):
- Checks whether it is running on Apple Silicon (M1/M2/M3)
- Verifies the MLX and mlx-whisper installation
- Tests GPU/Neural Engine access with a sample computation
- Optional MLX Whisper model loading test
- Provides instructions for monitoring GPU usage:
  * Activity Monitor (GUI)
  * powermetrics (Terminal)
  * asitop (third-party)
- Comprehensive summary with actionable recommendations
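Sketch for item 1, the --mlx_model_path argument in run_server.py (the default and help text are assumptions):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--mlx_model_path", "-mlx",
    type=str,
    default=None,
    help="MLX model to load server-side: a size such as 'small.en' "
         "or an HF repo such as 'mlx-community/whisper-large-v3-turbo'.",
)
args = parser.parse_args()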
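Sketch for item 2, the precedence rule applied inside initialize_client(); the helper name is hypothetical, not the actual WhisperLive code:

def resolve_mlx_model(server_model_path, client_model):
    # Prefer the server-configured model; otherwise honor the client's choice.
    return server_model_path if server_model_path is not None else client_model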
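Sketch for item 3, the core of test_mlx_microphone.py, assuming WhisperLive's TranscriptionClient interface (parameter names may differ from the actual script):

from whisper_live.client import TranscriptionClient

client = TranscriptionClient(
    "localhost", 9090,
    lang="en",
    translate=False,
    model="small.en",
    srt_file_path="output.srt",  # transcript is written out as SRT
)
client()  # with no audio file argument, streams from the microphone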
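Sketch for item 4, the two central checks in verify_mlx_gpu.py (the real script adds installation checks, the optional model load, and the monitoring instructions):

import platform

import mlx.core as mx

# Apple Silicon Macs report the "arm64" architecture on Darwin.
on_apple_silicon = platform.system() == "Darwin" and platform.machine() == "arm64"
print(f"Apple Silicon: {on_apple_silicon}")

# Run a small matrix multiply; mx.eval() forces execution on the
# default device, which is the GPU on Apple Silicon.
a = mx.random.normal((1024, 1024))
mx.eval(a @ a)
print(f"MLX default device: {mx.default_device()}")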
Usage Examples:
# Server with specific MLX model
python run_server.py --backend mlx_whisper --mlx_model_path small.en
# Verify GPU functionality
python verify_mlx_gpu.py
# Test with microphone
python test_mlx_microphone.py --model small.en --lang en
Contributor:
This is huge, I will test the backend this week.
Collaborator:
@hanrok can we also add tests for the mlx backend? Maybe in another PR, we don't have to block this one.