CoreML Audio Resource Leak #393

@benjaminfrombe

Description

WhisperKit CoreML Audio Resource Leak - Complete Analysis

Summary

WhisperKit's CoreML backend causes persistent high CPU usage (~10-12%) in macOS's coreaudiod daemon that continues after transcription completes. This appears to be caused by CoreML audio-processing resources that are never fully released.

Environment

  • macOS: 26.2 (Tahoe)
  • Hardware: MacBook Air (Apple Silicon M4)
  • App: Hex (speech-to-text app using WhisperKit)
  • WhisperKit: Latest version from main branch
  • Date Discovered: January 8, 2026

Reproduction Steps

  1. Launch any app using WhisperKit for transcription
  2. Perform one audio transcription
  3. Wait for transcription to complete
  4. Observe Activity Monitor
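Step 4 can be automated from Swift for repeated measurements. A minimal sketch of a hypothetical `cpuPercent` helper (assumes `/usr/bin/env` and a `ps` that accepts the portable `-axo pcpu,comm` columns; returns nil if the process is not found):

```swift
import Foundation

// Sample a named process's CPU% by shelling out to `ps`.
// Matches by suffix because `comm` may be a full path (e.g. /usr/sbin/coreaudiod).
func cpuPercent(ofProcessNamed name: String) -> Double? {
    let ps = Process()
    ps.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    ps.arguments = ["ps", "-axo", "pcpu,comm"]
    let pipe = Pipe()
    ps.standardOutput = pipe
    do { try ps.run() } catch { return nil }
    let data = pipe.fileHandleForReading.readDataToEndOfFile()
    ps.waitUntilExit()
    guard let text = String(data: data, encoding: .utf8) else { return nil }
    for line in text.split(separator: "\n").dropFirst() {   // skip header row
        let cols = line.split(separator: " ")
        guard cols.count >= 2, let cpu = Double(cols[0]),
              cols[1...].joined(separator: " ").hasSuffix(name) else { continue }
        return cpu
    }
    return nil
}
```

Calling `cpuPercent(ofProcessNamed: "coreaudiod")` before and after a transcription reproduces the before/after numbers reported below.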

Expected Behavior

After transcription completes and WhisperKit instance is released:

  • coreaudiod CPU: ~0.5-1% (baseline idle)

Actual Behavior

After transcription completes:

  • coreaudiod CPU: ~10-12% (persists indefinitely)
  • Only returns to normal when app is quit
  • Killing/restarting coreaudiod does NOT fix it
  • The app itself can be idle - problem persists

Technical Analysis

Using system diagnostics (lsof, sample, filesystem comparison), I traced the exact moment the issue occurs:

Files Loaded During First Transcription

/Users/.../com.apple.e5rt.e5bundlecache/.../H16G.bundle/main/main_bnns/bnns_program.bnnsir

These are BNNS (Basic Neural Network Subroutines) files - WhisperKit's CoreML model caches.

CPU State Change

BEFORE first transcription:

_coreaudiod  0.0% CPU  (idle)

AFTER transcription completes:

_coreaudiod  11.7% CPU  (stays forever until app quits)

What We Tried (All Failed)

Attempted fixes in Hex app code:

  1. ❌ Unload WhisperKit after transcription (whisperKit = nil)
  2. ❌ Disable all audio recording warmup/priming
  3. ❌ Destroy AVAudioRecorder immediately after use
  4. ❌ Disable audio level metering
  5. ❌ Prevent AVAudioEngine from staying active
  6. ❌ Explicit cleanup of all audio resources
  7. ❌ Forced delays to let ARC/autorelease cleanup run

None of these helped; the problem persists even after we destroy every audio object in our Swift code.
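Fixes 1, 6, and 7 boiled down to a pattern like this minimal sketch, where `Session` is a hypothetical stand-in for the object holding the WhisperKit instance and the AVAudioRecorder; wrapping the teardown in `autoreleasepool` drains any Objective-C temporaries immediately instead of at the next run-loop pass:

```swift
import Foundation

// Sketch of attempted fixes 1, 6, and 7: drop every retained
// audio/ML reference inside an autoreleasepool so ObjC-level
// temporaries die now, not at the end of the run-loop tick.
final class Session {
    var retained: [AnyObject] = []   // stand-ins for WhisperKit, recorder, etc.

    func teardown() {
        autoreleasepool {
            retained.removeAll()     // release everything we hold
        }
    }
}
```

Even with this deterministic release point, coreaudiod stayed pinned, which is what points the finger at CoreML-owned state rather than anything the app retains.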

Proof It's WhisperKit/CoreML

  1. App has ZERO audio activity in logs after transcription
  2. No audio files remain open
  3. No recorder instances active
  4. lsof shows only CoreML .bnnsir model files
  5. Problem only occurs after first transcription (when CoreML models load)
  6. Quitting app immediately fixes it (releases CoreML resources)

Related Issues

This appears similar to known CoreML audio processing bugs:

  1. whisper.cpp #1202: "CoreML + calls to whisper_full result in increased memory usage"

  2. whisper.cpp #797: "Increasing memory usage over time with CoreML"

  3. WhisperKit #265: "Memory leak when using ModelComputeOptions .cpuAndGPU with Turbo model on M1" (memory leak when repeatedly destroying and re-instantiating WhisperKit)

Root Cause Hypothesis

WhisperKit uses CoreML's Audio Feature Print or similar audio processing APIs. When the model is loaded for the first time:

  1. CoreML initializes an audio processing graph in coreaudiod
  2. This graph processes audio for feature extraction during transcription
  3. Even after transcription completes and WhisperKit is deallocated, CoreML doesn't release the audio processing graph
  4. The graph stays active, polling/processing at 10Hz or similar
  5. This causes persistent high coreaudiod CPU usage

Potential Solutions

Option 1: WhisperKit Framework Fix (Ideal)

WhisperKit should add explicit cleanup:

deinit {
    // Release CoreML audio processing resources
    // Invalidate MLModel instances
    // Clear audio feature extraction caches
}
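For comparison, here is a hedged sketch of what a caller-driven `cleanup()` API could look like. Every name below is invented for illustration, not actual WhisperKit code; the point is giving callers a deterministic release point instead of relying on `deinit` timing:

```swift
import Foundation

// Hypothetical shape for an explicit cleanup API (names invented here;
// not actual WhisperKit code).
final class TranscriptionEngine {
    private var models: [AnyObject] = []   // stand-ins for MLModel instances
    var isClean: Bool { models.isEmpty }

    func load() { models.append(NSObject()) }

    func cleanup() {
        // Drop every retained model before deallocation; in the real
        // framework this is where CoreML audio/feature-extraction
        // resources would be invalidated as well.
        models.removeAll()
    }
}
```

An explicit method also lets apps call cleanup between transcriptions while keeping the WhisperKit object alive, something `deinit` alone cannot offer.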

Option 2: Apple CoreML Fix (Long-term)

File a Feedback with Apple about CoreML audio processing resources not being released after model deallocation.

Option 3: App-level Workaround (Pragmatic)

For apps using WhisperKit:

A) Quit and restart after transcription:

// Relaunch via `open -n`, then quit (AppKit has no public relaunch API)
DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {
    let task = Process()
    task.executableURL = URL(fileURLWithPath: "/usr/bin/open")
    task.arguments = ["-n", Bundle.main.bundlePath]
    try? task.run()
    NSApplication.shared.terminate(nil)
}

B) Warn users about CPU usage:

"Note: First transcription may cause increased background CPU usage (~10%).  
Restarting the app after use is recommended for battery life."

C) Background task to reset coreaudiod:

// Periodically check and reset if needed (requires privileges, and
// killing coreaudiod briefly interrupts all system audio).
// Caveat: the analysis above found restarting coreaudiod did not help.
if coreaudiodCPU > 8.0 {
    let task = Process()
    task.executableURL = URL(fileURLWithPath: "/usr/bin/killall")
    task.arguments = ["coreaudiod"]
    try? task.run()
}

Impact

  • Battery life: Significantly reduced on MacBooks (~10-15% battery drain)
  • Heat: Fan activity increases
  • Performance: One CPU core effectively locked at 100%
  • User experience: Poor for battery-powered devices

Request

  1. To WhisperKit maintainers: Can you add explicit CoreML resource cleanup in deinit or provide a cleanup() method?

  2. To Apple: Is this a known CoreML limitation? Should we file a FB?

  3. Temporary fix: Add documentation warning users about this behavior until resolved

Testing

To reproduce in ANY WhisperKit-using app:

// Before first transcription
// Check: ps aux | grep coreaudiod  → ~0% CPU

var whisperKit: WhisperKit? = try await WhisperKit()
let result = try await whisperKit?.transcribe(audioPath: "test.wav")

// After transcription
whisperKit = nil  // Explicitly release the instance
// Check: ps aux | grep coreaudiod  → ~10% CPU (stays!)

Found by: Benjamin Jacobs
Date: 2026-01-08
Diagnostic files: Available upon request (before/after lsof dumps, sample traces)
