Description
Data scientists train models using MLFlow but must manually package them as ModelKits for deployment. This breaks workflow continuity and creates opportunities for version mismatches between what was tracked in MLFlow and what gets deployed.
Proposed Solution
Add MLFlow as an import source with the URI syntax: kit import mlflow://[tracking_uri/]experiments/{exp_id}/runs/{run_id}
Architecture
Leverage the existing kit init pipeline (which already runs during import). The work breaks down into:
- MLFlow URI handler - parse URIs, extract tracking server + run identifiers (see the sketch after this list)
- Artifact downloader - pull run artifacts to a temp directory
- Metadata enrichment - inject MLFlow provenance into the generated Kitfile
- Error handling - deal with auth, large files, incomplete runs
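A minimal sketch of the URI handler in Go. All names here (MLFlowRef, ParseURI) are illustrative, not existing KitOps code, and defaulting a bare host to https:// is an assumption:

```go
// Hypothetical sketch of the mlflow:// URI handler.
package mlflow

import (
	"fmt"
	"strings"
)

type MLFlowRef struct {
	TrackingURI  string // empty means "use MLFLOW_TRACKING_URI"
	ExperimentID string // empty means "default experiment"
	RunID        string
}

// ParseURI handles the three documented forms:
//   mlflow://host/experiments/{exp}/runs/{run}
//   mlflow://experiments/{exp}/runs/{run}
//   mlflow://runs/{run}
func ParseURI(uri string) (*MLFlowRef, error) {
	rest, ok := strings.CutPrefix(uri, "mlflow://")
	if !ok {
		return nil, fmt.Errorf("not an mlflow URI: %s", uri)
	}
	parts := strings.Split(rest, "/")
	ref := &MLFlowRef{}
	// Anything before the first "experiments"/"runs" keyword is a
	// tracking host; https:// is an assumed default scheme.
	if parts[0] != "experiments" && parts[0] != "runs" {
		ref.TrackingURI = "https://" + parts[0]
		parts = parts[1:]
	}
	switch {
	case len(parts) == 4 && parts[0] == "experiments" && parts[2] == "runs":
		ref.ExperimentID, ref.RunID = parts[1], parts[3]
	case len(parts) == 2 && parts[0] == "runs":
		ref.RunID = parts[1]
	default:
		return nil, fmt.Errorf("unrecognized mlflow URI path: %s", rest)
	}
	return ref, nil
}
```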
Implementation Details
URI Format
```shell
# With explicit tracking URI
kit import mlflow://mlflow.company.com/experiments/42/runs/abc123 -t mymodel:v1

# Using MLFLOW_TRACKING_URI env var
export MLFLOW_TRACKING_URI=http://localhost:5000
kit import mlflow://experiments/42/runs/abc123 -t mymodel:v1

# Short form (uses default experiment)
kit import mlflow://runs/abc123 -t mymodel:v1
```
Data Flow
1. Parse mlflow:// URI → tracking_uri, experiment_id, run_id
2. MLFlow client: fetch run metadata + list artifacts
3. Download artifacts to temp dir (filtered by size/type)
4. Run kit init on temp dir → generates Kitfile
5. Augment Kitfile with MLFlow provenance metadata
6. Pack ModelKit using existing pipeline
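Steps 2 and 3 don't strictly require the MLFlow client library: the tracking server exposes REST endpoints (runs/get, artifacts/list) that are enough for a dependency-free Go implementation. A rough sketch, with hypothetical helper names and error handling trimmed:

```go
// Sketch of steps 2-3 against the MLFlow REST API.
package mlflow

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

type RunInfo struct {
	Status string `json:"status"` // RUNNING, FINISHED, FAILED, KILLED, SCHEDULED
}

type ArtifactFile struct {
	Path  string `json:"path"`
	IsDir bool   `json:"is_dir"`
	// MLFlow's proto-based JSON encodes int64 fields as strings.
	FileSize int64 `json:"file_size,string"`
}

// GetRun fetches run metadata via GET /api/2.0/mlflow/runs/get.
func GetRun(tracking, runID string) (*RunInfo, error) {
	resp, err := http.Get(fmt.Sprintf("%s/api/2.0/mlflow/runs/get?run_id=%s",
		tracking, url.QueryEscape(runID)))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var body struct {
		Run struct {
			Info RunInfo `json:"info"`
		} `json:"run"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	return &body.Run.Info, nil
}

// ListArtifacts lists run artifacts via GET /api/2.0/mlflow/artifacts/list.
func ListArtifacts(tracking, runID, path string) ([]ArtifactFile, error) {
	resp, err := http.Get(fmt.Sprintf("%s/api/2.0/mlflow/artifacts/list?run_id=%s&path=%s",
		tracking, url.QueryEscape(runID), url.QueryEscape(path)))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var body struct {
		Files []ArtifactFile `json:"files"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	return body.Files, nil
}
```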
Implementation Challenges
1. Authentication Hell
MLFlow supports multiple auth mechanisms with no standard:
- Basic auth (username/password)
- Token-based (custom headers)
- Cloud provider auth (AWS IAM, GCP service accounts) for artifact stores
- No auth (local/trusted network)
Approach:
- Support MLFLOW_TRACKING_URI and MLFLOW_TRACKING_TOKEN env vars
- Document that artifact store auth must be handled separately (AWS credentials, GCS keys, etc.)
- Fail fast with clear error messages when auth is missing
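A sketch of fail-fast credential resolution, assuming the standard MLFLOW_TRACKING_TOKEN / MLFLOW_TRACKING_USERNAME / MLFLOW_TRACKING_PASSWORD env vars used by the MLFlow client (bearer token takes precedence; authTransport and NewClient are hypothetical names):

```go
// Sketch of env-var credential resolution with fail-fast errors.
package mlflow

import (
	"errors"
	"net/http"
	"os"
)

// authTransport injects MLFlow credentials into every tracking-server request.
type authTransport struct {
	base  http.RoundTripper
	token string
	user  string
	pass  string
}

func (t *authTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if t.token != "" {
		req.Header.Set("Authorization", "Bearer "+t.token)
	} else if t.user != "" {
		req.SetBasicAuth(t.user, t.pass)
	}
	return t.base.RoundTrip(req)
}

// NewClient resolves the tracking URI and credentials, failing fast
// with a clear message when required configuration is missing.
func NewClient(trackingURI string) (*http.Client, string, error) {
	if trackingURI == "" {
		trackingURI = os.Getenv("MLFLOW_TRACKING_URI")
	}
	if trackingURI == "" {
		return nil, "", errors.New("no tracking server: pass one in the URI or set MLFLOW_TRACKING_URI")
	}
	return &http.Client{Transport: &authTransport{
		base:  http.DefaultTransport,
		token: os.Getenv("MLFLOW_TRACKING_TOKEN"),
		user:  os.Getenv("MLFLOW_TRACKING_USERNAME"),
		pass:  os.Getenv("MLFLOW_TRACKING_PASSWORD"),
	}}, trackingURI, nil
}
```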
2. Large Artifact Handling
A 50GB model checkpoint will time out or OOM with a naive download, so filter the artifact list and download only what actually needs to be packed.
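For example, the listing can be filtered before anything is downloaded. The size cap and exclude list below are illustrative defaults, not decided behavior (ArtifactFile comes from the earlier sketch):

```go
// Sketch of a size/type filter over the artifact listing.
package mlflow

import "strings"

// Hypothetical per-file cap; the real limit should be configurable.
const maxArtifactSize = 10 << 30 // 10 GiB

// File types that never belong in a ModelKit (illustrative list).
var excludedSuffixes = []string{".tmp", ".log", ".lock"}

// FilterArtifacts keeps only files worth downloading and packing.
// Directories are assumed to be expanded recursively elsewhere.
func FilterArtifacts(files []ArtifactFile) (keep []ArtifactFile) {
	for _, f := range files {
		if f.IsDir || f.FileSize > maxArtifactSize {
			continue
		}
		skip := false
		for _, suf := range excludedSuffixes {
			if strings.HasSuffix(f.Path, suf) {
				skip = true
				break
			}
		}
		if !skip {
			keep = append(keep, f)
		}
	}
	return keep
}
```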
3. Incomplete/Failed Runs
MLFlow runs can be RUNNING, FAILED, or KILLED, and their artifacts may be partial. Only import FINISHED runs.
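Continuing the earlier sketch, the guard is a single status check against the run metadata (FINISHED is one of MLFlow's documented run statuses):

```go
// Sketch of the status guard; GetRun is defined in the earlier sketch.
package mlflow

import "fmt"

// ensureFinished rejects any run whose status is not FINISHED.
func ensureFinished(tracking, runID string) error {
	info, err := GetRun(tracking, runID)
	if err != nil {
		return err
	}
	if info.Status != "FINISHED" {
		return fmt.Errorf("run %s has status %s; only FINISHED runs can be imported", runID, info.Status)
	}
	return nil
}
```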
4. Storage Backend Diversity
MLFlow artifact stores can be:
- Local filesystem (file:///)
- S3 (s3://)
- GCS (gs://)
- Azure (wasbs://)
- SFTP, NFS, etc.
We can either implement these backends in Kit or rely on the MLFlow client.
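If we rely on the MLFlow client, one low-effort option is shelling out to the mlflow CLI, which already speaks every backend above. A sketch assuming mlflow is installed and on PATH:

```go
// Sketch of the "rely on the MLFlow client" option: delegate artifact
// retrieval to `mlflow artifacts download`.
package mlflow

import (
	"fmt"
	"os"
	"os/exec"
)

// DownloadArtifacts downloads all artifacts for a run into dst.
func DownloadArtifacts(trackingURI, runID, dst string) error {
	cmd := exec.Command("mlflow", "artifacts", "download",
		"--run-id", runID, "--dst-path", dst)
	// The CLI reads the tracking server from the environment.
	cmd.Env = append(os.Environ(), "MLFLOW_TRACKING_URI="+trackingURI)
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("mlflow artifacts download failed: %v: %s", err, out)
	}
	return nil
}
```

The trade-off: this adds a runtime dependency on a Python install, but avoids reimplementing and maintaining auth for S3, GCS, Azure, SFTP, and every future backend inside Kit.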