@arnab2001

Description

This PR updates the kit init command to generate Kitfiles directly from remote repositories (specifically HuggingFace models and datasets) without cloning or importing them first. Users can generate a Kitfile for a remote resource, inspect and edit it, and then run the import.

Key changes:

  • Remote Detection: Updated kit init to detect if the input path is a remote HuggingFace repository (URL or org/repo format).
  • Repo Parsing: Implemented hf.ParseHuggingFaceRepo to correctly identify and parse references for both models and datasets.
  • API Integration: Updated hf.ListFiles to support listing files for both models and datasets using the correct API endpoints via a new RepositoryType enum.
  • Refactoring: Updated kit import to utilize the improved hf library components.
  • Testing: Added unit tests for remote detection logic, repository URL parsing, and package metadata extraction.
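
For readers following the thread, here is a rough sketch of the kind of parsing the Remote Detection and Repo Parsing items describe. It is not the code from this PR: parseRepo, RepoType, and the accepted input shapes (a huggingface.co URL, org/repo, or datasets/org/repo) are assumptions made for illustration, and the real hf.ParseHuggingFaceRepo may behave differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// RepoType is a hypothetical stand-in for the RepositoryType enum
// mentioned in the description.
type RepoType int

const (
	RepoTypeModel RepoType = iota
	RepoTypeDataset
)

// parseRepo is an illustrative parser, not hf.ParseHuggingFaceRepo.
// It accepts a huggingface.co URL, "org/repo", or "datasets/org/repo"
// and returns the "org/repo" name plus the repository type.
func parseRepo(ref string) (string, RepoType, error) {
	path := strings.TrimSpace(ref)
	if strings.HasPrefix(path, "http://") || strings.HasPrefix(path, "https://") {
		u, err := url.Parse(path)
		if err != nil {
			return "", RepoTypeModel, fmt.Errorf("invalid URL %q: %w", ref, err)
		}
		if u.Host != "huggingface.co" {
			return "", RepoTypeModel, fmt.Errorf("unsupported host %q", u.Host)
		}
		path = u.Path
	}

	parts := strings.Split(strings.Trim(path, "/"), "/")
	repoType := RepoTypeModel
	if parts[0] == "datasets" {
		// A leading "datasets" segment marks a dataset repository.
		repoType = RepoTypeDataset
		parts = parts[1:]
	}
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", RepoTypeModel, fmt.Errorf("expected org/repo, got %q", ref)
	}
	// Ignore trailing segments such as /tree/main.
	return parts[0] + "/" + parts[1], repoType, nil
}

func main() {
	for _, ref := range []string{
		"org/repo",
		"datasets/org/repo",
		"https://huggingface.co/datasets/org/repo/tree/main",
	} {
		repo, repoType, err := parseRepo(ref)
		fmt.Println(repo, repoType == RepoTypeDataset, err)
	}
}
```

Returning the type alongside the name lets callers choose the matching listing and download endpoints from a single parse.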

Linked issues

closes #1055

AI-Assisted Code

  • This PR contains AI-generated code that I have reviewed and tested
  • I take full responsibility for all code in this PR, regardless of how it was created

Signed-off-by: Arnab Chatterjee <arnabchat2001@gmail.com>
@arnab2001
Author

Requesting review @gorkem @amisevsk

@arnab2001 changed the title from "feat: allow kit init to work with remote repositories (#1055)" to "feat: allow kit init to work with remote repositories" on Jan 29, 2026
@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 833da5bd08

Comment on lines +39 to 41
// Parse HuggingFace repository URL to extract repo name and type
repo, repoType, err := hf.ParseHuggingFaceRepo(opts.repo)
if err != nil {

P1: Use dataset resolve URLs when repoType is dataset

  • Issue Statement: kit import now accepts dataset repos via ParseHuggingFaceRepo, but downloads still use the model resolve URL, so dataset imports fail.
  • Location: pkg/cmd/kitimport/hfimport.go:39-41.
  • Investigation Summary: I traced the repoType parsed here through importUsingHF and confirmed that hf.DownloadFiles in pkg/lib/hf/download.go still hardcodes https://huggingface.co/%s/resolve/... with no /datasets/ variant, and that no other dataset-specific download path exists.
  • Trigger Conditions: using a HuggingFace dataset repo such as datasets/org/repo or https://huggingface.co/datasets/..., so that repoType is RepoTypeDataset.
  • Impact Statement: file listing succeeds but download requests return 404, causing the import to abort and making dataset imports unusable.
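
One possible shape for the fix, as a sketch only: the resolve URL is built from the repository type, so dataset repos get the /datasets/ prefix. The resolveURL helper, the RepoType values, and the revision and file parameters are hypothetical names chosen for illustration; the actual hf.DownloadFiles signature is not shown in this thread.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// RepoType is a hypothetical stand-in for the PR's RepositoryType enum.
type RepoType int

const (
	RepoTypeModel RepoType = iota
	RepoTypeDataset
)

const hubBase = "https://huggingface.co"

// resolveURL builds a download URL for one file in a repository.
// Model repos resolve at /<repo>/resolve/<rev>/<file>; dataset repos
// need the extra /datasets/ prefix that the review comment says is missing.
func resolveURL(repo string, repoType RepoType, revision, file string) string {
	prefix := ""
	if repoType == RepoTypeDataset {
		prefix = "datasets/"
	}
	// Escape each path segment separately so "/" separators survive.
	segments := strings.Split(file, "/")
	for i, s := range segments {
		segments[i] = url.PathEscape(s)
	}
	return fmt.Sprintf("%s/%s%s/resolve/%s/%s",
		hubBase, prefix, repo, url.PathEscape(revision), strings.Join(segments, "/"))
}

func main() {
	fmt.Println(resolveURL("org/repo", RepoTypeModel, "main", "config.json"))
	fmt.Println(resolveURL("org/repo", RepoTypeDataset, "main", "data/train.csv"))
}
```

Keeping the prefix decision in one helper means listing and download code cannot drift apart on which repository types they support.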

Comment on lines +321 to +324
// Check if the path exists on the filesystem
// This handles cases like "models/my-model" which could be either local or remote
if _, err := os.Stat(path); err == nil {
// Path exists locally - treat as local

P2: Handle os.Stat errors as local in remote detection

  • Issue Statement: detectRemoteRepo treats a path as local only when os.Stat succeeds. Any stat error, including permission and I/O errors, falls through to remote detection, so local paths that exist but are unreadable are misclassified as remote.
  • Location: pkg/cmd/kitinit/cmd.go:321-324.
  • Investigation Summary: I inspected detectRemoteRepo, complete, and runLocalInit and found no other permission or error handling before the HF parse.
  • Trigger Conditions: a relative path like private/model (no ./ prefix) that exists but returns a permission or I/O error on stat.
  • Impact Statement: kit init will call the HuggingFace API instead of surfacing the local filesystem error, leading to confusing failures or unintended remote output.
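
A sketch of the distinction this comment asks for, assuming the check can be isolated in a small helper: only a "does not exist" error lets the path be considered as a remote reference, and every other stat error is surfaced as a local filesystem problem. classifyPath and pathKind are hypothetical names, not identifiers from pkg/cmd/kitinit/cmd.go.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// pathKind is a hypothetical result type for this sketch.
type pathKind int

const (
	kindLocal       pathKind = iota // path exists locally (or failed locally)
	kindMaybeRemote                 // nothing local; try parsing as a remote repo
)

// classifyPath decides whether an input should be handled as a local path.
// Only fs.ErrNotExist allows remote detection to proceed; permission and
// I/O errors are returned so the user sees the real filesystem failure.
func classifyPath(path string) (pathKind, error) {
	_, err := os.Stat(path)
	switch {
	case err == nil:
		return kindLocal, nil
	case errors.Is(err, fs.ErrNotExist):
		return kindMaybeRemote, nil
	default:
		return kindLocal, fmt.Errorf("stat %q: %w", path, err)
	}
}

func main() {
	for _, p := range []string{".", "no-such-org/no-such-repo"} {
		kind, err := classifyPath(p)
		fmt.Println(p, kind == kindLocal, err)
	}
}
```

With this split, a path such as private/model that exists but cannot be read produces a local error instead of a call to the HuggingFace API.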
