feat: allow kit init to work with remote repositories #1074
base: main
Signed-off-by: Arnab Chatterjee <arnabchat2001@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 833da5bd08
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
// Parse HuggingFace repository URL to extract repo name and type
repo, repoType, err := hf.ParseHuggingFaceRepo(opts.repo)
if err != nil {
Use dataset resolve URLs when repoType is dataset
Issue Statement: kit import now accepts dataset repos via ParseHuggingFaceRepo, but downloads still use the model resolve URL, so dataset imports fail.
Location: pkg/cmd/kitimport/hfimport.go:39-41.
Investigation Summary: I traced the repoType parsed here through importUsingHF and confirmed that hf.DownloadFiles in pkg/lib/hf/download.go still hardcodes https://huggingface.co/%s/resolve/... with no /datasets/ variant, and that no other dataset-specific download path exists.
Trigger Conditions: using a HuggingFace dataset repo such as datasets/org/repo or https://huggingface.co/datasets/..., so that repoType is RepoTypeDataset.
Impact Statement: file listing succeeds but download requests return 404, causing the import to abort and making dataset imports unusable.
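One way to address this, as a minimal sketch rather than the PR's actual fix: build the resolve URL from the repository type, so that datasets get the /datasets/ prefix that HuggingFace uses for dataset file URLs. The RepoType type and the resolveURL helper below are hypothetical stand-ins for illustration; only the RepoTypeDataset name comes from the review comment, and the real hf.DownloadFiles signature is not shown in this conversation.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// RepoType is a local stand-in for the PR's repository type enum (assumed shape).
type RepoType int

const (
	RepoTypeModel RepoType = iota
	RepoTypeDataset
)

// resolveURL is a hypothetical helper, not part of the hf package.
// It returns the download URL for filePath at the given revision,
// adding the "datasets/" prefix when the repository is a dataset.
func resolveURL(repo string, repoType RepoType, revision, filePath string) string {
	prefix := ""
	if repoType == RepoTypeDataset {
		prefix = "datasets/"
	}
	// Escape each path segment on its own so the "/" separators survive.
	segments := strings.Split(filePath, "/")
	for i, s := range segments {
		segments[i] = url.PathEscape(s)
	}
	return fmt.Sprintf("https://huggingface.co/%s%s/resolve/%s/%s",
		prefix, repo, url.PathEscape(revision), strings.Join(segments, "/"))
}

func main() {
	fmt.Println(resolveURL("org/repo", RepoTypeModel, "main", "config.json"))
	fmt.Println(resolveURL("org/repo", RepoTypeDataset, "main", "data/train.csv"))
}
```

Keeping the prefix decision in one helper means the listing and download paths cannot drift apart again, assuming both are routed through it.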
// Check if the path exists on the filesystem
// This handles cases like "models/my-model" which could be either local or remote
if _, err := os.Stat(path); err == nil {
// Path exists locally - treat as local
Handle os.Stat errors as local in remote detection
Issue Statement: detectRemoteRepo ignores os.Stat errors other than nil, so local paths that exist but are unreadable are misclassified as remote.
Location: pkg/cmd/kitinit/cmd.go:321-324.
Investigation Summary: I inspected detectRemoteRepo, complete, and runLocalInit and found no other permission/error handling before the HF parse; the code only treats the path as local when os.Stat succeeds.
Trigger Conditions: a relative path like private/model (no ./ prefix) that exists but returns a permission/IO error on stat.
Impact Statement: kit init will call the HuggingFace API instead of surfacing the local filesystem error, leading to confusing failures or unintended remote output.
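A possible shape for the fix, sketched under assumptions: treat only a "does not exist" result from os.Stat as a signal to fall through to remote detection, and surface every other stat error as a local filesystem problem. The looksLocal helper is hypothetical and is not the PR's detectRemoteRepo; it only illustrates the error classification.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// looksLocal is a hypothetical helper for illustration.
// It reports whether path should be handled as a local path.
// A non-nil error means stat failed for a reason other than
// "not found" (for example, permission denied), so the caller
// should report it instead of trying a remote lookup.
func looksLocal(path string) (bool, error) {
	_, err := os.Stat(path)
	switch {
	case err == nil:
		// Path exists and is readable: local.
		return true, nil
	case errors.Is(err, fs.ErrNotExist):
		// Nothing at this path: a remote reference is plausible.
		return false, nil
	default:
		// Path may exist but cannot be inspected: keep it local and surface the error.
		return true, err
	}
}

func main() {
	for _, p := range []string{".", "no-such-org-xyz/no-such-repo"} {
		local, err := looksLocal(p)
		fmt.Printf("%q local=%v err=%v\n", p, local, err)
	}
}
```

With this classification, a path such as private/model that fails with a permission error would stop at the local error rather than reaching the HuggingFace API.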
Description
This PR updates the kit init command to support generating Kitfiles directly from remote repositories (specifically HuggingFace models and datasets) without needing to clone or import them first. This lets users inspect and generate a Kitfile for a remote resource, which can then be edited before running an import.

Key changes:
- kit init detects whether the input path is a remote HuggingFace repository (URL or org/repo format).
- hf.ParseHuggingFaceRepo identifies and parses references for both models and datasets.
- hf.ListFiles supports listing files for both models and datasets, using the correct API endpoints via a new RepositoryType enum.
- kit import uses the improved hf library components.

Linked issues
closes #1055
AI-Assisted Code