Closed
Description
Describe the bug
I run the same component in two ways. In the first, I upload the local code directly to each pod; this works fine, and we can see the model artifacts and metric curves. The second uses the same code, but loads it from a private GitHub repository. In this case, the "dashboards", "artifacts", and "resources" pages show only "NO DATA", while the "logs" page shows the training output normally.
To reproduce
- Add private code connection
  connections:
    - name: my-repo
      kind: git
      schema:
        url: https://github.com/xxx/my-repo
      secret:
        name: "github-secret-my-repo"
- Running job config:
  run:
    kind: job
    init:
      - connection: my-repo
- How we use log_metric and log_model:
import os

import torch
from polyaxon import tracking

# log metric
tracking.log_metric("val_loss", val_loss, step=epoch)
tracking.log_metric("val_precision", precision, step=epoch)
tracking.log_metric("val_recall", recall, step=epoch)
# log model
model_output_dir = tracking.get_outputs_path("models", is_dir=True)
ckpt_file = os.path.join(model_output_dir, 'checkpoint.pth.tar')
torch.save({xxx}, ckpt_file)
tracking.log_model(name="checkpoint", path=ckpt_file, framework="pytorch")
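One way to narrow this down might be to check, from inside the job, whether the checkpoint files are actually written under the outputs path; that would separate "nothing was saved" from "saved but not shown in the UI". The following is only a sketch under assumptions: `list_artifacts` is a hypothetical helper (not part of Polyaxon), and the temporary directory stands in for the directory returned by `tracking.get_outputs_path`.

```python
import os
import tempfile


def list_artifacts(root):
    """Return sorted (relative_path, size_in_bytes) pairs for files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            rel_path = os.path.relpath(full_path, root)
            found.append((rel_path, os.path.getsize(full_path)))
    return sorted(found)


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for the real outputs
    # directory; in a job you would pass the run's outputs path instead.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "models"))
        ckpt_file = os.path.join(root, "models", "checkpoint.pth.tar")
        with open(ckpt_file, "wb") as f:
            f.write(b"\x00" * 16)  # dummy bytes in place of a real checkpoint
        for rel_path, size in list_artifacts(root):
            print(rel_path, size)
```

If the files are listed in the job logs but the "artifacts" page still shows "NO DATA", the problem is more likely in how the outputs are collected or displayed than in the training code itself.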
Expected behavior
The metric curves are displayed and the models are saved normally, as in the local-upload case.
Environment
minikube: v1.15.1
polyaxon ce: 1.7.5