Private github repo fail for using log_model and log_metric #1301

@sijin-dm

Describe the bug

I have two versions of the same component. The first uploads local code directly to each pod; this works fine, and we can see the model artifacts and metric curves. The second uses the same code, but loads it from a private GitHub repo. In this case, the "dashboards", "artifacts", and "resources" pages show only "NO DATA", while the "logs" page shows the training output normally.

To reproduce

  1. Add a private code connection:
connections:
  - name: my-repo
    kind: git
    schema:
      url: https://github.com/xxx/my-repo
    secret:
      name: "github-secret-my-repo"
  2. Run the job with this config:
run:
  kind: job
  init:
    - connection: my-repo
  3. How we use log_metric and log_model:
# log metric
tracking.log_metric("val_loss", val_loss, step=epoch)
tracking.log_metric("val_precision", precision, step=epoch)
tracking.log_metric("val_recall", recall, step=epoch)

# log model
model_output_dir = tracking.get_outputs_path("models", is_dir=True)
ckpt_file = os.path.join(model_output_dir, 'checkpoint.pth.tar')
torch.save({xxx}, ckpt_file)
tracking.log_model(name="checkpoint", path=ckpt_file, framework="pytorch")
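
To narrow this down, it may help to confirm that the checkpoint file is actually written in both setups before `log_model` is called. The following is only a minimal sketch, not part of our training script: `describe_artifact` is a hypothetical helper name, it uses only the Python standard library, and it makes no assumption about how Polyaxon collects artifacts.

```python
import os


def describe_artifact(path):
    """Return a small report about a file that is about to be logged.

    Hypothetical debugging helper: it only inspects the local filesystem.
    """
    exists = os.path.isfile(path)
    return {
        "path": os.path.abspath(path),                       # resolved location of the file
        "exists": exists,                                    # True if a regular file is there
        "size_bytes": os.path.getsize(path) if exists else 0,
        "cwd": os.getcwd(),                                  # working directory of the process
    }


if __name__ == "__main__":
    import tempfile

    # Demo with a throwaway file standing in for checkpoint.pth.tar.
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "checkpoint.pth.tar")
        with open(demo, "wb") as f:
            f.write(b"demo")
        print(describe_artifact(demo))
```

Printing `describe_artifact(ckpt_file)` right after `torch.save(...)` in both the local-upload run and the git-connection run would show in the "logs" page whether the file path or working directory differs between the two cases.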

Expected behavior

Metric curves are shown and models are saved normally, as in the local-upload case.

Environment

minikube: v1.15.1
polyaxon ce: 1.7.5
