Ground truth data and ResNet18 output prediction coordinates not clear #8

@ccaccavella

Hi all,

Thanks for the great work! I have a few questions:

1. ResNet18 Model Output Interpretation:

  • Normalization: Are the 12 output values of the ResNet18 model (the output of predict3_npz) normalized? If so, could you describe the normalization?

  • Coordinate System and Camera Model: The paper mentions the use of a pinhole camera model. However, I have observed instances of negative depth values in the output. Could you clarify the coordinate system employed and how the camera model is defined? Specifically, how should one interpret the translation vector, and what does a negative depth signify in this context?

  • 3D to 2D Projection: I want to project the 3D coordinates from the ResNet18 model (the first 3 values of the output) onto the 2D image plane to visualize the hand's location. Could you describe the correct way to perform this projection?

2. Ground Truth Data Format:

  • In the real_eval_data/regular/gt_events/...txt files, each entry comprises 15 values across 150 data points. My understanding is that the ground truth should consist of 12 values. Could you clarify what these 15 values represent?
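
For the 3D to 2D projection in question 1, here is a minimal sketch of the standard distortion-free pinhole projection, to make the question concrete. It is not taken from this repository. The function name `project_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical placeholders, and the sketch assumes the first 3 output values are an unnormalized position in the camera frame with +z pointing forward, which is exactly what is unclear to me. Points with non-positive depth cannot be projected under this model, so they are flagged instead of drawn.

```python
import numpy as np

def project_points(points_cam, fx, fy, cx, cy):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates.

    Assumes an ideal pinhole camera (no lens distortion) with +z forward.
    fx, fy, cx, cy are placeholder intrinsics, not values from the repository.
    Returns (uv, valid): uv is NaN wherever the depth is not positive.
    """
    pts = np.asarray(points_cam, dtype=float).reshape(-1, 3)
    z = pts[:, 2]
    valid = z > 0  # points at or behind the camera plane have no valid projection
    uv = np.full((pts.shape[0], 2), np.nan)
    uv[valid, 0] = fx * pts[valid, 0] / z[valid] + cx
    uv[valid, 1] = fy * pts[valid, 1] / z[valid] + cy
    return uv, valid

# Hypothetical example: one point in front of the camera, one with negative depth.
uv, valid = project_points(
    [[0.1, -0.2, 2.0], [0.1, -0.2, -2.0]],
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
print(uv)     # second row is NaN because its depth is negative
print(valid)
```

If the outputs are normalized, or the camera looks along -z, the values would need to be rescaled or sign-flipped before this step, hence the questions above.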

Thanks for the help!
