Hello, thank you for releasing HunyuanWorld-Mirror!
I am a graduate student working on 3D reconstruction.
While experimenting with the geometry priors (camera pose + intrinsics), I found what seems to be an issue:
Camera priors have no effect during inference in the open-source HunyuanWorld-Mirror model.
Below is a clean reproduction and evidence.
1. Problem Summary
Even when I pass a valid camera pose and intrinsics (with the correct shapes [1, N, 4, 4] and [1, N, 3, 3]) and enable:
cond_flags = [1, 0, 1] # pose + intrinsics enabled
the outputs of:
pts3d
depth
normals
camera_params
camera_poses
camera_intrs
splats
etc.
are numerically identical (zero L1 difference) to the outputs without priors:
cond_flags = [0, 0, 0]
All L1 diffs = 0.00000000, across every tensor.
This strongly suggests that geometry priors are not used in the current inference path.
2. Minimal Reproduction Setup
I loaded the model the same way as your official infer.py
(weights loaded from a local directory).
I built the inputs according to the README:
views = {
    "img": imgs,
    "camera_pose": pose_tensor,        # [1, N, 4, 4]
    "camera_intrinsics": intr_tensor,  # [1, N, 3, 3]
}
cond_flags = [1, 0, 1]
Then I ran two forward passes:
out_no = model(views={"img": imgs}, cond_flags=[0,0,0])
out_yes = model(views=views, cond_flags=[1,0,1])
I then computed the L1 difference between the two runs for every output.
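For reference, the comparison can be sketched roughly as follows. This is a minimal sketch, not the exact script I ran: `l1_report` is a hypothetical helper name, and the two dictionaries are toy stand-ins rather than real model outputs. It assumes the outputs are flat dicts of tensors and skips any nested or non-tensor entries.

```python
import torch

def l1_report(out_a, out_b):
    """Return {key: (mean L1 diff, exactly-equal flag)} for tensors present in both dicts."""
    report = {}
    for key in sorted(out_a.keys() & out_b.keys()):
        a, b = out_a[key], out_b[key]
        if not (torch.is_tensor(a) and torch.is_tensor(b)):
            continue  # this sketch ignores nested or non-tensor entries
        if a.shape != b.shape:
            report[key] = (float("nan"), False)  # shapes differ, no meaningful diff
            continue
        diff = (a.float() - b.float()).abs().mean().item()
        report[key] = (diff, torch.equal(a, b))
    return report

# Toy stand-ins for the two forward passes (hypothetical values, not real results).
out_no = {"depth": torch.ones(1, 2, 4, 4), "pts3d": torch.zeros(1, 2, 4, 4, 3)}
out_yes = {"depth": torch.ones(1, 2, 4, 4), "pts3d": torch.full((1, 2, 4, 4, 3), 0.5)}

for name, (diff, same) in l1_report(out_no, out_yes).items():
    print(f"[{name}] L1 diff = {diff:.8f}  identical = {same}")
```

Reporting `torch.equal` next to the mean L1 value separates a true element-wise match from a difference that is merely too small to show at eight decimal places.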
3. Debug Results (Key Evidence)
Input priors are valid:
pose shape: [1, 54, 4, 4]
K shape: [1, 54, 3, 3]
pose mean/min/max: 0.13 / -3.70 / 4.32
K mean/min/max: 728.6 / 0 / 1920
Priors successfully entered the model:
views_yes keys: {'img','camera_pose','camera_intrinsics'}
cond_flags_yes: [1,0,1]
But the outputs are exactly identical. For example:
[pts3d] L1 diff = 0.00000000
[depth] L1 diff = 0.00000000
[camera_params] L1 diff = 0.00000000
[camera_poses] L1 diff = 0.00000000
[camera_intrs] L1 diff = 0.00000000
...
This indicates that the inference path does not incorporate the priors.
4. Internal Parameter Inspection
I also inspected the model and confirmed the prior encoder modules do exist and have non-zero trained weights:
visual_geometry_transformer.pose_embed.0.weight
shape=(1024,7), mean=0.0035, max=0.3760, all_zero=False
visual_geometry_transformer.pose_embed.2.weight
shape=(1024,1024), mean≈0, max=0.10, all_zero=False
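A listing like the one above can be produced with `named_parameters()`. The following is a sketch under assumptions: `summarize_params` is a hypothetical helper, and `ToyModel` is a small stand-in whose `pose_embed` layout (Linear, activation, Linear) is only guessed from the parameter names `pose_embed.0` and `pose_embed.2`. It is not the real Mirror architecture.

```python
import torch
from torch import nn

def summarize_params(model, pattern):
    """Collect (name, shape, mean, max, all_zero) for parameters whose name contains `pattern`."""
    rows = []
    for name, param in model.named_parameters():
        if pattern in name:
            t = param.detach().float()
            rows.append((name, tuple(t.shape), t.mean().item(),
                         t.max().item(), bool((t == 0).all())))
    return rows

class ToyModel(nn.Module):
    """Stand-in model; the layer sizes here are arbitrary assumptions."""
    def __init__(self):
        super().__init__()
        self.pose_embed = nn.Sequential(nn.Linear(7, 16), nn.GELU(), nn.Linear(16, 16))
        self.head = nn.Linear(16, 3)

rows = summarize_params(ToyModel(), "pose_embed")
for name, shape, mean, mx, all_zero in rows:
    print(f"{name}\n  shape={shape}, mean={mean:.4f}, max={mx:.4f}, all_zero={all_zero}")
```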
So:
The prior encoder is present
The weights are trained
But inference does not use them
This suggests a potential missing forward connection / disabled path.
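One way to narrow this down would be to count how often the prior encoder is actually invoked, using PyTorch forward hooks (`register_forward_hook`). This is a sketch only: `count_calls` is a hypothetical helper, and `ToyModel` with its `use_pose` switch is a made-up stand-in for the real conditioning logic, which I have not traced.

```python
import torch
from torch import nn

def count_calls(model, name_filter, run):
    """Count forward calls of every submodule whose name contains `name_filter` while `run()` executes."""
    calls, handles = {}, []
    for name, module in model.named_modules():
        if name_filter in name:
            calls[name] = 0
            def hook(mod, inputs, output, _name=name):
                calls[_name] += 1
            handles.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            run()
    finally:
        for h in handles:
            h.remove()  # always detach hooks so the model is left unchanged
    return calls

class ToyModel(nn.Module):
    """Stand-in: the pose branch only runs when `use_pose` is set (assumed behaviour)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 16)
        self.pose_embed = nn.Sequential(nn.Linear(7, 16), nn.GELU(), nn.Linear(16, 16))

    def forward(self, x, pose=None, use_pose=False):
        h = self.backbone(x)
        if use_pose and pose is not None:
            h = h + self.pose_embed(pose)
        return h

toy = ToyModel()
x, pose = torch.randn(2, 8), torch.randn(2, 7)
calls_off = count_calls(toy, "pose_embed", lambda: toy(x, pose, use_pose=False))
calls_on = count_calls(toy, "pose_embed", lambda: toy(x, pose, use_pose=True))
print("without priors:", calls_off)
print("with priors:   ", calls_on)
```

Against the real model, `run` would be something like `lambda: model(views=views, cond_flags=[1, 0, 1])` with the filter `"pose_embed"`. A count of zero would mean the encoder is never reached. A non-zero count would mean its output is computed but dropped or overwritten somewhere downstream.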
5. Questions
I would like to confirm:
Q1 — Is geometry-prior usage intentionally disabled in the open-source Mirror model?
Q2 — If not intentional, is there a missing config / flag needed to enable pose & intrinsics routing in the forward pass?
Q3 — Should geometry priors only work in the full HunyuanWorld model, but not in Mirror?
Q4 — Can you provide a minimal code example showing correct prior usage?
Thank you for your time!
HunyuanWorld is amazing work — I appreciate any clarification about the intended behavior of geometry priors in the open-source Mirror version.