@manuelcandales (Contributor) commented Feb 4, 2026

This pull request improves the aoti_torch_mps__linear_fp_act_4bit_weight function in the Metal backend by adding comprehensive input validation for all tensors and enhancing debug logging. The changes ensure that tensor shapes, data types, and memory layouts are strictly checked before kernel execution, reducing the risk of runtime errors and making debugging easier.

Input validation improvements:

  • Added strict checks for the shape, data type, and contiguity of the A, B, S, and Z tensors, including alignment requirements for dimensions and packed formats. Errors now provide detailed messages and return early if validation fails.
  • Ensured that the B tensor's packed shape matches the expected calculation for 4-bit weights and that both S and Z tensors are 2D, have the correct leading dimension, and are contiguous.
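The checks described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: `TensorDesc`, `DType`, and `validate_linear_4bit_inputs` are hypothetical names, and the layout is an assumption (A is `[M, K]` floating point, B is `[N, K/2]` bytes holding two 4-bit values each, and S and Z use `N` as their leading dimension). The actual kernel may use different shapes and alignment rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor handle; not the ExecuTorch/AOTI type.
enum class DType { Float32, Float16, BFloat16, UInt8 };

struct TensorDesc {
  std::vector<int64_t> sizes;
  std::vector<int64_t> strides;
  DType dtype;
};

// Row-major contiguity: each stride equals the product of the sizes to its right.
bool is_contiguous(const TensorDesc& t) {
  assert(t.sizes.size() == t.strides.size());
  int64_t expected = 1;
  for (std::size_t i = t.sizes.size(); i-- > 0;) {
    if (t.sizes[i] != 1 && t.strides[i] != expected) return false;
    expected *= t.sizes[i];
  }
  return true;
}

// Shared checks for the scale (S) and zero-point (Z) tensors.
// Assumption: the leading dimension is N.
std::string check_qparam(const TensorDesc& t, const char* name, int64_t N) {
  if (t.sizes.size() != 2) return std::string(name) + " must be 2D";
  if (t.sizes[0] != N) return std::string(name) + " has the wrong leading dimension";
  if (!is_contiguous(t)) return std::string(name) + " must be contiguous";
  return "";
}

// Returns the first error message found, or an empty string if every check passes.
std::string validate_linear_4bit_inputs(const TensorDesc& A, const TensorDesc& B,
                                        const TensorDesc& S, const TensorDesc& Z,
                                        int64_t group_size) {
  if (A.sizes.size() != 2) return "A must be 2D";
  if (A.dtype == DType::UInt8) return "A must be floating point";
  if (!is_contiguous(A)) return "A must be contiguous";

  const int64_t K = A.sizes[1];
  if (K % 2 != 0) return "K must be even to pack two 4-bit values per byte";
  if (group_size <= 0 || K % group_size != 0) return "K must be a multiple of group_size";

  if (B.sizes.size() != 2) return "B must be 2D";
  if (B.dtype != DType::UInt8) return "B must be uint8";
  if (!is_contiguous(B)) return "B must be contiguous";

  const int64_t N = B.sizes[0];
  if (B.sizes[1] != K / 2) return "B packed shape mismatch: expected [N, K/2]";

  std::string err = check_qparam(S, "S", N);
  if (!err.empty()) return err;
  return check_qparam(Z, "Z", N);
}
```

Returning a message instead of throwing mirrors the "return early with a detailed error" behavior described above; a caller would log the message and map it to its own error code.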

Debug logging enhancements:

  • Expanded debug logs to include detailed tensor shapes and strides for all relevant tensors after validation, improving traceability.
  • Added log statements to indicate which kernel is being dispatched, aiding in kernel selection debugging.
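As a rough illustration of the shape-and-stride logging described above, the sketch below builds one debug line per tensor. The helper names `format_dims` and `describe_tensor` and the output format are assumptions made for this example; the pull request's actual log macros and message layout are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Formats a list of integers as "[a, b, c]".
std::string format_dims(const std::vector<int64_t>& dims) {
  std::ostringstream out;
  out << "[";
  for (std::size_t i = 0; i < dims.size(); ++i) {
    if (i > 0) out << ", ";
    out << dims[i];
  }
  out << "]";
  return out.str();
}

// Builds one debug line for a tensor; a real backend would hand this
// string to its own logging facility rather than return it.
std::string describe_tensor(const std::string& name,
                            const std::vector<int64_t>& sizes,
                            const std::vector<int64_t>& strides) {
  assert(sizes.size() == strides.size());
  return name + ": sizes=" + format_dims(sizes) +
         ", strides=" + format_dims(strides);
}
```

Emitting these lines only after validation succeeds means every logged shape is one the kernel will actually receive.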

[ghstack-poisoned]
@pytorch-bot (bot) commented Feb 4, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17186

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 2 Unrelated Failures

As of commit c16dc59 with merge base ba6de95:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Labels: CLA Signed
