[FEATURE SUPPORT] add geru python kernel #55

Merged
LoserCheems merged 3 commits into main from add-geru-python-kernel
Dec 8, 2025
@LoserCheems
Collaborator

Summary

Design

  • API: geru(A: torch.Tensor, x: torch.Tensor, y: torch.Tensor, alpha: float) -> torch.Tensor (matches other Python ops).
  • Math: Computes the outer product and updates A via A = A + alpha * x[:, None] * y[None, :].
  • Semantics: Returns an updated tensor (out‑of‑place result) to avoid in‑place aliasing surprises.
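
A minimal sketch of the design above, written only from the signature and formula in this description. The body is an assumption about what geru.py contains, not a copy of it; it shows the update A + alpha * x yᵀ and the out-of-place behaviour on a small example.

```python
import torch


def geru(A: torch.Tensor, x: torch.Tensor, y: torch.Tensor, alpha: float) -> torch.Tensor:
    """Return A + alpha * outer(x, y) as a new tensor; A itself is not modified.

    Assumed shapes: A is (m, n), x is (m,), y is (n,).
    """
    # x[:, None] has shape (m, 1) and y[None, :] has shape (1, n),
    # so their product broadcasts to the (m, n) outer product.
    return A + alpha * x[:, None] * y[None, :]


A = torch.zeros(2, 3)
x = torch.tensor([1.0, 2.0])
y = torch.tensor([1.0, 0.0, -1.0])

out = geru(A, x, y, alpha=2.0)
print(out)  # rows are 2 * x[i] * y
print(A)    # still all zeros: the input was not updated in place
```

Because the result is a fresh tensor, callers who want an in-place update have to assign it back themselves (for example `A = geru(A, x, y, alpha)`).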

Changes

  • Added: geru.py — Python helper implementing the rank‑1 update with a clear docstring.
  • Added: geru.md — documents math, shared API, and points to implementations/tests.

Implementation notes

  • Device/dtype: Accumulates on the same device and dtype as inputs (no host transfers).
  • Simplicity: Uses broadcasting (x[:, None] * y[None, :]) for clarity and correctness rather than micro‑optimizations.
  • Validation: Relies on PyTorch for shape/dtype errors to keep the reference minimal.
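
The notes above can be illustrated with a short sketch. `geru_sketch` is a hypothetical stand-in for the helper in geru.py, assumed to use the same broadcasting expression. It shows that the result keeps the input dtype and device, that an incompatible shape surfaces as a PyTorch `RuntimeError`, and one consequence of leaving validation to PyTorch: a length-1 vector is broadcast without any error.

```python
import torch


def geru_sketch(A, x, y, alpha):
    # Hypothetical stand-in for the geru.py helper (assumed implementation).
    return A + alpha * x[:, None] * y[None, :]


A = torch.ones(2, 3, dtype=torch.float64)
x = torch.ones(2, dtype=torch.float64)
y = torch.ones(3, dtype=torch.float64)

out = geru_sketch(A, x, y, 0.5)
same_dtype = out.dtype == A.dtype      # a Python float alpha does not change the dtype
same_device = out.device == A.device   # no host transfer takes place

# x has 4 entries but A has 2 rows: broadcasting fails inside PyTorch.
try:
    geru_sketch(A, torch.ones(4, dtype=torch.float64), y, 0.5)
    raised = False
except RuntimeError:
    raised = True

# Caveat: a length-1 x is broadcast-compatible, so no error is raised here.
silent = geru_sketch(A, torch.ones(1, dtype=torch.float64), y, 0.5)
print(same_dtype, same_device, raised, tuple(silent.shape))
```

If silent broadcasting is undesirable, an explicit shape check could be added to the helper, at the cost of making the reference slightly less minimal.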

Tests

  • Status: No unit tests in this PR.
  • Manual check: Verified against small CPU examples and NumPy/PyTorch equivalents.
  • Next: Add tests/test_geru.py to use the Python helper as ground truth and validate future PyTorch/Triton/CuTe backends.
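
One possible shape for the proposed tests/test_geru.py, offered as a sketch rather than the planned test file. `geru_reference` is a hypothetical stand-in for the Python helper, and `torch.addr` (which computes `beta * input + alpha * outer(vec1, vec2)`) serves as an independent check. Plain loops are used here; in a real test file they would likely become pytest parametrization.

```python
import torch


def geru_reference(A, x, y, alpha):
    # Hypothetical stand-in for the helper in geru.py (assumed implementation).
    return A + alpha * x[:, None] * y[None, :]


def check_geru(device, dtype, m=4, n=5, alpha=0.75):
    torch.manual_seed(0)
    A = torch.randn(m, n, device=device, dtype=dtype)
    x = torch.randn(m, device=device, dtype=dtype)
    y = torch.randn(n, device=device, dtype=dtype)
    A_before = A.clone()

    out = geru_reference(A, x, y, alpha)
    expected = torch.addr(A, x, y, alpha=alpha)  # independent rank-1 update

    assert out.shape == (m, n)
    assert out.dtype == dtype and out.device == A.device
    assert torch.allclose(out, expected, rtol=1e-5, atol=1e-6)
    assert torch.equal(A, A_before)  # out-of-place: the input is untouched
    return True


# Run on CPU always, and on CUDA only when a GPU is available.
devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
results = [check_geru(d, t) for d in devices for t in (torch.float32, torch.float64)]
print(len(results), "configurations passed")
```

The same `check_geru` body could later be pointed at other backends by swapping the function under test while keeping the reference comparison.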

Documentation

  • Docstring: Present in geru.py.
  • Guide: geru.md includes the definition, interface, and where to find implementations/tests.

Checklist

Would you like me to add the basic tests/test_geru.py now (correctness + device/dtype parametrization)?

Clarifies GERU math intuition and shared API across backends
Guides contributors to available implementations and tests to keep validation consistent
Expands python ops with a torch-based GERU to support scaled outer-product updates
@LoserCheems LoserCheems merged commit ead49b4 into main Dec 8, 2025
2 checks passed