feat(gpu): add NVIDIA OpenCL compile diagnostics via ERG_CL_VERBOSE #9
arkadianet merged 1 commit into main
Set ERG_CL_VERBOSE=1 to enable the -cl-nv-verbose flag for NVIDIA devices. This prints ptxas info, including register usage, spills, and occupancy hints. Useful for profiling kernel register pressure before optimization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Walkthrough
Adds conditional OpenCL verbose diagnostics for NVIDIA GPUs, controlled by the ERG_CL_VERBOSE environment variable. The kernel builder now dynamically constructs compiler options and emits build logs when verbose mode is enabled.
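The option-building logic described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the helper names `verbose_requested` and `build_options`, the base option `-cl-std=CL1.2`, the vendor-string check, and the rule that only the exact value `1` enables verbose mode are all assumptions made for illustration. Only `ERG_CL_VERBOSE` and `-cl-nv-verbose` come from the PR description.

```python
import os

# Flag named in the PR description; it is only meaningful to NVIDIA's OpenCL compiler.
NV_VERBOSE_FLAG = "-cl-nv-verbose"


def verbose_requested(environ=os.environ):
    """Return True when ERG_CL_VERBOSE is set to "1".

    Assumption: only the exact value "1" enables verbose mode.
    """
    return environ.get("ERG_CL_VERBOSE") == "1"


def build_options(base_options, device_vendor, verbose):
    """Build the OpenCL compiler option string (hypothetical helper).

    The NVIDIA-only flag is appended when verbose mode is on and the
    device vendor string mentions NVIDIA; other vendors get the base
    options unchanged.
    """
    options = list(base_options)
    if verbose and "nvidia" in device_vendor.lower():
        options.append(NV_VERBOSE_FLAG)
    return " ".join(options)


if __name__ == "__main__":
    # Hypothetical base options and vendor string, for demonstration only.
    opts = build_options(["-cl-std=CL1.2"], "NVIDIA Corporation", verbose_requested())
    print("build options:", opts)
```

Gating on the vendor keeps the flag away from non-NVIDIA compilers, which may reject options they do not recognize. After a build with the flag set, the ptxas output would be read from the program build log.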
Estimated code review effort: 2 (Simple), ~12 minutes