**Claude finished @shane-moore's task in 1m 20s** — Code Review Complete

### Review Summary

This PR successfully achieves its stated goals of removing automated Claude CI checks while improving on-demand Claude reviews. The changes are clean, well motivated, and the new prompt structure is a significant improvement.

### Detailed Findings

**1. Workflow Deletion — Clean Removal ✅**

The removal of

Note: Existing PRs may still have

**2. Prompt Improvements — Well Structured ✅**

The new prompt in

Strengths:

Minor Observations:

**3. Consistency — Comment Update ✅**

Line 3 changed from:

`# Scenario 1: @claude mentions on issues and PR comments`

to:

`# @claude mentions on issues and PR comments`

This is appropriate since there's no longer a "Scenario 2" (the deleted automated review workflow).

**4. Security & Correctness — No Issues ✅**

**5. Potential Considerations**

Documentation: Consider updating

No such references were found in this scan, but worth verifying in user-facing docs.

Label Cleanup: The

### Conclusion

The refactoring is sound. The removal of automated CI checks addresses the brittleness issues (#827, #828, #829), and the improved prompt structure should reduce noise while maintaining review value through on-demand

No code changes required.
> Use a single top-level comment for summary only if there are findings.
> claude_args: |
>   --max-turns 50
>   --model us.anthropic.claude-sonnet-4-5-20250929-v1:0
Do we want to upgrade the model used?
Yes, Opus 4.6 is waaay better
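If the team does upgrade, the change would be confined to the `--model` flag inside `claude_args` quoted above. A minimal sketch, assuming the same argument structure; the Opus model ID is a placeholder, not a verified Bedrock identifier:

```yaml
claude_args: |
  --max-turns 50
  --model <opus-4.6-bedrock-model-id>  # placeholder: substitute the real Bedrock model ID
```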
What's the problem with the current claude review? I believe it adds a lot of value. The problem wasn't caused by the claude review. The problem was that the claude action wasn't configured with a fixed version.
@claude wdyt about #830 (comment)
**Claude finished @diegomrsantos's task in 1m 42s** — Analyzing the automated review debate

### Response to @diegomrsantos's Question

You raise a valid point about version pinning. Let me address both perspectives.

**The Version Pinning Argument**

You're correct that the immediate technical issues (#827, #828) were about workflow configuration, not Claude's review quality:

Both workflows currently use

Diego's suggestion: Pin to a specific version (e.g.,

**The Noise vs. Value Trade-off**

However, there's a broader question about whether automated reviews on every PR open/update provide net value:

Arguments for keeping (with version pinning):

Arguments for removing (this PR's approach):

**The Middle Ground: Pin + Optional Trigger**

A compromise approach not explored in this PR:

This would:

**Recommendation**

Both approaches are valid depending on team preference.

If the team values proactive automated reviews: Merge a variant of this PR that:

If the team prefers on-demand reviews: Keep this PR as-is and rely on

The real question isn't technical (version pinning solves that) — it's workflow preference: automated coverage vs. on-demand control.

**Related Question from @dknopik**

Regarding model upgrade (line 114 comment): The current workflow uses
@claude I disagree with your arguments.
This can happen with any other task, and we won't remove all of them.
The automatic review happens only once. More noise can be produced on demand. People who aren't interested in the reviews can ignore them if they believe they're noise.
I believe reviews being bigger when they add value is more common than the opposite. What could be interesting is whether there's a way to configure the automatic review to be less noisy. Maybe adding only comments about issues it found. @shane-moore @dknopik
**Claude finished @diegomrsantos's task in 1m 7s** — Responding to Discussion Points

### Response to @diegomrsantos

You raise valid points that deserve a thoughtful response. Let me address each one.

**Your Counterarguments**

1. "Any CI task can block merges, we don't remove all of them"

Fair point. The difference is that most CI tasks (tests, lints, builds) are deterministic and under the team's control. The brittleness in #827/#828 came from:

However, you're right that pinning the version (as you suggested earlier) would address most of this. The real question becomes: is automated coverage worth the occasional infrastructure friction?

2. "Automatic review happens only once, not noisy"

This is actually not quite accurate based on the deleted workflow. Looking at the trigger conditions:

```yaml
on:
  pull_request_target:
    types: [opened, ready_for_review, reopened, labeled, unlabeled]
```

The automated review would trigger on:

So it could run multiple times per PR depending on workflow. But you're right that it's not triggered on every push like some CI checks. The "noise" concern is more about getting automated reviews when they may not add value (e.g., on trivial PRs, or when human reviewers have already caught the issues).

3. "Bigger reviews when they add value is more common"

This is a good empirical question: how often do automated Claude reviews find significant issues vs. provide low-signal feedback? I don't have data on this for Anchor specifically, but if the team's experience is that automated reviews frequently catch real issues, that's a strong argument for keeping them.

**Your "Less Noisy" Suggestion**

This is actually a great middle ground:

This is technically feasible. The prompt in this PR already includes:

But you could strengthen this to:

This would require the workflow to check Claude's output and only post if issues were found. This could be implemented with a post-processing step that:
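A minimal sketch of such a gate, assuming the review is first captured as a step output and only posted when it contains findings. The step ids, the `review.md` path, and reuse of the "No actionable issues found." sentinel are assumptions, not this PR's actual implementation:

```yaml
# Sketch only: capture the review, then post it conditionally.
- name: Run Claude review, capturing output instead of posting
  id: review
  run: |
    # ... invoke the review and write it to review.md ...
    {
      echo 'summary<<EOF'
      cat review.md
      echo 'EOF'
    } >> "$GITHUB_OUTPUT"

- name: Post the review only when it contains findings
  if: ${{ !contains(steps.review.outputs.summary, 'No actionable issues found.') }}
  env:
    GH_TOKEN: ${{ github.token }}
    PR_NUMBER: ${{ github.event.pull_request.number }}
  run: gh pr comment "$PR_NUMBER" --body-file review.md
```

The key design choice is that the sentinel phrase becomes a machine-checkable contract between the prompt and the workflow.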
**Recommendation**

Given your feedback, here are three paths forward:

Option A: Keep automated reviews with improvements (addresses your concerns)

Option B: Hybrid approach

Option C: Proceed with this PR (on-demand only)

**My Assessment**

Your points are well taken. The automated review does provide value, and the brittleness issue can be solved with version pinning. The "only comment when issues found" suggestion is a smart way to reduce noise while maintaining coverage.

The real trade-off is: automated coverage with occasional friction vs. manual opt-in with zero friction. This is ultimately a team decision based on:

What do @shane-moore and @dknopik think about Option A (keep automated, but pin version + only comment when issues found)?
Force-pushed 4328502 to 737cb4c
@diegomrsantos, i like the idea of having the automatic review only comment when it finds an actual issue, and we use the claude mentions tool separately to investigate something more deeply. feel free to lmk what you think about the prompts for
**Claude finished @shane-moore's task in 1m 16s** — Reviewing Updated Workflow Prompts

### Review Summary

Excellent updates! The prompts are now well structured and the "silent watchdog" approach is properly implemented. Both workflows now use Opus 4.6 and have version pinning in place.

### Detailed Feedback

✅ Version Pinning (
Force-pushed 737cb4c to 01ce65e
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed 01ce65e to 26e578c
regarding the CI failures:
**Claude finished @shane-moore's task in 1m 7s** — Acknowledging CI Context

### Understanding the CI Failures

Thanks for the context on the CI failures. Let me confirm what I'm seeing:

Current CI Status:

### Workflow Review

Both workflow files look good in their current state:

✅

✅

### Merge Strategy Context

Your approach makes sense. Why bypass is acceptable:

Recommended next steps:

### Overall Assessment

The workflow refactor successfully achieved its goals:

The CI failures are understood and acceptable given your merge strategy. No changes needed to the workflows themselves.
@dknopik @diegomrsantos, realized these changes need to be merged to ran this flow locally w/o issue to prove to myself we wouldn't see merge issues later:
also, i commented on the CI failures here: #830 (comment) |
> CRITICAL RULE: If you find NO actionable issues, do NOT post any
> PR comments or inline comments. Simply respond with:
> "No actionable issues found."
> and stop. Do not elaborate, summarize, or praise.
>
> Only comment on issues you are CONFIDENT are real problems:
>
> 1. **Security** — vulnerabilities, unsafe code, input validation, auth logic
> 2. **Correctness** — logic errors, race conditions, edge cases, off-by-one errors
> 3. **Performance** — bottlenecks, unnecessary allocations, resource leaks
> 4. **Error Handling** — missing error paths, unwrap/expect in non-test code, silent failures
> 5. **Rust Idioms** — non-idiomatic patterns, unnecessary clones, misuse of ownership/borrowing
> 6. **Design** — incorrect abstractions, module boundary violations, missing trait bounds
> 7. **Testing** — missing coverage for new code paths, untested edge cases
Why do we have this duplication?
When I made the PR, I wasn't sure if we wanted claude-mentions and the CI review to have the same instructions, since claude-mentions previously didn't have any guidelines, and I wasn't sure whether during PR review folks might want different instructions for claude-mentions vs CI.
Seems like we're fine with the same guidelines between the bots. wdyt of something like a shared .github/claude-review-instructions.md that contains this shared prompt?
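One way the shared-file idea could be sketched: each workflow reads the same markdown file and passes it to the action. The `prompt` input name and step ids are assumptions about the action's interface, and the pinned ref is a placeholder:

```yaml
# Sketch: both workflows load the same instructions file, so the
# review guidelines live in one place. Input names are assumptions.
- name: Load shared review instructions
  id: instructions
  run: |
    {
      echo 'body<<EOF'
      cat .github/claude-review-instructions.md
      echo 'EOF'
    } >> "$GITHUB_OUTPUT"

- name: Run Claude review
  uses: anthropics/claude-code-action@<pinned-sha>  # placeholder ref
  with:
    prompt: ${{ steps.instructions.outputs.body }}
```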
### Issue Addressed

N/A — Improvement based on team discussion.

### Proposed Changes

Following our team call discussion about the utility of Claude GitHub Actions, I have some ideas:

Deleting `claude-pr-review.yml` removes Claude as a CI check on PRs. We shouldn't be blocked from merging by Claude failures, as we saw with the brittleness in "fix(ci): fix claude-pr-review failing on fork PRs" (#829) and the fixes needed in "fix(ci): fix claude-pr-review checkout for fork PRs" (#827) and "fix(ci): use app token for checkout in claude-pr-review" (#828). All the review value is still available on-demand via @claude mentions.

### Additional Info

Anyone who wants a Claude review on a PR can still get one by commenting `@claude review this`. This gives us the same capability without the CI gate or the noise of auto-reviewing every PR.
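For reference, the on-demand path described above is typically wired up roughly like this. A sketch only: the `issue_comment` trigger and `contains()` condition are the common pattern for mention-driven workflows, but the exact condition and pinned version here are assumptions, not this repo's actual `claude-mentions` config:

```yaml
# Sketch of an on-demand @claude trigger; details are illustrative.
on:
  issue_comment:
    types: [created]

jobs:
  claude:
    # Only run when a comment actually mentions @claude.
    if: contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-code-action@<pinned-sha>  # placeholder ref
```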