
Add a workflow based agent #18

Closed
swghosh wants to merge 2 commits into main from poc

Conversation


@swghosh swghosh commented Feb 17, 2026

Summary by CodeRabbit

  • New Features

    • Multi-PR workflow orchestration with automatic pull request generation, tracking, and management across multiple stages
    • Repository selection from team repositories with product grouping and detailed metadata display
    • Working directory configuration support for flexible execution
    • Live pull request tracking with created PR links displayed in the UI
  • UI/UX Enhancements

    • Expanded workflow information panel with PR progression visualization
    • Improved layout and form structure with "Start Workflow" interface
    • Repository discovery and selection dropdown with contextual information

Signed-off-by: Swarup Ghosh <swghosh@redhat.com>

coderabbitai bot commented Feb 17, 2026

📝 Walkthrough


The changes introduce a multi-PR workflow system replacing single-command operations. The system now includes repository management via CSV loading, new result data structures (PRResult, WorkflowResult), updated agent orchestration, server endpoints for workflow submission and repository listing, and UI updates for workflow initiation and PR tracking.

Changes

  • Agent Layer Refactor (server/agent.py): Introduced multi-PR workflow orchestration, replacing the single-command flow. Added the data structures PRResult and WorkflowResult. Implemented repository management via load_team_repos() and get_repo_info(). Updated run_workflow() and run_agent() to return WorkflowResult instead of AgentResult.
  • Server API Expansion (server/server.py): Added workflow and legacy job submission paths (submit_workflow_job, submit_legacy_job). Introduced a repository listing endpoint (/repos). Extended the job representation to track mode, repo, and PRs. Added async workers (_run_workflow_job, _run_legacy_job). Created a new API endpoint for workflow initiation (api_workflow).
  • Frontend UI Updates (server/homepage.html): Increased the card width from 720px to 800px. Replaced the single command form with a structured workflow form, including a target repository select (populated from /repos) and a working directory field. Added a PR display area to surface created pull requests. Updated the submission logic to send repo instead of command.
  • Configuration Removal (team-repos.txt): Removed all guidance content and repository URL references. The file is now empty.

Sequence Diagram

sequenceDiagram
    participant Client as Client (UI)
    participant Server as Server
    participant Agent as ClaudeAgent
    participant WorkflowEngine as Workflow Engine

    Client->>Server: POST /api/v1/oape-workflow<br/>(ep_url, repo, cwd)
    activate Server
    Server->>Server: Validate inputs<br/>Initialize workflow job
    Server-->>Client: Return job_id + status URL
    Server->>Agent: async run_workflow()<br/>(repo, ep_url, working_dir)
    deactivate Server
    
    activate Agent
    Agent->>Agent: Load repo info<br/>Build workflow prompt
    Agent->>WorkflowEngine: Execute orchestrated flow
    activate WorkflowEngine
    WorkflowEngine->>WorkflowEngine: PR1: init/api-generate/tests
    WorkflowEngine->>WorkflowEngine: review-and-fix → raise PR
    WorkflowEngine->>WorkflowEngine: PR2: api-implement
    WorkflowEngine->>WorkflowEngine: review-and-fix → raise PR
    WorkflowEngine->>WorkflowEngine: PR3: e2e-generate
    WorkflowEngine->>WorkflowEngine: review-and-fix → raise PR
    deactivate WorkflowEngine
    Agent->>Agent: Collect PRResult list<br/>Calculate costs
    Agent-->>Agent: Return WorkflowResult<br/>(output, cost_usd, prs)
    deactivate Agent
    
    Server->>Server: Update job status<br/>Populate prs list
    Server-->>Client: Stream: result + prs<br/>via SSE
    Client->>Client: Render created PRs<br/>Display links

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Add agent jobs flow #13: Directly modifies server/agent.py to introduce workflow orchestration, data structures (WorkflowResult/PRResult), and run_agent/run_workflow functions that this PR extends.
  • Integrate with Claude Agent SDK #7: Related through agent-server integration—both PRs connect Claude agent SDK functionality with server endpoints and job management infrastructure.
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check (❓ Inconclusive): The title 'Add a workflow based agent' is somewhat vague and doesn't specifically convey the main changes, which involve restructuring from single-command flows to multi-PR workflows, adding repository management, and introducing new data structures like WorkflowResult and PRResult. Resolution: consider a more specific title like 'Introduce multi-PR workflow orchestration with repository management' or 'Add workflow-based agent with PR orchestration' to better reflect the primary architectural changes.

✅ Passed checks (3 passed)

  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Docstring Coverage (✅ Passed): Docstring coverage is 80.95%, which meets the required threshold of 80.00%.
  • Merge Conflict Detection (✅ Passed): No merge conflicts detected when merging into main.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

🧪 Generate unit tests (beta)

  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch poc

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/server.py (1)

47-47: ⚠️ Potential issue | 🟡 Minor

In-memory job store lacks cleanup, potential memory leak.

The jobs dict grows unboundedly as users submit workflows. Long-running servers will accumulate completed jobs indefinitely. Consider implementing TTL-based cleanup or a max job count.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/server.py` at line 47, The global in-memory jobs dict is unbounded and
can leak memory; modify the code that creates/updates entries (reference the
jobs dict) to implement cleanup: add a TTL timestamp field to each job entry and
run a periodic cleaner (e.g., background thread or async task) that removes
entries older than the TTL, or enforce a max job count and evict
oldest/failed/completed jobs when limit is reached; ensure you update
job-creation logic to set the timestamp and job-completion logic (where jobs are
marked done) to optionally schedule immediate removal or mark for later cleanup.
🧹 Nitpick comments (6)
server/agent.py (4)

123-191: Hardcoded placeholder OCPBUGS-0 in workflow prompt.

The prompt uses OCPBUGS-0 as a placeholder ticket ID (lines 152, 161, 169). While documented in the "Important Notes" section (line 186), this could cause confusion or issues if the review command validates ticket formats. Consider making this configurable or extracting it as a constant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/agent.py` around lines 123 - 191, The prompt hardcodes the placeholder
ticket ID "OCPBUGS-0" inside _build_workflow_prompt (used in the "/oape:review"
steps and Notes); make the ticket ID configurable by adding a new parameter
(e.g., review_ticket_id) to _build_workflow_prompt (and callers) or read it from
repo_info (e.g., repo_info["review_ticket_id"]), replace all instances of
"OCPBUGS-0" with that variable, and update the Important Notes text to reference
the configurable value instead of a hardcoded string so the review command uses
a valid, maintainable ticket id.
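The suggested parameterization could look roughly like this; the function name, signature, and prompt text are illustrative stand-ins, not the repository's real `_build_workflow_prompt`:

```python
DEFAULT_REVIEW_TICKET = "OCPBUGS-0"  # keeps the existing placeholder as the default


def build_workflow_prompt(repo_name: str,
                          review_ticket_id: str = DEFAULT_REVIEW_TICKET) -> str:
    """Build a workflow prompt with a configurable review ticket ID.

    The template below is a sketch; the point is that the ticket ID is
    threaded through as a parameter instead of being hardcoded three times.
    """
    return (
        f"Work on repository {repo_name}.\n"
        f"Before raising each PR, run the review step against ticket "
        f"{review_ticket_id}.\n"
        f"Note: {review_ticket_id} is the configured review ticket for this run.\n"
    )
```

Callers (or `repo_info`) can then supply a real ticket ID per run while the placeholder remains the fallback.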

256-312: Duplicated message handling logic between run_workflow and run_agent.

The block processing logic (lines 256-312 in run_workflow and 409-465 in run_agent) is nearly identical. Consider extracting this into a shared helper function to reduce duplication and maintenance burden.

Also applies to: 409-465

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/agent.py` around lines 256 - 312, The assistant message block
processing in run_workflow and run_agent is duplicated; extract the shared logic
into a single helper (e.g., _process_assistant_message or emit_assistant_blocks)
that accepts the AssistantMessage (or message), the _emit callback, and
conv_logger, then move the repeated body (handling TextBlock, ThinkingBlock,
ToolUseBlock, ToolResultBlock, and default case plus json serialization and
logging) into that helper and call it from both run_workflow and run_agent to
remove duplication and keep behavior identical.
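A minimal sketch of the suggested extraction, using stand-in dataclasses in place of the SDK's real block types:

```python
from dataclasses import dataclass
from typing import Callable


# Stand-in block types; the real ones come from the Claude Agent SDK.
@dataclass
class TextBlock:
    text: str


@dataclass
class ToolUseBlock:
    name: str


def process_assistant_blocks(blocks, emit: Callable[[str], None]) -> None:
    """Shared handler, called from both run_workflow and run_agent.

    Centralizing the per-block dispatch means new block types only need to
    be handled in one place.
    """
    for block in blocks:
        if isinstance(block, TextBlock):
            emit(f"[text] {block.text}")
        elif isinstance(block, ToolUseBlock):
            emit(f"[tool] {block.name}")
        else:
            emit(f"[unknown] {type(block).__name__}")
```

Both entry points would pass their own `_emit` callback (and logger) into the helper rather than duplicating the dispatch.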

70-89: Partial match can return unexpected results.

The partial match logic at lines 80-87 returns a result if exactly one repo contains the search term. However, searching for "operator" could match multiple repos, returning None, while a more specific partial like "cert" might unexpectedly match "cert-manager-operator". Consider documenting this behavior or logging when partial matching is used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/agent.py` around lines 70 - 89, The partial-match behavior in
get_repo_info can yield surprising results (e.g., "operator" matching many repos
or "cert" matching "cert-manager-operator"); update get_repo_info to (1) log
when a partial match search runs and what candidate keys matched (use the module
logger), (2) if multiple matches exist, either prefer keys that start with the
query and then the longest match as a tie-breaker, or explicitly return None but
log the ambiguity so callers can surface it; reference get_repo_info and
TEAM_REPOS when making the change.

341-348: Broad exception handling loses error context.

Static analysis flagged except Exception (BLE001). While catching all exceptions prevents crashes, it may hide specific errors. Consider logging the exception type or re-raising after cleanup for unexpected exceptions.

Proposed fix to preserve error context
     except Exception as exc:
-        conv_logger.info(f"[error] {traceback.format_exc()}")
+        conv_logger.error(f"[error] {type(exc).__name__}: {traceback.format_exc()}")
         return WorkflowResult(
             output="",
             cost_usd=cost_usd,
-            error=str(exc),
+            error=f"{type(exc).__name__}: {exc}",
             conversation=conversation,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/agent.py` around lines 341 - 348, The current broad except Exception
block (around the WorkflowResult return using conv_logger) swallows error types;
narrow it by catching only expected exceptions (e.g., ValueError, RuntimeError)
when you want to convert to a WorkflowResult, and let other exceptions
propagate. If you must keep a generic handler, replace conv_logger.info with
conv_logger.exception (or include type via type(exc).__name__) and
traceback.format_exc() in the log, then re-raise unexpected exceptions instead
of returning; keep the existing WorkflowResult return only for explicitly
handled exception types (reference: the except block, conv_logger, and the
WorkflowResult construction).
server/server.py (2)

241-243: Callback on_message calls loop.create_task from potentially sync context.

The on_message callback (line 241-243) is invoked synchronously from within run_workflow, but it uses loop.create_task(_notify(...)). This is correct since loop is captured from the async context, but if run_workflow ever changes to run in a thread pool, this would break. Consider documenting this assumption.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/server.py` around lines 241 - 243, The on_message callback currently
captures loop and calls loop.create_task(_notify(condition)) which assumes
on_message is always invoked from the event loop; make this safe for
synchronous/threaded callers by using
asyncio.run_coroutine_threadsafe(_notify(condition), loop) instead of
loop.create_task, or alternatively add a clear docstring on on_message and
run_workflow stating the callback must only be invoked from the event loop;
reference symbols: on_message, run_workflow, loop, _notify, job_id, jobs.
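A self-contained sketch of the thread-safe variant using `asyncio.run_coroutine_threadsafe`; the callback wiring is simplified relative to the real server:

```python
import asyncio
import threading


async def _notify(messages: list[str], text: str) -> None:
    messages.append(text)


def make_on_message(loop: asyncio.AbstractEventLoop, messages: list[str]):
    def on_message(text: str) -> None:
        # Safe whether the caller runs on the event loop thread or a worker
        # thread: run_coroutine_threadsafe schedules onto the captured loop.
        asyncio.run_coroutine_threadsafe(_notify(messages, text), loop)
    return on_message


async def main() -> list[str]:
    loop = asyncio.get_running_loop()
    messages: list[str] = []
    on_message = make_on_message(loop, messages)
    # Simulate run_workflow moving to a thread pool and invoking the callback.
    t = threading.Thread(target=on_message, args=("hello",))
    t.start()
    t.join()
    await asyncio.sleep(0.1)  # let the scheduled coroutine run
    return messages
```

Unlike `loop.create_task`, this keeps working if the callback is ever invoked off the event loop thread.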

128-128: Fire-and-forget tasks may silently fail.

Static analysis flagged that asyncio.create_task return values are not stored (RUF006). If these tasks raise exceptions, they'll be silently ignored. Consider storing references and adding error handling via task.add_done_callback().

Example fix for error visibility
+def _handle_task_exception(task: asyncio.Task) -> None:
+    if task.exception():
+        logging.error(f"Background task failed: {task.exception()}")
+
 @app.post("/submit")
 async def submit_workflow_job(...):
     ...
-    asyncio.create_task(_run_workflow_job(job_id, ep_url, repo, working_dir))
+    task = asyncio.create_task(_run_workflow_job(job_id, ep_url, repo, working_dir))
+    task.add_done_callback(_handle_task_exception)
     return {"job_id": job_id}

Also applies to: 161-161, 335-335

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/server.py` at line 128, The create_task calls that fire-and-forget
_run_workflow_job (and the other asyncio.create_task uses) can drop exceptions;
capture each returned Task (e.g., task =
asyncio.create_task(_run_workflow_job(...))) and attach a done-callback that
logs and re-raises or handles exceptions (task.add_done_callback(lambda t:
_log_task_error(t, "<context>"))), implement a small helper
_log_task_error(task, context) that checks task.exception() and logs traceback
via the existing logger, and replace the bare asyncio.create_task(...)
invocations with this pattern for all occurrences.
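The done-callback pattern described above, sketched end to end; `spawn` and `_log_task_error` are hypothetical helper names:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


def _log_task_error(task: asyncio.Task, context: str) -> None:
    # Done-callback: surface exceptions that fire-and-forget tasks would swallow.
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        logger.error("background task %s failed: %r", context, exc)


def spawn(coro, context: str) -> asyncio.Task:
    """Create a task, keep a reference, and attach the error callback."""
    task = asyncio.create_task(coro)
    task.add_done_callback(lambda t: _log_task_error(t, context))
    return task
```

Each bare `asyncio.create_task(...)` call would become `spawn(..., "workflow job")` (or similar), so failures are logged instead of vanishing.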
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/agent.py`:
- Around line 45-67: The module currently calls load_team_repos() at import time
(creating TEAM_REPOS), which will crash the importer if TEAM_REPOS_CSV is
missing/malformed and also does not validate required columns; change this to
lazy or resilient loading by removing the top-level TEAM_REPOS =
load_team_repos() and instead provide a safe accessor (e.g., get_team_repos())
that calls load_team_repos() on first use, wraps file I/O in try/except to
handle FileNotFoundError/IOError and CSV parsing errors, and add explicit
validation inside load_team_repos() for required keys
("repo_url","base_branch","product","role") (raise or skip rows with a clear
logged error) so callers don’t get KeyError from missing columns.
- Around line 336-348: The WorkflowResult returned by run_workflow currently
never sets the prs field; update both the success and exception return paths in
run_workflow so WorkflowResult(prs=...) is populated by extracting PR metadata
from the conversation/tool results. Locate the conversation variable and parse
its messages/tool outputs (e.g., messages produced by the PR-creation tool or
assistant responses that contain pr_number, pr_url, branch_name, title) into
dicts matching the server's expected shape, then pass that list into
WorkflowResult(prs=parsed_prs, output=..., cost_usd=..., ...) in both the normal
return and the exception return; keep the parsing robust to missing fields and
default to an empty list when none are found.

In `@server/homepage.html`:
- Around line 124-150: The loadRepos() function currently only logs errors to
console; update it to surface failures to the user by handling both non-ok
responses and exceptions: inside loadRepos() (and the res.ok branch) detect
failure and update the UI—e.g., set repoSelect.disabled = true, replace or
append a visible error message element (create or reuse an element with id/class
like repoError) with a friendly message and optional details (err.message or
res.status), and ensure the placeholder option remains if desired; also clear
any previous optgroups on success. Use the existing symbols loadRepos,
repoSelect, and the fetch('/repos') response to locate the code to modify.
- Around line 219-225: The injected href uses escapeHtml(pr.pr_url) but that
doesn't prevent dangerous schemes (e.g., javascript:); update the rendering to
validate and normalize pr.pr_url before inserting into prItemsEl.innerHTML: in
the code that maps PRs (the block using prItemsEl.innerHTML and escapeHtml), add
a helper call (e.g., normalizePrUrl or verifyPrUrl) that returns a safe URL only
if it starts with "https://" (or "http://", per policy) otherwise returns a safe
placeholder (like "#") or an explicitly prefixed "https://"; use that sanitized
value for the href and keep escapeHtml(pr.title) for link text, and also ensure
anchors include rel="noopener noreferrer" when target="_blank".

In `@server/server.py`:
- Around line 98-129: Both submit_workflow_job and api_workflow duplicate
building the job dict and starting the same background task; extract a helper
(e.g., create_job_and_start_task) that accepts job parameters (ep_url, repo,
working_dir, mode) and returns job_id. Move construction of the job object
(status, mode, ep_url, repo, cwd, conversation, message_event, output, cost_usd,
error, prs) and the asyncio.create_task(_run_workflow_job(...)) call into that
helper, then replace the duplicated blocks in submit_workflow_job and
api_workflow to call the new helper and return the job_id. Ensure the helper
updates the shared jobs mapping and reuses the existing _run_workflow_job
function signature.

---

Outside diff comments:
In `@server/server.py`:
- Line 47: The global in-memory jobs dict is unbounded and can leak memory;
modify the code that creates/updates entries (reference the jobs dict) to
implement cleanup: add a TTL timestamp field to each job entry and run a
periodic cleaner (e.g., background thread or async task) that removes entries
older than the TTL, or enforce a max job count and evict oldest/failed/completed
jobs when limit is reached; ensure you update job-creation logic to set the
timestamp and job-completion logic (where jobs are marked done) to optionally
schedule immediate removal or mark for later cleanup.

---

Nitpick comments:
In `@server/agent.py`:
- Around line 123-191: The prompt hardcodes the placeholder ticket ID
"OCPBUGS-0" inside _build_workflow_prompt (used in the "/oape:review" steps and
Notes); make the ticket ID configurable by adding a new parameter (e.g.,
review_ticket_id) to _build_workflow_prompt (and callers) or read it from
repo_info (e.g., repo_info["review_ticket_id"]), replace all instances of
"OCPBUGS-0" with that variable, and update the Important Notes text to reference
the configurable value instead of a hardcoded string so the review command uses
a valid, maintainable ticket id.
- Around line 256-312: The assistant message block processing in run_workflow
and run_agent is duplicated; extract the shared logic into a single helper
(e.g., _process_assistant_message or emit_assistant_blocks) that accepts the
AssistantMessage (or message), the _emit callback, and conv_logger, then move
the repeated body (handling TextBlock, ThinkingBlock, ToolUseBlock,
ToolResultBlock, and default case plus json serialization and logging) into that
helper and call it from both run_workflow and run_agent to remove duplication
and keep behavior identical.
- Around line 70-89: The partial-match behavior in get_repo_info can yield
surprising results (e.g., "operator" matching many repos or "cert" matching
"cert-manager-operator"); update get_repo_info to (1) log when a partial match
search runs and what candidate keys matched (use the module logger), (2) if
multiple matches exist, either prefer keys that start with the query and then the
longest match as a tie-breaker, or explicitly return None but log the ambiguity
so callers can surface it; reference get_repo_info and TEAM_REPOS when making
the change.
- Around line 341-348: The current broad except Exception block (around the
WorkflowResult return using conv_logger) swallows error types; narrow it by
catching only expected exceptions (e.g., ValueError, RuntimeError) when you want
to convert to a WorkflowResult, and let other exceptions propagate. If you must
keep a generic handler, replace conv_logger.info with conv_logger.exception (or
include type via type(exc).__name__) and traceback.format_exc() in the log, then
re-raise unexpected exceptions instead of returning; keep the existing
WorkflowResult return only for explicitly handled exception types (reference:
the except block, conv_logger, and the WorkflowResult construction).

In `@server/server.py`:
- Around line 241-243: The on_message callback currently captures loop and calls
loop.create_task(_notify(condition)) which assumes on_message is always invoked
from the event loop; make this safe for synchronous/threaded callers by using
asyncio.run_coroutine_threadsafe(_notify(condition), loop) instead of
loop.create_task, or alternatively add a clear docstring on on_message and
run_workflow stating the callback must only be invoked from the event loop;
reference symbols: on_message, run_workflow, loop, _notify, job_id, jobs.
- Line 128: The create_task calls that fire-and-forget _run_workflow_job (and
the other asyncio.create_task uses) can drop exceptions; capture each returned
Task (e.g., task = asyncio.create_task(_run_workflow_job(...))) and attach a
done-callback that logs and re-raises or handles exceptions
(task.add_done_callback(lambda t: _log_task_error(t, "<context>"))), implement a
small helper _log_task_error(task, context) that checks task.exception() and
logs traceback via the existing logger, and replace the bare
asyncio.create_task(...) invocations with this pattern for all occurrences.

Comment on lines +45 to +67
def load_team_repos() -> dict[str, dict]:
    """Load team repositories from CSV file.

    Returns:
        dict mapping repo short name to {url, base_branch, product, role}
    """
    repos = {}
    with open(TEAM_REPOS_CSV, newline="") as f:
        reader = csv.DictReader(f)
        for row in reader:
            url = row["repo_url"].rstrip(".git")
            # Extract short name from URL (e.g., "cert-manager-operator" from URL)
            short_name = url.split("/")[-1]
            repos[short_name] = {
                "url": url,
                "base_branch": row["base_branch"],
                "product": row["product"],
                "role": row["role"],
            }
    return repos


TEAM_REPOS = load_team_repos()

⚠️ Potential issue | 🟡 Minor

Module-load CSV parsing can cause startup failures.

load_team_repos() is called at module import time (line 67). If team-repos.csv is missing or malformed, the entire module fails to import, making debugging harder. Consider lazy loading or graceful error handling.

Additionally, there's no validation that required CSV columns exist, which could cause KeyError at runtime if the CSV schema changes.

Proposed improvement for robustness
 def load_team_repos() -> dict[str, dict]:
     """Load team repositories from CSV file."""
     repos = {}
+    if not TEAM_REPOS_CSV.exists():
+        logging.warning(f"Team repos CSV not found: {TEAM_REPOS_CSV}")
+        return repos
     with open(TEAM_REPOS_CSV, newline="") as f:
         reader = csv.DictReader(f)
+        required_cols = {"repo_url", "base_branch", "product", "role"}
+        if not required_cols.issubset(reader.fieldnames or []):
+            logging.warning(f"Missing required columns in {TEAM_REPOS_CSV}")
+            return repos
         for row in reader:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/agent.py` around lines 45 - 67, The module currently calls
load_team_repos() at import time (creating TEAM_REPOS), which will crash the
importer if TEAM_REPOS_CSV is missing/malformed and also does not validate
required columns; change this to lazy or resilient loading by removing the
top-level TEAM_REPOS = load_team_repos() and instead provide a safe accessor
(e.g., get_team_repos()) that calls load_team_repos() on first use, wraps file
I/O in try/except to handle FileNotFoundError/IOError and CSV parsing errors,
and add explicit validation inside load_team_repos() for required keys
("repo_url","base_branch","product","role") (raise or skip rows with a clear
logged error) so callers don’t get KeyError from missing columns.

Comment on lines +336 to +348
        return WorkflowResult(
            output="\n".join(output_parts),
            cost_usd=cost_usd,
            conversation=conversation,
        )
    except Exception as exc:
        conv_logger.info(f"[error] {traceback.format_exc()}")
        return WorkflowResult(
            output="",
            cost_usd=cost_usd,
            error=str(exc),
            conversation=conversation,
        )

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the structure and find the files
git ls-files | grep -E "(agent|server\.py)" | head -20

Repository: shiftweek/oape-ai-e2e

Length of output: 97


🏁 Script executed:

# Get the run_workflow function in agent.py
rg -n "def run_workflow" --type=py -A 150 | head -200

Repository: shiftweek/oape-ai-e2e

Length of output: 9185


🏁 Script executed:

# Check WorkflowResult class definition
rg -n "class WorkflowResult" --type=py -A 20

Repository: shiftweek/oape-ai-e2e

Length of output: 956


🏁 Script executed:

# Look at server.py lines 250-258
sed -n '250,258p' server/server.py

Repository: shiftweek/oape-ai-e2e

Length of output: 329


🏁 Script executed:

# Search for any references to prs in the codebase
rg -n "\.prs\s*=" --type=py -B 3 -A 3

Repository: shiftweek/oape-ai-e2e

Length of output: 47


🏁 Script executed:

# Search for PRResult usage and how prs might be populated
rg -n "PRResult" --type=py -B 2 -A 2

Repository: shiftweek/oape-ai-e2e

Length of output: 490


🏁 Script executed:

# Get more context around server.py lines 250-258
sed -n '240,270p' server/server.py

Repository: shiftweek/oape-ai-e2e

Length of output: 1020


🏁 Script executed:

# Check if there's any parsing logic to extract PRs from conversation
rg -n "PR\|prs\|pull request" server/server.py -B 2 -A 2 | head -50

Repository: shiftweek/oape-ai-e2e

Length of output: 47


🏁 Script executed:

# Check the entire exception handling in run_workflow to see full return statement
sed -n '341,350p' server/agent.py

Repository: shiftweek/oape-ai-e2e

Length of output: 319


Populate prs list in WorkflowResult returned from run_workflow.

The prs field defaults to an empty list and is never set in either the success path (lines 336-340) or exception path (lines 343-348). However, server/server.py (lines 250-258) expects result.prs to contain PR information:

jobs[job_id]["prs"] = [
    {
        "pr_number": pr.pr_number,
        "pr_url": pr.pr_url,
        "branch_name": pr.branch_name,
        "title": pr.title,
    }
    for pr in result.prs
]

The workflow system prompt instructs the Claude agent to create PRs, and the conversation captures tool results from these operations. The prs list should be populated by parsing PR creation information from the conversation messages or tool results, otherwise the server will always report an empty PR list to clients.

🧰 Tools
🪛 Ruff (0.15.0)

[warning] 341-341: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/agent.py` around lines 336 - 348, The WorkflowResult returned by
run_workflow currently never sets the prs field; update both the success and
exception return paths in run_workflow so WorkflowResult(prs=...) is populated
by extracting PR metadata from the conversation/tool results. Locate the
conversation variable and parse its messages/tool outputs (e.g., messages
produced by the PR-creation tool or assistant responses that contain pr_number,
pr_url, branch_name, title) into dicts matching the server's expected shape,
then pass that list into WorkflowResult(prs=parsed_prs, output=...,
cost_usd=..., ...) in both the normal return and the exception return; keep the
parsing robust to missing fields and default to an empty list when none are
found.
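One hedged way to populate `prs` is to scan the conversation for GitHub PR URLs. The message shape assumed here (dicts with `text`, and optional `branch` and `title` keys) is an illustration, not the project's actual conversation format:

```python
import re

PR_URL_RE = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+/pull/(\d+)")


def extract_prs(conversation: list[dict]) -> list[dict]:
    """Parse PR metadata out of conversation messages, deduplicating by URL.

    Robust to missing fields: branch/title default to empty strings, and an
    empty list is returned when no PR URLs appear.
    """
    prs: list[dict] = []
    seen: set[str] = set()
    for message in conversation:
        for match in PR_URL_RE.finditer(message.get("text", "")):
            url = match.group(0)
            if url in seen:
                continue
            seen.add(url)
            prs.append({
                "pr_number": int(match.group(1)),
                "pr_url": url,
                "branch_name": message.get("branch", ""),
                "title": message.get("title", ""),
            })
    return prs
```

`run_workflow` could then pass `prs=extract_prs(conversation)` in both return paths so the server's PR list is no longer always empty.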

Comment on lines +124 to +150
async function loadRepos() {
  try {
    const res = await fetch('/repos');
    if (res.ok) {
      const data = await res.json();
      // Group repositories by product
      const groups = {};
      data.repositories.forEach(repo => {
        if (!groups[repo.product]) groups[repo.product] = [];
        groups[repo.product].push(repo);
      });
      Object.entries(groups).forEach(([product, repos]) => {
        const optgroup = document.createElement('optgroup');
        optgroup.label = product;
        repos.forEach(repo => {
          const option = document.createElement('option');
          option.value = repo.short_name;
          option.textContent = `${repo.short_name} — ${repo.role} (${repo.base_branch})`;
          optgroup.appendChild(option);
        });
        repoSelect.appendChild(optgroup);
      });
    }
  } catch (err) {
    console.error('Failed to load repositories:', err);
  }
}

⚠️ Potential issue | 🟡 Minor

No user feedback when repository loading fails.

The loadRepos() function silently logs errors to console (line 148) but doesn't inform the user. If the /repos endpoint fails, the dropdown remains with only the placeholder option, and users won't know why.

Proposed fix to show error state
   } catch (err) {
     console.error('Failed to load repositories:', err);
+    const option = document.createElement('option');
+    option.value = '';
+    option.textContent = '(Failed to load repositories)';
+    option.disabled = true;
+    repoSelect.appendChild(option);
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (resulting function, with the added error option in the catch block):

async function loadRepos() {
  try {
    const res = await fetch('/repos');
    if (res.ok) {
      const data = await res.json();
      // Group repositories by product
      const groups = {};
      data.repositories.forEach(repo => {
        if (!groups[repo.product]) groups[repo.product] = [];
        groups[repo.product].push(repo);
      });
      Object.entries(groups).forEach(([product, repos]) => {
        const optgroup = document.createElement('optgroup');
        optgroup.label = product;
        repos.forEach(repo => {
          const option = document.createElement('option');
          option.value = repo.short_name;
          option.textContent = `${repo.short_name} — ${repo.role} (${repo.base_branch})`;
          optgroup.appendChild(option);
        });
        repoSelect.appendChild(optgroup);
      });
    }
  } catch (err) {
    console.error('Failed to load repositories:', err);
    const option = document.createElement('option');
    option.value = '';
    option.textContent = '(Failed to load repositories)';
    option.disabled = true;
    repoSelect.appendChild(option);
  }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/homepage.html` around lines 124 - 150, The loadRepos() function
currently only logs errors to console; update it to surface failures to the user
by handling both non-ok responses and exceptions: inside loadRepos() (and the
res.ok branch) detect failure and update the UI—e.g., set repoSelect.disabled =
true, replace or append a visible error message element (create or reuse an
element with id/class like repoError) with a friendly message and optional
details (err.message or res.status), and ensure the placeholder option remains
if desired; also clear any previous optgroups on success. Use the existing
symbols loadRepos, repoSelect, and the fetch('/repos') response to locate the
code to modify.

Comment on lines +219 to +225
prItemsEl.innerHTML = result.prs.map((pr, i) => `
<div class="pr-item">
<strong>PR #${i + 1}:</strong>
<a href="${escapeHtml(pr.pr_url)}" target="_blank">${escapeHtml(pr.title || pr.pr_url)}</a>
<span style="color:#888">(${escapeHtml(pr.branch_name || '')})</span>
</div>
`).join('');
⚠️ Potential issue | 🟠 Major

Potential XSS if pr.pr_url contains malicious content.

While escapeHtml is used for the link text, the href attribute at line 222 uses escapeHtml(pr.pr_url), which only escapes HTML entities. A malicious URL such as javascript:alert(1) would pass through intact. Consider validating that URLs start with https://.

Proposed fix to validate PR URLs
+function isValidUrl(url) {
+  return url && (url.startsWith('https://') || url.startsWith('http://'));
+}
+
 if (result.prs && result.prs.length > 0) {
   prListEl.style.display = 'block';
   prItemsEl.innerHTML = result.prs.map((pr, i) => `
     <div class="pr-item">
       <strong>PR #${i + 1}:</strong>
-      <a href="${escapeHtml(pr.pr_url)}" target="_blank">${escapeHtml(pr.title || pr.pr_url)}</a>
+      ${isValidUrl(pr.pr_url) 
+        ? `<a href="${escapeHtml(pr.pr_url)}" target="_blank">${escapeHtml(pr.title || pr.pr_url)}</a>`
+        : escapeHtml(pr.title || pr.pr_url)}
      <span style="color:#888">(${escapeHtml(pr.branch_name || '')})</span>
     </div>
   `).join('');
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/homepage.html` around lines 219 - 225, The injected href uses
escapeHtml(pr.pr_url) but that doesn't prevent dangerous schemes (e.g.,
javascript:); update the rendering to validate and normalize pr.pr_url before
inserting into prItemsEl.innerHTML: in the code that maps PRs (the block using
prItemsEl.innerHTML and escapeHtml), add a helper call (e.g., normalizePrUrl or
verifyPrUrl) that returns a safe URL only if it starts with "https://" (or
"http://", per policy) otherwise returns a safe placeholder (like "#") or an
explicitly prefixed "https://"; use that sanitized value for the href and keep
escapeHtml(pr.title) for link text, and also ensure anchors include
rel="noopener noreferrer" when target="_blank".

Comment on lines 98 to +129
@app.post("/submit")
async def submit_job(
async def submit_workflow_job(
ep_url: str = Form(...),
repo: str = Form(...),
cwd: str = Form(default=""),
):
"""Validate inputs, create a workflow background job, and return its ID.

This runs the full 3-PR workflow:
- PR #1: init → api-generate → api-generate-tests → review → raise PR
- PR #2: api-implement → review → raise PR
- PR #3: e2e-generate → review → raise PR
"""
_validate_ep_url(ep_url)
working_dir = _resolve_working_dir(cwd)

job_id = uuid.uuid4().hex[:12]
jobs[job_id] = {
"status": "running",
"mode": "workflow",
"ep_url": ep_url,
"repo": repo,
"cwd": working_dir,
"conversation": [],
"message_event": asyncio.Condition(),
"output": "",
"cost_usd": 0.0,
"error": None,
"prs": [],
}
asyncio.create_task(_run_workflow_job(job_id, ep_url, repo, working_dir))
return {"job_id": job_id}
🛠️ Refactor suggestion | 🟠 Major

Duplicated job initialization between submit_workflow_job and api_workflow.

Both endpoints (lines 114-128 and 321-335) create identical job structures and start the same background task. This violates DRY and risks divergence. Extract a shared helper.

Proposed refactor
+def _create_workflow_job(ep_url: str, repo: str, working_dir: str) -> str:
+    """Create a workflow job and start background task. Returns job_id."""
+    job_id = uuid.uuid4().hex[:12]
+    jobs[job_id] = {
+        "status": "running",
+        "mode": "workflow",
+        "ep_url": ep_url,
+        "repo": repo,
+        "cwd": working_dir,
+        "conversation": [],
+        "message_event": asyncio.Condition(),
+        "output": "",
+        "cost_usd": 0.0,
+        "error": None,
+        "prs": [],
+    }
+    asyncio.create_task(_run_workflow_job(job_id, ep_url, repo, working_dir))
+    return job_id
+

 @app.post("/submit")
 async def submit_workflow_job(
     ep_url: str = Form(...),
     repo: str = Form(...),
     cwd: str = Form(default=""),
 ):
     _validate_ep_url(ep_url)
     working_dir = _resolve_working_dir(cwd)
-    job_id = uuid.uuid4().hex[:12]
-    jobs[job_id] = { ... }
-    asyncio.create_task(_run_workflow_job(job_id, ep_url, repo, working_dir))
+    job_id = _create_workflow_job(ep_url, repo, working_dir)
     return {"job_id": job_id}

Also applies to: 299-342

🧰 Tools
🪛 Ruff (0.15.0)

[warning] 128-128: Store a reference to the return value of asyncio.create_task

(RUF006)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/server.py` around lines 98 - 129, Both submit_workflow_job and
api_workflow duplicate building the job dict and starting the same background
task; extract a helper (e.g., create_job_and_start_task) that accepts job
parameters (ep_url, repo, working_dir, mode) and returns job_id. Move
construction of the job object (status, mode, ep_url, repo, cwd, conversation,
message_event, output, cost_usd, error, prs) and the
asyncio.create_task(_run_workflow_job(...)) call into that helper, then replace
the duplicated blocks in submit_workflow_job and api_workflow to call the new
helper and return the job_id. Ensure the helper updates the shared jobs mapping
and reuses the existing _run_workflow_job function signature.

@swghosh
Collaborator Author

swghosh commented Feb 18, 2026

closing in favour of #21

@swghosh swghosh closed this Feb 18, 2026