
Critical Fixes: NameError in estimate_completion_tokens & Agent Infinite Loop (State Staleness) #49

@faisal-fida

Description

While running the Agent V2 with the Qwen-8B model, I encountered two critical issues that prevent the agent from functioning correctly. One causes a hard crash, and the other causes an infinite loop where the agent fails to "remember" previous errors during a multi-step execution.

Issue 1: NameError Crash

Location: backend/app/ai/agent_v2.py -> estimate_completion_tokens

The Bug:
Inside estimate_completion_tokens, the code passes build_id along (to the AgentV2 constructor), but build_id is never defined in that scope: it is neither a parameter of estimate_completion_tokens nor a local variable. Any call that triggers report title generation or token estimation therefore raises a NameError and crashes the application.

Error Log:

ERROR | root:estimate_completion_tokens:288 - Unexpected error in estimate_completion_tokens: name 'build_id' is not defined

Proposed Fix:
Update the function signature to accept build_id as an optional argument.

# In backend/app/ai/agent_v2.py

async def estimate_completion_tokens(
    self,
    db: AsyncSession,
    report_id: str,
    completion_data: CompletionCreate,
    current_user: User,
    organization: Organization,
    external_user_id: str = None,
    external_platform: str = None,
    build_id: str = None,  # <--- ADD THIS PARAMETER
) -> CompletionContextEstimateSchema:
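The crash pattern can be reproduced in isolation. Below is a minimal sketch with hypothetical function names (not the actual AgentV2 code): a function whose body references a name that is neither a parameter nor a local raises NameError at call time, and adding the name as an optional parameter resolves it.

```python
# Minimal reproduction of the NameError pattern (hypothetical names,
# heavily simplified from estimate_completion_tokens).

def estimate_before_fix(report_id: str):
    # The body references build_id, which is not a parameter or local,
    # so Python falls back to (a nonexistent) global -> NameError.
    return {"report_id": report_id, "build_id": build_id}

def estimate_after_fix(report_id: str, build_id: str = None):
    # build_id is now an optional parameter, so the reference resolves.
    return {"report_id": report_id, "build_id": build_id}

try:
    estimate_before_fix("r1")
except NameError as exc:
    print(f"crash: {exc}")  # name 'build_id' is not defined

print(estimate_after_fix("r1"))                    # build_id defaults to None
print(estimate_after_fix("r1", build_id="b42"))
```

Note that the error only surfaces when the function is actually called, which is why it can slip past import-time checks and only appear in production logs.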

Issue 2: Infinite Loop / "Amnesia Bug"

Location: backend/app/ai/agent_v2.py -> Main Execution Loop

The Bug:
When the agent fails a step (e.g., it returns invalid JSON, omits a required action, or hits a tool error) and the loop continues to retry, the agent often repeats the exact same mistake indefinitely.

Root Cause:
The variable history_summary is calculated once before the while loop starts. Inside the loop, if an error occurs, we add an observation to tool_observations, but history_summary is never recalculated. Consequently, the prompt passed to the LLM in the next iteration contains the stale history, so the LLM is unaware of the previous error.

Evidence:
Logs show prompt_tokens remaining static (e.g., 6376 -> 6376 -> 6376) across 10 iterations, proving the context is not growing.
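The staleness can be demonstrated with a stripped-down loop (hypothetical names, not the actual context-hub API): the summary is computed once before the loop, so the prompt assembled on each retry is identical even as error observations accumulate.

```python
# Simulation of the bug: the history summary is computed ONCE, before
# the loop, so later error observations never reach the prompt.
tool_observations = []

def get_history_summary(observations):
    # Stand-in for context_hub.get_history_summary: join observations.
    return "\n".join(observations)

history_summary = get_history_summary(tool_observations)  # computed once

prompt_sizes = []
for step in range(3):
    prompt = f"History:\n{history_summary}\nWhat next?"
    prompt_sizes.append(len(prompt))  # crude proxy for prompt_tokens
    # An error is recorded, but history_summary is never recalculated:
    tool_observations.append(f"step {step}: ERROR invalid JSON")

print(prompt_sizes)  # static across iterations, like 6376 -> 6376 -> 6376
```

Every iteration sees the same stale prompt, which matches the flat prompt_tokens readings in the logs.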

Proposed Fix:
Recalculate history_summary at the start of every loop iteration to ensure the prompt includes the latest observations (including errors).

# In backend/app/ai/agent_v2.py inside the main loop

while not analysis_complete:
    # ... existing code ...

    # ADD THIS: Recalculate history to include recent observations/errors
    if self.context_hub.observation_builder.tool_observations:
        history_summary = self.context_hub.get_history_summary(
            self.context_hub.observation_builder.to_dict()
        )

    # ... prompt building logic follows ...
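Applying the same recalculate-each-iteration idea to the simulation above (again a sketch with hypothetical names) shows the prompt growing every iteration, so each retry can see the previous errors:

```python
# Simulation of the fix: the summary is rebuilt at the top of every
# iteration, so each prompt includes all prior error observations.
tool_observations = []

def get_history_summary(observations):
    return "\n".join(observations)

prompt_sizes = []
for step in range(3):
    # Recalculate every iteration, mirroring the proposed fix:
    history_summary = get_history_summary(tool_observations)
    prompt = f"History:\n{history_summary}\nWhat next?"
    prompt_sizes.append(len(prompt))
    tool_observations.append(f"step {step}: ERROR invalid JSON")

print(prompt_sizes)  # strictly increasing: the context now grows
```

A growing prompt size across retries is a cheap signal (visible in token logs) that the model is actually being shown its own failure history.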

Impact

Applying these two fixes resolved the application crash and allowed the agent to properly iterate, "remember" its planning errors, and successfully exit the reasoning loop.
