Skip to content

Comments

Update LS to Support Agent Evaluations in BI#686

Merged
LakshanWeerasinghe merged 28 commits intoballerina-platform:mainfrom
dan-niles:bi-test-creation-updates
Feb 17, 2026
Merged

Update LS to Support Agent Evaluations in BI#686
LakshanWeerasinghe merged 28 commits intoballerina-platform:mainfrom
dan-niles:bi-test-creation-updates

Conversation

@dan-niles
Copy link
Contributor

@dan-niles dan-niles commented Feb 10, 2026

Purpose

Related to wso2/product-ballerina-integrator#2333

Changes Include:

  • Update test creation with support for additional parameters for @test:Config - dataProvider, after, before, dependsOn, runs, minPassRate
  • Add support for automatically creating a data provider function when users provide an eval set file.
  • Adds support for reusing AI Agent instances from BI side panel.
  • Move chat agent back to module level (Required for writing agent evaluations)

This pull request enhances the Ballerina Language Server to add BI-side support for agent-based evaluations and to expand test configuration and eval-set workflows. Main outcomes:

  • Agent instance management and reuse

    • Adds AGENTS and AGENT_RUN node kinds and a new AgentRunBuilder to construct agent run calls with inferred-type handling and unique variable naming.
    • Adds getAvailableAgents API and JSON-RPC endpoint so BI can list and reuse agent instances from the side panel.
    • Simplifies agent metadata and available-node resources to support "Create or reuse an Agent."
  • Service and module integration

    • Moves chat agent usage to module scope and simplifies generated service code (AiChatServiceBuilder) by removing initialization scaffolding and version-gated logic, enabling agent evaluations to be written at module level.
  • Test configuration and eval-set support

    • Extends @test:Config to include dataProvider, dataProviderMode, evalSetFile, dependsOn, after, before, runs, and minPassRate.
    • Automatically generates a data-provider function when an eval-set file is provided (evalSet mode) and adds support for ai:ConversationThread parameters in test functions.
    • Enhances TestManagerService and Utils with helpers to locate, create, and update data-provider functions, extract/update evalSet paths, and detect AI imports. Adds templates and sample resources for eval-set workflows.
  • Public API and model updates

    • Adds AI-related constants and public helpers (e.g., eval-set provider template, AI import check).
    • Expands Annotation.ConfigAnnotationBuilder and related models to expose the new Config fields.
    • Updates test resources and samples to cover minPassRate and eval-set scenarios.

Overall, the changes enable BI to create and reuse AI agents for evaluations, simplify agent usage at module level, and provide richer, eval-set-driven test configuration and generation.

@coderabbitai
Copy link

coderabbitai bot commented Feb 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Added AGENTS and AGENT_RUN node kinds and an AgentRunBuilder; exposed a core API to list agents; introduced an LS endpoint for available agents; simplified service AI agent initialization; and added extensive test-manager evalSet/data-provider and test-config handling.

Changes

Cohort / File(s) Summary
Core node kinds & builders
flow-model-generator/modules/flow-model-generator-core/src/main/java/.../model/NodeKind.java, .../NodeBuilder.java, .../node/AgentRunBuilder.java
Added AGENTS and AGENT_RUN enum values, implemented AgentRunBuilder, and registered it in NodeBuilder for agent run node construction (templating, inferred-type handling, param filtering, text-edit aggregation).
Available nodes generator & LS API
flow-model-generator/modules/flow-model-generator-core/src/main/java/.../AvailableNodesGenerator.java, flow-model-generator/modules/flow-model-generator-ls-extension/src/main/java/.../FlowModelGeneratorService.java
Rewrote agent category assembly to emit AGENTS, added helper to find agent category by symbol, added public JsonArray getAvailableAgents(LinePosition), and added LS JSON-RPC getAvailableAgents(...) delegating to the core method.
Available-nodes test resources
flow-model-generator/modules/flow-model-generator-ls-extension/src/test/resources/available_nodes/config/*.json, .../agents_manager/config/available_agents.json
Standardized Agent entries: description → "Create or reuse an Agent." and codedata simplified from detailed AGENT_CALL payload to { "node": "AGENTS" }.
Service model generator (AI chat)
service-model-generator/modules/service-model-generator-ls-extension/src/main/java/.../AiChatServiceBuilder.java, test templates .../add_ai_service_*.json
Removed multi-step/init scaffolding and version gating; simplified agent call generation to use direct agent variable reference and updated service member/template generation accordingly.
Test-manager constants
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/.../Constants.java
Added AI-related public constants (e.g., MODULE_AI, IMPORT_AI_STMT, data-provider/evalSet defaults, quoting/isolated keyword, AI_CONVERSATION_THREAD_TYPE).
Test-manager service & utils (evalSet support)
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/.../TestManagerService.java, .../Utils.java
Threaded ModulePartNode through models; added evalSet data-provider handling (AI import checks, generation/insertion of evalSet provider templates, ai:Trace parameter injection, evalSet extraction/updating) and many helper utilities; updated public method signatures.
Annotation model (Config) changes
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/.../model/Annotation.java
Extended ConfigAnnotationBuilder with properties and setters: dataProvider, dataProviderMode, evalSetFile, dependsOn, after, before, runs, minPassRate; build() now aggregates full property list.
Test resources, examples & patches
test-manager-service/modules/test-manager-service-ls-extension/src/test/resources/**
Added/updated JSON configs and sample sources demonstrating new config fields (dataProvider, dataProviderMode, evalSetFile, runs, minPassRate), evalSet-related edits, and new sample Ballerina project files used by tests.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant LS as FlowModelGeneratorService\n(LS)
    participant Core as AvailableNodesGenerator\n(Core)
    participant Builder as AgentRunBuilder

    Client->>LS: getAvailableAgents(request)
    LS->>Core: generator.getAvailableAgents(position)
    Core->>Core: locate AGENTS category via predicate
    Core->>Builder: map symbol -> NodeKind.AGENT_RUN / AGENTS
    Builder-->>Core: return built node metadata & text edits
    Core-->>LS: JsonArray of agent items
    LS-->>Client: FlowModelAvailableNodesResponse
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇
I nibbled through enums and stitched a run,
Agents now gather, one by one.
EvalSets hum, test threads align,
Builders craft the callable line.
A rabbit hops — the changes are done.

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is incomplete. While it lists specific changes, it is missing several required template sections including Goals, Approach, User stories, Release note, Documentation, Training, Certification, Marketing, Automation tests, Security checks, Samples, Related PRs, Migrations, Test environment, and Learning. Complete the PR description by filling out all required template sections, including Goals, Approach, User stories, Release notes, Documentation links, and other mandatory sections as specified in the template.
Docstring Coverage ⚠️ Warning Docstring coverage is 30.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change - adding support for agent evaluations in the BI language server, which is the primary objective of this PR.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@dan-niles dan-niles force-pushed the bi-test-creation-updates branch from dbce35d to e531e4d Compare February 12, 2026 08:01
@dan-niles dan-niles force-pushed the bi-test-creation-updates branch from 9433192 to 290214b Compare February 12, 2026 08:44
@dan-niles dan-niles marked this pull request as ready for review February 12, 2026 09:38
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In
`@flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/AvailableNodesGenerator.java`:
- Around line 420-422: The Agent type detection currently uses only
className.equals("Agent") in AvailableNodesGenerator (mapping to
NodeKind.AGENT_RUN) and in getAgent(), causing user-defined Agents to be
misclassified; add a shared helper (e.g., isAiAgent(Symbol classSymbol) similar
to isAiKnowledgeBase/isAiModelProvider) that checks the symbol's module/org
(restrict to the Ballerina AI module/org) and replace the two name-only checks
with calls to this helper, and update the NodeBuilder/NodeKind mapping code in
AvailableNodesGenerator to use isAiAgent(...) when deciding AGENT_RUN.

In
`@flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java`:
- Around line 99-110: overrideVariableName in AgentRunBuilder currently calls
NameUtil.generateVariableName with the hardcoded type "string", causing generic
names; change it to derive the type hint from the actual agent call/return type
(use the inferred return type available in the TemplateContext or the agent call
builder), e.g., obtain the operation's return type symbol/name and pass that to
NameUtil.generateVariableName instead of "string", falling back to a contextual
default like "agentResponse" if no type can be determined; apply the same fix
pattern to AgentCallBuilder where it also uses "string" and keep using
Property.VARIABLE_KEY, TemplateContext.getAllVisibleSymbolNames(),
NameUtil.generateVariableName(...) and AiUtils.createUpdatedProperty(...) to
update the property.

In
`@service-model-generator/modules/service-model-generator-ls-extension/src/main/java/io/ballerina/servicemodelgenerator/extension/builder/service/AiChatServiceBuilder.java`:
- Around line 102-105: The generated service references agentVarName
(constructed from agentName + "Agent") without ensuring that symbol exists,
causing potential undefined references in AiChatServiceBuilder; update the
builder to validate or provide a fallback before emitting code by checking for
the agent variable in the module/namespace where buildServiceNodeBody and
getServiceMembers are used, and if missing either inject a minimal agent
declaration (e.g., a default agent instance) or switch to a documented default
agent name; ensure this check happens prior to calling
buildServiceNodeStr/buildServiceNodeBody and replicate the same guard where
agentVarName is later referenced (around the code paths corresponding to lines
~151-154) so generated code never references an uninitialized agentVarName.

In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/model/Annotation.java`:
- Around line 68-71: The UI label casing is inconsistent: the method evalSetFile
currently calls value("Evalset File", ...) but elsewhere the default label uses
"EvalSet File"; update the label string in evalSetFile to "EvalSet File" to
match the default and fix the inconsistency, and apply the same correction to
the other setter method that uses "Evalset File" (the similar occurrence around
lines 141-145) so both use "EvalSet File" when calling value(...).

In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java`:
- Around line 584-598: The code in TestManagerService currently checks
functionName.contains("loadConversationThreads") which is too loose; update the
check in the FunctionCallExpressionNode handling to match the exact function
identifier (e.g., the fully qualified name or the exact source string) instead
of contains — use equals or compare the qualified name obtained from
funcCall.functionName().toSourceCode().trim() against the expected
"ai:loadConversationThreads" (or the exact unqualified "loadConversationThreads"
if appropriate) so only the intended function is matched; adjust the condition
where functionName is computed and used to return the argument lineRange
accordingly.
- Around line 237-243: Update the inline comment above
addAiConversationThreadParameter(request.function()) to correctly name the type
being added: replace "Add ai:Trace parameter" with "Add ai:ConversationThread
parameter" (referencing Constants.AI_CONVERSATION_THREAD_TYPE) so the comment
matches the actual type; ensure nearby comments around
updateDataProviderField(request.function(), dataProviderFunctionName) remain
accurate.

In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java`:
- Around line 289-305: The current check using
functionName.contains("loadConversationThreads") is too broad; in Utils.java
replace that condition with a precise match such as
functionName.equals("loadConversationThreads") or, to account for
module-qualified names, functionName.equals("ai:loadConversationThreads") or
functionName.endsWith(":loadConversationThreads"), so only the intended
ai:loadConversationThreads call is matched in the FunctionCallExpressionNode
handling code.
🧹 Nitpick comments (5)
flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java (1)

43-48: Update class Javadoc to reflect actual purpose.

The class documentation currently says "Represents a function call node" which appears to be copied from another builder. It should accurately describe this class's purpose.

📝 Suggested documentation fix
 /**
- * Represents a function call node.
+ * Represents an agent run node builder for constructing and emitting source code for Ballerina agent run calls.
  *
  * `@since` 1.5.1
  */
 public class AgentRunBuilder extends CallBuilder {
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java (2)

228-235: Consider extracting default evalSetFile as a constant.

The hardcoded "session.json" fallback could be extracted to a constant in the Constants class for better maintainability and discoverability.

♻️ Suggested refactor

In Constants.java:

public static final String DEFAULT_EVALSET_FILE = "session.json";

Then update this code:

                     String evalSetFile = getEvalSetFile(request.function());
                     if (evalSetFile == null || evalSetFile.isEmpty()) {
-                        evalSetFile = "session.json"; // Default fallback
+                        evalSetFile = Constants.DEFAULT_EVALSET_FILE;
                     }

256-270: Consider extracting common annotation field extraction logic.

The methods getDataProviderMode, getEvalSetFile, and getDataProviderName share nearly identical iteration logic over annotations. Consider extracting a common helper method.

♻️ Suggested refactor
private String getConfigFieldValue(
        io.ballerina.testmanagerservice.extension.model.TestFunction function,
        String fieldName) {
    if (function.annotations() == null) {
        return null;
    }
    for (Annotation annotation : function.annotations()) {
        if ("Config".equals(annotation.name())) {
            for (Property field : annotation.fields()) {
                if (fieldName.equals(field.originalName())) {
                    return field.value() != null 
                        ? field.value().toString().replaceAll("\"", "") 
                        : null;
                }
            }
        }
    }
    return null;
}

Then simplify:

private String getDataProviderMode(...) {
    return getConfigFieldValue(function, "dataProviderMode");
}

private String getEvalSetFile(...) {
    return getConfigFieldValue(function, "evalSetFile");
}

private String getDataProviderName(...) {
    return getConfigFieldValue(function, "dataProvider");
}
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java (2)

230-308: Consider consolidating duplicated AST traversal logic.

The methods extractEvalSetFileFromDataProvider, extractFromStatements, and extractFromExpression share very similar traversal logic with findEvalSetFilePathLocation, findFilePathInStatements, and findFilePathInExpression in TestManagerService.java. The main difference is the return type (String vs Optional).

Consider creating a common visitor or helper that can return both the extracted value and its location, reducing duplication.


532-555: Consider using String.format for template generation.

The method works correctly, but using String.format or a text block could improve readability for the generated function template.

♻️ Alternative using text block (Java 15+)
public static String getEvalSetDataProviderFunctionTemplate(String functionName, String evalSetFilePath) {
    return """
            
            
            isolated function %s() returns map<[%s]>|error {
            \treturn ai:loadConversationThreads("%s");
            }""".formatted(functionName, Constants.AI_CONVERSATION_THREAD_TYPE, evalSetFilePath);
}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/test/resources/update_test_function/config/update_evalset_file_config1.json`:
- Around line 145-156: The expected newText for the `@test`:Config block is
missing the dataProviderMode field, so update the newText value in the JSON
snippet to include dataProviderMode: evalSet alongside groups and dataProvider;
locate the JSON entry where "newText" contains "@test:Config{\n groups:
[\"evalset\"],\n dataProvider: loadEvalSetData\n}" and modify it to include
"dataProviderMode: evalSet" in the config block so the test validates insertion
of that field.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/model/Annotation.java (1)

34-66: ⚠️ Potential issue | 🟠 Major

Keep dataProviderMode out of Config annotation fields.

dataProviderMode is being added as a regular Property and defaulted in build(), which will surface it in the Config annotation model/UI. Since this field is internal-only, move it out of the annotation fields list (e.g., store in codedata or a dedicated metadata structure) or ensure it’s hidden/non-editable and excluded from generated annotations.
Based on learnings: In the test manager service for Ballerina, dataProviderMode is an internal metadata field used to determine whether to load an evalset using a data provider. It is NOT a field of the test:Config annotation and should not appear in the generated test configuration annotations.

Also applies to: 133-139

🤖 Fix all issues with AI agents
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java`:
- Around line 270-278: The method addAiConversationThreadParameter currently
always appends a new ai:ConversationThread param and can create duplicate
parameters; update it to first inspect TestFunction.parameters() (guarding
against null/empty) and iterate existing FunctionParameter entries to see if any
have type equal to Constants.AI_CONVERSATION_THREAD_TYPE or variable/name equal
to "thread", and return early if found; only build and add the new
FunctionParameter via FunctionParameter.FunctionParameterBuilder when no
existing parameter matches.
🧹 Nitpick comments (1)
flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/AvailableNodesGenerator.java (1)

467-475: Consider simplifying by removing redundant try-catch.

The try-catch wrapping isAgentClass is redundant since getCategory already catches RuntimeException at line 462. Other similar methods like getKnowledgeBase, getVectorStore, and getChunkers (lines 489-502) use method references directly without try-catch, providing a consistent pattern.

♻️ Suggested simplification to match existing pattern
 private Optional<Category> getAgent(Symbol symbol) {
-    return getCategory(symbol, classSymbol -> {
-        try {
-            return isAgentClass(classSymbol);
-        } catch (Exception e) {
-            return false;
-        }
-    });
+    return getCategory(symbol, CommonUtils::isAgentClass);
 }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java`:
- Around line 276-302: The updateDataProviderField method currently only
replaces an existing dataProvider Property; modify it to also add a new
dataProvider Property when none exists: inside updateDataProviderField
(operating on TestFunction.function, Annotation named "Config" and Property),
after iterating fields if no "dataProvider" originalName was found, create a new
Property via Property.PropertyBuilder copying relevant
metadata/codedata/valueType/valueTypeConstraint/placeholder/optional/editable/advanced,
set value(functionName) and originalName("dataProvider"), and append it to
annotation.fields(); ensure this logic runs only for the "Config" annotation and
preserves existing field order semantics.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java (1)

110-116: ⚠️ Potential issue | 🟡 Minor

Handle empty Optional to avoid NoSuchElementException.

Line 112 calls .get() on the Optional returned by semanticModel.symbol(annotationNode) without checking if it's present. If the symbol is not found, this will throw NoSuchElementException.

🛡️ Suggested fix
     public static Annotation getAnnotationModel(AnnotationNode annotationNode, SemanticModel semanticModel,
                                                 ModulePartNode modulePartNode) {
-        AnnotationSymbol annotationSymbol = (AnnotationSymbol) semanticModel.symbol(annotationNode).get();
-        String annotName = annotationSymbol.getName().orElse("");
-        if (annotName.isEmpty()) {
+        Optional<AnnotationSymbol> symbolOpt = semanticModel.symbol(annotationNode)
+                .filter(s -> s instanceof AnnotationSymbol)
+                .map(s -> (AnnotationSymbol) s);
+        if (symbolOpt.isEmpty()) {
+            return null;
+        }
+        String annotName = symbolOpt.get().getName().orElse("");
+        if (annotName.isEmpty()) {
             return null;
         }
🧹 Nitpick comments (2)
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java (2)

607-619: Consider adding null check for range parameter.

Since getAnnotationsRange can return null (when annotations list is empty), callers passing its result directly to this method would cause a NullPointerException. Consider adding a defensive null check or documenting the precondition.

♻️ Suggested improvement
     public static String extractTextFromRange(io.ballerina.tools.text.TextDocument textDocument, LineRange range) {
+        if (range == null) {
+            return "";
+        }
         int start = textDocument.textPositionFrom(range.startLine());
         int end = textDocument.textPositionFrom(range.endLine());
         String text = textDocument.toString().substring(start, end);
         return text.replaceAll("^\"|\"$", "").trim();
     }

628-651: Consider avoiding string concatenation inside StringBuilder.append().

Lines 642 and 647 use string concatenation ("map<[" + ... + "]>|error" and "return ai:loadConversationThreads(\"" + ... + "\");") inside append() calls, which creates intermediate String objects. For consistency with the rest of the method, consider using multiple append() calls.

♻️ Suggested improvement
                 .append(Constants.KEYWORD_RETURNS)
                 .append(Constants.SPACE)
-                .append("map<[" + Constants.AI_CONVERSATION_THREAD_TYPE + "]>|error")
+                .append("map<[")
+                .append(Constants.AI_CONVERSATION_THREAD_TYPE)
+                .append("]>|error")
                 .append(Constants.SPACE)
                 .append(Constants.OPEN_CURLY_BRACE)
                 .append(Constants.LINE_SEPARATOR)
                 .append(Constants.TAB_SEPARATOR)
-                .append("return ai:loadConversationThreads(\"" + evalSetFilePath + "\");")
+                .append("return ai:loadConversationThreads(\"")
+                .append(evalSetFilePath)
+                .append("\");")
                 .append(Constants.LINE_SEPARATOR)

Copy link
Member

@MohamedSabthar MohamedSabthar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dan-niles is there a place we conditionally enable eval features based on ai module version and lang version here? Or is it done in the extension?

@dan-niles
Copy link
Contributor Author

@dan-niles is there a place we conditionally enable eval features based on ai module version and lang version here? Or is it done in the extension?

@MohamedSabthar I have implemented it on the extension side.

MohamedSabthar
MohamedSabthar previously approved these changes Feb 13, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enhances the Ballerina Language Server to support agent-based evaluations in the Ballerina Integrator (BI). The primary goal is to enable BI-side creation and reuse of AI agents for evaluations, simplify agent usage at the module level, and provide richer test configuration with eval-set-driven workflows.

Changes:

  • Extended @test:Config annotation to support additional parameters (dataProvider, dataProviderMode, evalSetFile, dependsOn, after, before, runs, minPassRate) for comprehensive test configuration
  • Added support for automatically generating data provider functions when users provide an eval-set file, enabling agent-based evaluation workflows
  • Introduced new node kinds (AGENTS, AGENT_RUN) and AgentRunBuilder to construct agent run calls with inferred-type handling and unique variable naming
  • Moved AI agent usage from service-level to module-level scope by removing initialization scaffolding in AiChatServiceBuilder, enabling agent evaluations to be written at module level
  • Added getAvailableAgents API and JSON-RPC endpoint to support agent instance reuse from the BI side panel

Reviewed changes

Copilot reviewed 61 out of 61 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
test-manager-service/.../Annotation.java Extended ConfigAnnotationBuilder with new fields for test configuration (dataProvider, evalSetFile, runs, minPassRate, etc.)
test-manager-service/.../Utils.java Added helper methods for evalSet handling, including extraction and updating of evalSet file paths, AI import checks, and evalSet data provider template generation
test-manager-service/.../TestManagerService.java Enhanced addTestFunction and updateTestFunction to support evalSet mode, including automatic generation of data provider functions and parameter injection
test-manager-service/.../Constants.java Added constants for AI module, evalSet mode, and data provider handling
service-model-generator/.../AiChatServiceBuilder.java Simplified agent handling by removing service-level initialization and version checks, enabling module-level agent usage
flow-model-generator/.../AgentRunBuilder.java New builder class for agent run calls with inferred-type handling and unique variable naming
flow-model-generator/.../AgentCallBuilder.java Updated to use cached templates and improved type descriptor labeling
flow-model-generator/.../NodeKind.java Added AGENTS and AGENT_RUN node kinds
flow-model-generator/.../AvailableNodesGenerator.java Updated to support agent availability queries and simplified agent node metadata
flow-model-generator/.../AiUtils.java Added utility methods for property manipulation with updated labels
Test resource files Added comprehensive test resources for minPassRate and evalSet features

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@LakshanWeerasinghe LakshanWeerasinghe merged commit 444c978 into ballerina-platform:main Feb 17, 2026
4 checks passed
@dan-niles dan-niles deleted the bi-test-creation-updates branch February 17, 2026 03:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants