Update LS to Support Agent Evaluations in BI#686
Update LS to Support Agent Evaluations in BI#686LakshanWeerasinghe merged 28 commits intoballerina-platform:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdded AGENTS and AGENT_RUN node kinds and an AgentRunBuilder; exposed a core API to list agents; introduced an LS endpoint for available agents; simplified service AI agent initialization; and added extensive test-manager evalSet/data-provider and test-config handling. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant LS as FlowModelGeneratorService\n(LS)
participant Core as AvailableNodesGenerator\n(Core)
participant Builder as AgentRunBuilder
Client->>LS: getAvailableAgents(request)
LS->>Core: generator.getAvailableAgents(position)
Core->>Core: locate AGENTS category via predicate
Core->>Builder: map symbol -> NodeKind.AGENT_RUN / AGENTS
Builder-->>Core: return built node metadata & text edits
Core-->>LS: JsonArray of agent items
LS-->>Client: FlowModelAvailableNodesResponse
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Comment |
dbce35d to
e531e4d
Compare
9433192 to
290214b
Compare
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In
`@flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/AvailableNodesGenerator.java`:
- Around line 420-422: The Agent type detection currently uses only
className.equals("Agent") in AvailableNodesGenerator (mapping to
NodeKind.AGENT_RUN) and in getAgent(), causing user-defined Agents to be
misclassified; add a shared helper (e.g., isAiAgent(Symbol classSymbol) similar
to isAiKnowledgeBase/isAiModelProvider) that checks the symbol's module/org
(restrict to the Ballerina AI module/org) and replace the two name-only checks
with calls to this helper, and update the NodeBuilder/NodeKind mapping code in
AvailableNodesGenerator to use isAiAgent(...) when deciding AGENT_RUN.
In
`@flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java`:
- Around line 99-110: overrideVariableName in AgentRunBuilder currently calls
NameUtil.generateVariableName with the hardcoded type "string", causing generic
names; change it to derive the type hint from the actual agent call/return type
(use the inferred return type available in the TemplateContext or the agent call
builder), e.g., obtain the operation's return type symbol/name and pass that to
NameUtil.generateVariableName instead of "string", falling back to a contextual
default like "agentResponse" if no type can be determined; apply the same fix
pattern to AgentCallBuilder where it also uses "string" and keep using
Property.VARIABLE_KEY, TemplateContext.getAllVisibleSymbolNames(),
NameUtil.generateVariableName(...) and AiUtils.createUpdatedProperty(...) to
update the property.
In
`@service-model-generator/modules/service-model-generator-ls-extension/src/main/java/io/ballerina/servicemodelgenerator/extension/builder/service/AiChatServiceBuilder.java`:
- Around line 102-105: The generated service references agentVarName
(constructed from agentName + "Agent") without ensuring that symbol exists,
causing potential undefined references in AiChatServiceBuilder; update the
builder to validate or provide a fallback before emitting code by checking for
the agent variable in the module/namespace where buildServiceNodeBody and
getServiceMembers are used, and if missing either inject a minimal agent
declaration (e.g., a default agent instance) or switch to a documented default
agent name; ensure this check happens prior to calling
buildServiceNodeStr/buildServiceNodeBody and replicate the same guard where
agentVarName is later referenced (around the code paths corresponding to lines
~151-154) so generated code never references an uninitialized agentVarName.
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/model/Annotation.java`:
- Around line 68-71: The UI label casing is inconsistent: the method evalSetFile
currently calls value("Evalset File", ...) but elsewhere the default label uses
"EvalSet File"; update the label string in evalSetFile to "EvalSet File" to
match the default and fix the inconsistency, and apply the same correction to
the other setter method that uses "Evalset File" (the similar occurrence around
lines 141-145) so both use "EvalSet File" when calling value(...).
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java`:
- Around line 584-598: The code in TestManagerService currently checks
functionName.contains("loadConversationThreads") which is too loose; update the
check in the FunctionCallExpressionNode handling to match the exact function
identifier (e.g., the fully qualified name or the exact source string) instead
of contains — use equals or compare the qualified name obtained from
funcCall.functionName().toSourceCode().trim() against the expected
"ai:loadConversationThreads" (or the exact unqualified "loadConversationThreads"
if appropriate) so only the intended function is matched; adjust the condition
where functionName is computed and used to return the argument lineRange
accordingly.
- Around line 237-243: Update the inline comment above
addAiConversationThreadParameter(request.function()) to correctly name the type
being added: replace "Add ai:Trace parameter" with "Add ai:ConversationThread
parameter" (referencing Constants.AI_CONVERSATION_THREAD_TYPE) so the comment
matches the actual type; ensure nearby comments around
updateDataProviderField(request.function(), dataProviderFunctionName) remain
accurate.
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java`:
- Around line 289-305: The current check using
functionName.contains("loadConversationThreads") is too broad; in Utils.java
replace that condition with a precise match such as
functionName.equals("loadConversationThreads") or, to account for
module-qualified names, functionName.equals("ai:loadConversationThreads") or
functionName.endsWith(":loadConversationThreads"), so only the intended
ai:loadConversationThreads call is matched in the FunctionCallExpressionNode
handling code.
🧹 Nitpick comments (5)
flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java (1)
43-48: Update class Javadoc to reflect actual purpose.The class documentation currently says "Represents a function call node" which appears to be copied from another builder. It should accurately describe this class's purpose.
📝 Suggested documentation fix
/** - * Represents a function call node. + * Represents an agent run node builder for constructing and emitting source code for Ballerina agent run calls. * * `@since` 1.5.1 */ public class AgentRunBuilder extends CallBuilder {test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java (2)
228-235: Consider extracting default evalSetFile as a constant.The hardcoded
"session.json"fallback could be extracted to a constant in theConstantsclass for better maintainability and discoverability.♻️ Suggested refactor
In
Constants.java:public static final String DEFAULT_EVALSET_FILE = "session.json";Then update this code:
String evalSetFile = getEvalSetFile(request.function()); if (evalSetFile == null || evalSetFile.isEmpty()) { - evalSetFile = "session.json"; // Default fallback + evalSetFile = Constants.DEFAULT_EVALSET_FILE; }
256-270: Consider extracting common annotation field extraction logic.The methods
getDataProviderMode,getEvalSetFile, andgetDataProviderNameshare nearly identical iteration logic over annotations. Consider extracting a common helper method.♻️ Suggested refactor
private String getConfigFieldValue( io.ballerina.testmanagerservice.extension.model.TestFunction function, String fieldName) { if (function.annotations() == null) { return null; } for (Annotation annotation : function.annotations()) { if ("Config".equals(annotation.name())) { for (Property field : annotation.fields()) { if (fieldName.equals(field.originalName())) { return field.value() != null ? field.value().toString().replaceAll("\"", "") : null; } } } } return null; }Then simplify:
private String getDataProviderMode(...) { return getConfigFieldValue(function, "dataProviderMode"); } private String getEvalSetFile(...) { return getConfigFieldValue(function, "evalSetFile"); } private String getDataProviderName(...) { return getConfigFieldValue(function, "dataProvider"); }test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java (2)
230-308: Consider consolidating duplicated AST traversal logic.The methods
extractEvalSetFileFromDataProvider,extractFromStatements, andextractFromExpressionshare very similar traversal logic withfindEvalSetFilePathLocation,findFilePathInStatements, andfindFilePathInExpressioninTestManagerService.java. The main difference is the return type (String vs Optional).Consider creating a common visitor or helper that can return both the extracted value and its location, reducing duplication.
532-555: Consider using String.format for template generation.The method works correctly, but using
String.formator a text block could improve readability for the generated function template.♻️ Alternative using text block (Java 15+)
public static String getEvalSetDataProviderFunctionTemplate(String functionName, String evalSetFilePath) { return """ isolated function %s() returns map<[%s]>|error { \treturn ai:loadConversationThreads("%s"); }""".formatted(functionName, Constants.AI_CONVERSATION_THREAD_TYPE, evalSetFilePath); }
...nerator-core/src/main/java/io/ballerina/flowmodelgenerator/core/AvailableNodesGenerator.java
Outdated
Show resolved
Hide resolved
...ator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java
Show resolved
Hide resolved
.../java/io/ballerina/servicemodelgenerator/extension/builder/service/AiChatServiceBuilder.java
Show resolved
Hide resolved
...e-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/model/Annotation.java
Show resolved
Hide resolved
...ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java
Show resolved
Hide resolved
...ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java
Outdated
Show resolved
Hide resolved
...ager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/test/resources/update_test_function/config/update_evalset_file_config1.json`:
- Around line 145-156: The expected newText for the `@test`:Config block is
missing the dataProviderMode field, so update the newText value in the JSON
snippet to include dataProviderMode: evalSet alongside groups and dataProvider;
locate the JSON entry where "newText" contains "@test:Config{\n groups:
[\"evalset\"],\n dataProvider: loadEvalSetData\n}" and modify it to include
"dataProviderMode: evalSet" in the config block so the test validates insertion
of that field.
...ls-extension/src/test/resources/update_test_function/config/update_evalset_file_config1.json
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/model/Annotation.java (1)
34-66:⚠️ Potential issue | 🟠 MajorKeep
dataProviderModeout of Config annotation fields.
dataProviderModeis being added as a regularPropertyand defaulted inbuild(), which will surface it in the Config annotation model/UI. Since this field is internal-only, move it out of the annotationfieldslist (e.g., store in codedata or a dedicated metadata structure) or ensure it’s hidden/non-editable and excluded from generated annotations.
Based on learnings: In the test manager service for Ballerina,dataProviderModeis an internal metadata field used to determine whether to load an evalset using a data provider. It is NOT a field of thetest:Configannotation and should not appear in the generated test configuration annotations.Also applies to: 133-139
🤖 Fix all issues with AI agents
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java`:
- Around line 270-278: The method addAiConversationThreadParameter currently
always appends a new ai:ConversationThread param and can create duplicate
parameters; update it to first inspect TestFunction.parameters() (guarding
against null/empty) and iterate existing FunctionParameter entries to see if any
have type equal to Constants.AI_CONVERSATION_THREAD_TYPE or variable/name equal
to "thread", and return early if found; only build and add the new
FunctionParameter via FunctionParameter.FunctionParameterBuilder when no
existing parameter matches.
🧹 Nitpick comments (1)
flow-model-generator/modules/flow-model-generator-core/src/main/java/io/ballerina/flowmodelgenerator/core/AvailableNodesGenerator.java (1)
467-475: Consider simplifying by removing redundant try-catch.The try-catch wrapping
isAgentClassis redundant sincegetCategoryalready catchesRuntimeExceptionat line 462. Other similar methods likegetKnowledgeBase,getVectorStore, andgetChunkers(lines 489-502) use method references directly without try-catch, providing a consistent pattern.♻️ Suggested simplification to match existing pattern
private Optional<Category> getAgent(Symbol symbol) { - return getCategory(symbol, classSymbol -> { - try { - return isAgentClass(classSymbol); - } catch (Exception e) { - return false; - } - }); + return getCategory(symbol, CommonUtils::isAgentClass); }
...ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java`:
- Around line 276-302: The updateDataProviderField method currently only
replaces an existing dataProvider Property; modify it to also add a new
dataProvider Property when none exists: inside updateDataProviderField
(operating on TestFunction.function, Annotation named "Config" and Property),
after iterating fields if no "dataProvider" originalName was found, create a new
Property via Property.PropertyBuilder copying relevant
metadata/codedata/valueType/valueTypeConstraint/placeholder/optional/editable/advanced,
set value(functionName) and originalName("dataProvider"), and append it to
annotation.fields(); ensure this logic runs only for the "Config" annotation and
preserves existing field order semantics.
...ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/TestManagerService.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java (1)
110-116:⚠️ Potential issue | 🟡 MinorHandle empty Optional to avoid
NoSuchElementException.Line 112 calls
.get()on the Optional returned bysemanticModel.symbol(annotationNode)without checking if it's present. If the symbol is not found, this will throwNoSuchElementException.🛡️ Suggested fix
public static Annotation getAnnotationModel(AnnotationNode annotationNode, SemanticModel semanticModel, ModulePartNode modulePartNode) { - AnnotationSymbol annotationSymbol = (AnnotationSymbol) semanticModel.symbol(annotationNode).get(); - String annotName = annotationSymbol.getName().orElse(""); - if (annotName.isEmpty()) { + Optional<AnnotationSymbol> symbolOpt = semanticModel.symbol(annotationNode) + .filter(s -> s instanceof AnnotationSymbol) + .map(s -> (AnnotationSymbol) s); + if (symbolOpt.isEmpty()) { + return null; + } + String annotName = symbolOpt.get().getName().orElse(""); + if (annotName.isEmpty()) { return null; }
🧹 Nitpick comments (2)
test-manager-service/modules/test-manager-service-ls-extension/src/main/java/io/ballerina/testmanagerservice/extension/Utils.java (2)
607-619: Consider adding null check forrangeparameter.Since
getAnnotationsRangecan returnnull(when annotations list is empty), callers passing its result directly to this method would cause aNullPointerException. Consider adding a defensive null check or documenting the precondition.♻️ Suggested improvement
public static String extractTextFromRange(io.ballerina.tools.text.TextDocument textDocument, LineRange range) { + if (range == null) { + return ""; + } int start = textDocument.textPositionFrom(range.startLine()); int end = textDocument.textPositionFrom(range.endLine()); String text = textDocument.toString().substring(start, end); return text.replaceAll("^\"|\"$", "").trim(); }
628-651: Consider avoiding string concatenation insideStringBuilder.append().Lines 642 and 647 use string concatenation (
"map<[" + ... + "]>|error"and"return ai:loadConversationThreads(\"" + ... + "\");") insideappend()calls, which creates intermediate String objects. For consistency with the rest of the method, consider using multipleappend()calls.♻️ Suggested improvement
.append(Constants.KEYWORD_RETURNS) .append(Constants.SPACE) - .append("map<[" + Constants.AI_CONVERSATION_THREAD_TYPE + "]>|error") + .append("map<[") + .append(Constants.AI_CONVERSATION_THREAD_TYPE) + .append("]>|error") .append(Constants.SPACE) .append(Constants.OPEN_CURLY_BRACE) .append(Constants.LINE_SEPARATOR) .append(Constants.TAB_SEPARATOR) - .append("return ai:loadConversationThreads(\"" + evalSetFilePath + "\");") + .append("return ai:loadConversationThreads(\"") + .append(evalSetFilePath) + .append("\");") .append(Constants.LINE_SEPARATOR)
MohamedSabthar
left a comment
There was a problem hiding this comment.
@dan-niles is there a place we conditionally enable eval features based on ai module version and lang version here? Or is it done in the extension?
...ator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java
Outdated
Show resolved
Hide resolved
...ator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java
Outdated
Show resolved
Hide resolved
...ator-core/src/main/java/io/ballerina/flowmodelgenerator/core/model/node/AgentRunBuilder.java
Outdated
Show resolved
Hide resolved
@MohamedSabthar I have implemented it on the extension side. |
There was a problem hiding this comment.
Pull request overview
This pull request enhances the Ballerina Language Server to support agent-based evaluations in the Ballerina Integrator (BI). The primary goal is to enable BI-side creation and reuse of AI agents for evaluations, simplify agent usage at the module level, and provide richer test configuration with eval-set-driven workflows.
Changes:
- Extended
@test:Configannotation to support additional parameters (dataProvider,dataProviderMode,evalSetFile,dependsOn,after,before,runs,minPassRate) for comprehensive test configuration - Added support for automatically generating data provider functions when users provide an eval-set file, enabling agent-based evaluation workflows
- Introduced new node kinds (
AGENTS,AGENT_RUN) andAgentRunBuilderto construct agent run calls with inferred-type handling and unique variable naming - Moved AI agent usage from service-level to module-level scope by removing initialization scaffolding in
AiChatServiceBuilder, enabling agent evaluations to be written at module level - Added
getAvailableAgentsAPI and JSON-RPC endpoint to support agent instance reuse from the BI side panel
Reviewed changes
Copilot reviewed 61 out of 61 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test-manager-service/.../Annotation.java | Extended ConfigAnnotationBuilder with new fields for test configuration (dataProvider, evalSetFile, runs, minPassRate, etc.) |
| test-manager-service/.../Utils.java | Added helper methods for evalSet handling, including extraction and updating of evalSet file paths, AI import checks, and evalSet data provider template generation |
| test-manager-service/.../TestManagerService.java | Enhanced addTestFunction and updateTestFunction to support evalSet mode, including automatic generation of data provider functions and parameter injection |
| test-manager-service/.../Constants.java | Added constants for AI module, evalSet mode, and data provider handling |
| service-model-generator/.../AiChatServiceBuilder.java | Simplified agent handling by removing service-level initialization and version checks, enabling module-level agent usage |
| flow-model-generator/.../AgentRunBuilder.java | New builder class for agent run calls with inferred-type handling and unique variable naming |
| flow-model-generator/.../AgentCallBuilder.java | Updated to use cached templates and improved type descriptor labeling |
| flow-model-generator/.../NodeKind.java | Added AGENTS and AGENT_RUN node kinds |
| flow-model-generator/.../AvailableNodesGenerator.java | Updated to support agent availability queries and simplified agent node metadata |
| flow-model-generator/.../AiUtils.java | Added utility methods for property manipulation with updated labels |
| Test resource files | Added comprehensive test resources for minPassRate and evalSet features |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Purpose
Related to wso2/product-ballerina-integrator#2333
Changes Include:
@test:Config-dataProvider,after,before,dependsOn,runs,minPassRateThis pull request enhances the Ballerina Language Server to add BI-side support for agent-based evaluations and to expand test configuration and eval-set workflows. Main outcomes:
Agent instance management and reuse
Service and module integration
Test configuration and eval-set support
Public API and model updates
Overall, the changes enable BI to create and reuse AI agents for evaluations, simplify agent usage at module level, and provide richer, eval-set-driven test configuration and generation.