Implement custom LS API for CodeMap generation in BI Copilot #649
yasithrashan wants to merge 33 commits into ballerina-platform:main
9aead0c to caa5ccd
Pull request overview
This PR implements a custom Language Server API to generate a CodeMap for Ballerina Copilot, reducing token usage by sending structured code artifacts instead of full source code to the LLM.
Changes:
- Introduced CodeMap generation API with request/response models and service endpoint
- Implemented AST transformation to extract structured artifacts (services, functions, types, listeners, connections, etc.)
- Added comprehensive test coverage with multiple Ballerina project scenarios
Reviewed changes
Copilot reviewed 57 out of 57 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| CodeMapArtifact.java | Data model for code artifacts with builder pattern |
| CodeMapFile.java | Record representing a file with its artifacts |
| CodeMapGenerator.java | Main generator logic iterating through project modules |
| CodeMapNodeTransformer.java | AST visitor extracting artifacts from syntax nodes |
| CodeMapRequest.java | Request model with project path |
| CodeMapResponse.java | Response model with extracted files |
| DesignModelGeneratorService.java | Added codeMap endpoint to existing service |
| module-info.java | Exported codemap package |
| CodeMapGeneratorTest.java | Test class with data provider pattern |
| testng.xml | Added test class to test suite |
| codemap/source/* | Test Ballerina projects for various scenarios |
| codemap/config/* | Expected JSON output for test validation |
...model-generator-ls-extension/src/test/resources/codemap/source/http_service/http_service.bal
...rchitecture-model-generator-ls-extension/src/test/resources/codemap/config/http_service.json
| import java.util.List; | ||
| import java.util.Map; | ||
|
|
||
| public record CodeMapArtifact(String name, String type, LineRange lineRange, List<String> modifiers, |
Children can be an optional field; give it a default value.
| import java.util.List; | ||
| import java.util.Map; | ||
|
|
||
| public record CodeMapArtifact(String name, String type, LineRange lineRange, List<String> modifiers, |
Please give some samples for modifiers, properties, and children. What are they?
| import java.util.List; | ||
| import java.util.Map; | ||
|
|
||
| public record CodeMapArtifact(String name, String type, LineRange lineRange, List<String> modifiers, |
Where is the filename captured here?
The filename is not captured directly. In CodeMapGenerator.java, within the generateCodeMap function, we get the file name using:
String fileName = document.name();
Then, in the same function, after extracting the artifacts from a single file, we wrap everything into a CodeMapFile:
CodeMapFile codeMapFile = new CodeMapFile(fileName, relativeFilePath, artifacts);
254bb0e to 2b2b5bc
| import java.util.Collections; | ||
| import java.util.List; | ||
|
|
||
| public record CodeMapFile(String fileName, String relativeFilePath, List<CodeMapArtifact> artifacts) { |
Missing Javadoc.
Add Javadocs for all the public classes.
@yasithrashan This is mandatory.
All public constructs should have Javadocs explaining their main functionality, inputs, and outputs.
Also, please mark as public only the constructs that need to be.
| @@ -0,0 +1,68 @@ | |||
| /* | |||
| * Copyright (c) 2025, WSO2 LLC. (http://www.wso2.com) | |||
| * Copyright (c) 2025, WSO2 LLC. (http://www.wso2.com) | |
| * Copyright (c) 2026, WSO2 LLC. (http://www.wso2.com) |
Apply this to other applicable areas.
| * Tracks changed files per project for incremental code map generation. | ||
| * This singleton maintains a thread-safe record of file changes between API calls. | ||
| * | ||
| * @since 1.0.0 |
| * @since 1.0.0 | |
| * @since 1.6.0 |
Apply this to other applicable areas.
| if (instance == null) { | ||
| instance = new ChangedFilesTracker(); | ||
| } | ||
| return instance; |
Making the method synchronized may be overkill, since after the first call all subsequent calls are read-only. Use the holder idiom (initialization-on-demand holder) for handling the singleton.
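For reference, the holder approach suggested here relies on the JVM's lazy, once-only class initialization instead of explicit locking. A minimal sketch, assuming a trimmed-down stand-in for the tracker (the class name HolderTracker and the trackFile and trackedCount helpers are illustrative, not taken from the PR):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: a simplified stand-in for the PR's ChangedFilesTracker.
final class HolderTracker {
    private final Map<String, Set<String>> changedFilesMap = new ConcurrentHashMap<>();

    private HolderTracker() {
    }

    // The JVM initializes Holder lazily and exactly once, on first access,
    // so getInstance() needs neither synchronized nor a volatile field.
    private static final class Holder {
        private static final HolderTracker INSTANCE = new HolderTracker();
    }

    static HolderTracker getInstance() {
        return Holder.INSTANCE;
    }

    // Hypothetical helper: records a changed file for a project.
    void trackFile(String projectKey, String file) {
        changedFilesMap.compute(projectKey, (key, files) -> {
            Set<String> target = files;
            if (target == null) {
                target = ConcurrentHashMap.newKeySet();
            }
            target.add(file);
            return target;
        });
    }

    // Hypothetical helper: number of files currently tracked for a project.
    int trackedCount(String projectKey) {
        Set<String> files = changedFilesMap.get(projectKey);
        return files == null ? 0 : files.size();
    }
}
```

Every call to getInstance() returns the same object, and the read path involves no lock at all.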
|
|
||
| // Get and clear the list of changed files for a project. | ||
| public List<String> getAndClearChangedFiles(String projectKey) { | ||
| Set<String> files = changedFilesMap.remove(projectKey); |
We should implement a rollback mechanism for this instruction. If the LS fails to respond, the state becomes desynchronized for the extension because files are being removed prematurely. Given the low likelihood of this error, adding a note is sufficient for now.
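One possible shape for such a rollback is to hand the removed files back to the tracker when the response cannot be delivered. A minimal sketch, assuming a simplified tracker; restoreChangedFiles, trackFile, and the class name RollbackTracker are hypothetical, and only getAndClearChangedFiles and changedFilesMap mirror names visible in the diff:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: not the PR's implementation.
final class RollbackTracker {
    private final Map<String, Set<String>> changedFilesMap = new ConcurrentHashMap<>();

    // Hypothetical helper: records a changed file atomically per project key.
    void trackFile(String projectKey, String file) {
        addAll(projectKey, List.of(file));
    }

    // Same behavior as the method under review: remove and return the files.
    List<String> getAndClearChangedFiles(String projectKey) {
        Set<String> files = changedFilesMap.remove(projectKey);
        if (files == null || files.isEmpty()) {
            return Collections.emptyList();
        }
        return new ArrayList<>(files);
    }

    // Hypothetical rollback: merge previously removed files back, keeping
    // any files that were tracked in the meantime.
    void restoreChangedFiles(String projectKey, List<String> files) {
        if (!files.isEmpty()) {
            addAll(projectKey, files);
        }
    }

    private void addAll(String projectKey, List<String> files) {
        // compute() runs atomically for the key, so a concurrent remove()
        // cannot discard a set between its lookup and the add.
        changedFilesMap.compute(projectKey, (key, existing) -> {
            Set<String> target = existing;
            if (target == null) {
                target = ConcurrentHashMap.newKeySet();
            }
            target.addAll(files);
            return target;
        });
    }
}
```

A caller would wrap the response delivery in a try/catch and call restoreChangedFiles with the list it took if delivery fails, so the next request still sees those files.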
| extractDocumentation(functionDefinitionNode.metadata()).ifPresent(functionBuilder::documentation); | ||
| extractComments(functionDefinitionNode).ifPresent(functionBuilder::comment); | ||
|
|
||
| functionBuilder.type("FUNCTION"); |
Extract the strings used to define the JSON fields into class-level constants.
Do we need these changes for this PR?
|
|
||
| if (!files.equals(testConfig.output())) { | ||
| TestConfig updatedConfig = new TestConfig(testConfig.description(), testConfig.source(), files); | ||
| updateConfig(configJsonPath, updatedConfig); |
| updateConfig(configJsonPath, updatedConfig); | |
| // updateConfig(configJsonPath, updatedConfig); |
| * | ||
| * @since 1.0.0 | ||
| */ | ||
| public class PublishCodeMapSubscriberTest extends AbstractLSTest { |
The following scenarios need to be covered if they are not already included:
- Multiple File Accumulation: Trigger onEvent for two different files within the same project sequentially.
- Consecutive Events for Same File: Trigger onEvent multiple times for the same file.
- Same File in Different Modules: Trigger events for both types.bal and modules/mod1/types.bal.
- State Clearing: Verify that retrieving the changes "consumes" them, ensuring subsequent calls don't return stale data.
- Project Isolation (Multiple Projects): Verify that changes in "Project A" do not leak into "Project B".
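The scenarios listed above can be sketched as plain checks against a simplified in-memory tracker. This is only an illustration: ScenarioTracker, its onEvent and getAndClear methods, and the project keys are assumptions standing in for the real subscriber and tracker, and the actual tests would go through the LS test harness instead.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, single-threaded stand-in for the tracker under test (assumption).
final class ScenarioTracker {
    private final Map<String, Set<String>> changed = new HashMap<>();

    void onEvent(String projectKey, String relativePath) {
        changed.computeIfAbsent(projectKey, key -> new LinkedHashSet<>()).add(relativePath);
    }

    List<String> getAndClear(String projectKey) {
        Set<String> files = changed.remove(projectKey);
        return files == null ? List.of() : List.copyOf(files);
    }
}

final class TrackerScenarios {
    static void check(boolean condition, String scenario) {
        if (!condition) {
            throw new AssertionError("Scenario failed: " + scenario);
        }
    }

    static void runAll() {
        ScenarioTracker tracker = new ScenarioTracker();

        // Multiple file accumulation within one project.
        tracker.onEvent("projectA", "main.bal");
        tracker.onEvent("projectA", "types.bal");
        // Consecutive events for the same file must not create duplicates.
        tracker.onEvent("projectA", "types.bal");
        // Same file name in a different module is a distinct entry.
        tracker.onEvent("projectA", "modules/mod1/types.bal");
        // A second project, to verify isolation.
        tracker.onEvent("projectB", "main.bal");

        List<String> projectA = tracker.getAndClear("projectA");
        check(projectA.equals(List.of("main.bal", "types.bal", "modules/mod1/types.bal")),
                "accumulation, same-file events, same file in different modules");
        // State clearing: a second retrieval returns nothing.
        check(tracker.getAndClear("projectA").isEmpty(), "state clearing");
        // Project isolation: projectB is unaffected by projectA's retrieval.
        check(tracker.getAndClear("projectB").equals(List.of("main.bal")), "project isolation");
    }
}
```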
| * | ||
| * @since 1.0.0 | ||
| */ | ||
| public class CodeMapGeneratorTest extends AbstractLSTest { |
Add a single test case with a Ballerina project as the source. Keep the code minimal, since the goal is only to verify that the API works for Ballerina projects, not to validate different construct types.
|
|
||
| public static synchronized ChangedFilesTracker getInstance() { | ||
| if (instance == null) { | ||
| instance = new ChangedFilesTracker(); |
public static ChangedFilesTracker getInstance() {
    // Note: the instance field must be declared volatile for this pattern to be safe
    // First check (no locking)
    if (instance == null) {
        // Only synchronize if it looks like we need to create it
        synchronized (ChangedFilesTracker.class) {
            // Second check (inside the lock)
            if (instance == null) {
                instance = new ChangedFilesTracker();
            }
        }
    }
    return instance;
}
| public List<String> getAndClearChangedFiles(String projectKey) { | ||
| Set<String> files = changedFilesMap.remove(projectKey); | ||
| if (files == null || files.isEmpty()) { | ||
| return Collections.emptyList(); |
|
|
||
| public Builder modifiers(List<String> modifiers) { | ||
| if (!modifiers.isEmpty()) { | ||
| this.properties.put("modifiers", new ArrayList<>(modifiers)); |
These kinds of literals should be defined as constants.
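A minimal sketch of what that could look like, assuming the builder keeps its properties in a Map<String, Object>; the class name ArtifactBuilderSketch, the constant names, and the properties() accessor are illustrative and not taken from the PR:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a cut-down builder with property keys hoisted to constants.
final class ArtifactBuilderSketch {
    // Property keys defined once at class level instead of inline literals.
    static final String MODIFIERS = "modifiers";
    static final String CONFIG = "config";

    private final Map<String, Object> properties = new HashMap<>();

    ArtifactBuilderSketch modifiers(List<String> modifiers) {
        if (!modifiers.isEmpty()) {
            properties.put(MODIFIERS, new ArrayList<>(modifiers));
        }
        return this;
    }

    ArtifactBuilderSketch config(String config) {
        if (config != null) {
            properties.put(CONFIG, config);
        }
        return this;
    }

    // Returns an immutable snapshot of the collected properties.
    Map<String, Object> properties() {
        return Map.copyOf(properties);
    }
}
```

With the keys in one place, the tests and any JSON serialization code can refer to the same constants rather than repeating the strings.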
| } | ||
|
|
||
| public Builder config(String config) { | ||
| return addProperty("config", config); |
Is the config a simple element?
| ModuleInfo moduleInfo = createModuleInfo(module.descriptor()); | ||
|
|
||
| // Iterate through each document in the module | ||
| for (var documentId : module.documentIds()) { |
This may not contain resources.
What I mean by resources here is this: a Ballerina project can have a folder called resources.
It may contain images and other necessary resources.
Basically, these are not .bal files, but images and other resources that the code needs.
|
|
||
| for (var moduleId : currentPackage.moduleIds()) { | ||
| Module module = currentPackage.module(moduleId); | ||
| SemanticModel semanticModel = |
Handle scenarios where semanticModel is null.
| PackageUtil.getCompilation(currentPackage).getSemanticModel(moduleId); | ||
| ModuleInfo moduleInfo = createModuleInfo(module.descriptor()); | ||
|
|
||
| for (var documentId : module.documentIds()) { |
Add a variable isTest = true/false and, depending on that, include test files here as well. It will be helpful when Copilot is generating tests.
| private final ModuleInfo moduleInfo; | ||
| private final boolean extractComments; | ||
|
|
||
| private static final String AUTOMATION_FUNCTION_NAME = "automation"; |
Please check, there is no syntax called automation
| // Build full import name | ||
| String fullImportName = orgName.isEmpty() ? moduleName : orgName + "/" + moduleName; | ||
| if (alias.isPresent()) { | ||
| fullImportName += " as " + alias.get(); |
| } else if (firstExpression != null) { | ||
| return firstExpression.toSourceCode().strip(); | ||
| } else { | ||
| return ""; |
An empty string has no meaning as a service name.
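One way to avoid the empty-string sentinel is to return an Optional and let the caller decide what to do when no name can be derived. A minimal sketch, assuming plain strings in place of the syntax nodes; ServiceNameSketch and both parameter names are hypothetical, and the real branch conditions are not visible in this diff:

```java
import java.util.Optional;

// Sketch only: strings stand in for the syntax-tree nodes used in the PR.
final class ServiceNameSketch {
    // primaryName and firstExpressionSource are hypothetical inputs; either may be null.
    static Optional<String> serviceName(String primaryName, String firstExpressionSource) {
        if (primaryName != null && !primaryName.isBlank()) {
            return Optional.of(primaryName.strip());
        }
        if (firstExpressionSource != null && !firstExpressionSource.isBlank()) {
            return Optional.of(firstExpressionSource.strip());
        }
        // No usable name: signal absence instead of returning "".
        return Optional.empty();
    }
}
```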
| return Optional.of(classSymbol); | ||
| } | ||
| } catch (Throwable e) { | ||
| // Ignore |
| .filter(doc -> !doc.isEmpty()); | ||
| } | ||
|
|
||
| private Optional<String> extractComments(Node node) { |
| private Optional<String> extractComments(Node node) { | |
| private Optional<String> extractInlineComments(Node node) { |
| .getAndClearChangedFiles(projectKey); | ||
|
|
||
| if (changedFiles.isEmpty()) { | ||
| // No changes tracked, return empty response |
| import ballerina/tcp; | ||
|
|
||
| // HTTP connection | ||
| final http:Client httpConnection = check new ("https://api.example.com", { |
Add a test for URLs passed as variables as well.
1f255a3 to 0d9f588
0d9f588 to 0304b48
c486070 to e760d4c
Purpose
Related to: wso2/product-ballerina-integrator#2192
Ballerina Copilot currently sends the full project code to the LLM, which causes high token usage and slow responses.
To fix this, we generate a CodeMap of the project using a custom Language Server (LS) API and send this CodeMap to the LLM so it can understand and navigate the codebase without needing the full source.
Goals
Approach