Feat/agentic modifications#1234
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThis PR introduces a major architectural refactoring centered on JAF (JSON Agentic Framework) integration for agent-based chat. Key changes include: refactoring the frontend Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant ChatBox as ChatBox
participant Server as Server
participant JAFProvider as JAF Provider
participant Tools as Tools & MCP
participant Database as Database
participant Vespa as Vespa Search
Client->>ChatBox: Send message (HandleSendOptions)
ChatBox->>Server: POST /chat handleSend(options)
Server->>Server: Create AgentRunContext
Server->>Database: Get agent, user, KB selections
Server->>JAFProvider: Execute agent with context
activate JAFProvider
JAFProvider->>JAFProvider: Load images, resolve stops
JAFProvider->>JAFProvider: Build LLM prompt with tools
JAFProvider->>Tools: Call LLM to plan/execute
loop Tool Execution
Tools->>Tools: Select next tool (global search, KB, Gmail, etc.)
alt Tool is KB Search
Tools->>Vespa: Search with KB selections
Vespa-->>Tools: Return fragments
else Tool is MCP-based
Tools->>Tools: Build MCP context + synthetic fragments
Tools-->>Tools: Yield results
end
Tools->>JAFProvider: Stream tool_requests / tool_call events
JAFProvider->>Database: Insert tool execution record
JAFProvider->>Server: Stream reasoning step
Server->>Client: SSE: reasoning update
end
JAFProvider->>JAFProvider: Review tool outputs if needed
JAFProvider->>Vespa: Search for citations if required
JAFProvider->>Server: Stream citations + fragments
Server->>Client: SSE: citations, images
JAFProvider->>JAFProvider: Synthesize final answer
JAFProvider->>Server: Stream assistant_message
Server->>Client: SSE: final answer
JAFProvider->>Database: Insert assistant message + chat trace
JAFProvider-->>Server: Emit response metadata
deactivate JAFProvider
Server->>Client: Emit final event + close stream
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes
Areas requiring extra attention:
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @Aayushjshah, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a major architectural shift in the agentic chat functionality by migrating to a JAF-based framework. This refactoring aims to enhance the modularity, extensibility, and maintainability of the agent system, particularly concerning tool orchestration and context management. It also includes a significant update to how image context is handled within the agent's reasoning process and a cleanup of old agent prompt generation logic. Additionally, a frontend refactor streamlines the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a significant and well-structured refactoring of the agentic architecture. Key improvements include the introduction of a comprehensive schema system for agent state and tools, which greatly enhances type safety and maintainability. The frontend is cleaned up by refactoring handleSend functions to use a single options object. A major backend change is the replacement of MessageWithToolsApi with a new, more modular MessageAgents flow, supported by new modules for handling knowledge base selections, resource access checks, and agent prompt creation. The addition of abortable agent execution via AbortController is a crucial improvement for user experience. The pre-commit hook has been commented out, which should be addressed. There are also opportunities to improve error message specificity in the Slack tool and ensure deterministic ID generation for synthetic fragments in the JAF adapter. Overall, this is a very positive change that modernizes the agentic framework.
| # #!/bin/sh | ||
| # set -u | ||
|
|
||
| # Function to check formatting and then format if necessary | ||
| check_and_format() { | ||
| # Temporarily disable exit on error | ||
| set +e | ||
| cd server | ||
| bunx @biomejs/biome check ../ --formatter-enabled=true --linter-enabled=false --vcs-use-ignore-file=true --organize-imports-enabled=false | ||
| CHECK_STATUS=$? | ||
| # set -e | ||
| if [ $CHECK_STATUS -ne 0 ]; then | ||
| echo "Formatting issues detected. Running formatter..." | ||
| # # Function to check formatting and then format if necessary | ||
| # check_and_format() { | ||
| # # Temporarily disable exit on error | ||
| # set +e | ||
| # cd server | ||
| # bunx @biomejs/biome check ../ --formatter-enabled=true --linter-enabled=false --vcs-use-ignore-file=true --organize-imports-enabled=false | ||
| # CHECK_STATUS=$? | ||
| # # set -e | ||
| # if [ $CHECK_STATUS -ne 0 ]; then | ||
| # echo "Formatting issues detected. Running formatter..." | ||
|
|
||
| # Format all applicable files, not just staged ones | ||
| bun run format | ||
| # # Format all applicable files, not just staged ones | ||
| # bun run format | ||
|
|
||
| echo "Files have been formatted. Please add them to staging and commit again." | ||
| exit 1 | ||
| fi | ||
| } | ||
| # echo "Files have been formatted. Please add them to staging and commit again." | ||
| # exit 1 | ||
| # fi | ||
| # } | ||
|
|
||
| # Run the check and format function | ||
| check_and_format | ||
| # # Run the check and format function | ||
| # check_and_format |
There was a problem hiding this comment.
The pre-commit hook script has been completely commented out. This disables the formatting checks that run before a commit, which can lead to inconsistent code formatting in the repository. Was this intentional? If this is a temporary change for development, please remember to re-enable it before merging. If the hook is no longer needed, this file should be removed to avoid confusion.
| name: info.metadata?.name || connectorId, | ||
| } as unknown as Citation["entity"], | ||
| } | ||
| const syntheticId = `${connectorId}:${toolName}:${Date.now()}` |
There was a problem hiding this comment.
The use of Date.now() to generate a syntheticId for fragments makes the ID non-deterministic. If the same tool is called with the same arguments multiple times, it will generate a different fragment ID each time. This could cause issues with caching, duplicate detection, or other logic that relies on stable identifiers. Consider creating a deterministic ID, for example, by hashing the connectorId, toolName, and the formattedContent.
| try { | ||
| normalizedTimestampRange = normalizeTimestampRange(scopedTimeRange) | ||
| } catch { | ||
| return ToolResponse.error( | ||
| ToolErrorCodes.MISSING_REQUIRED_FIELD, | ||
| "Please provide at least one filter (e.g., channelName, user, date range, or query) to scope the Slack message search.", | ||
| ToolErrorCodes.INVALID_INPUT, | ||
| "Invalid timeRange supplied. Provide ISO-8601 values for startTime and endTime.", | ||
| { toolName: "getSlackRelatedMessages" }, | ||
| ) | ||
| } |
There was a problem hiding this comment.
The catch block for normalizeTimestampRange currently swallows the specific error message (e.g., 'Invalid startTime' or 'Invalid endTime') and returns a generic error. This makes debugging more difficult. It would be more helpful to capture the specific error message from the thrown Error and include it in the ToolResponse.error to provide more context on what failed.
| try { | |
| normalizedTimestampRange = normalizeTimestampRange(scopedTimeRange) | |
| } catch { | |
| return ToolResponse.error( | |
| ToolErrorCodes.MISSING_REQUIRED_FIELD, | |
| "Please provide at least one filter (e.g., channelName, user, date range, or query) to scope the Slack message search.", | |
| ToolErrorCodes.INVALID_INPUT, | |
| "Invalid timeRange supplied. Provide ISO-8601 values for startTime and endTime.", | |
| { toolName: "getSlackRelatedMessages" }, | |
| ) | |
| } | |
| try { | |
| normalizedTimestampRange = normalizeTimestampRange(scopedTimeRange) | |
| } catch (e) { | |
| const errorMessage = e instanceof Error ? e.message : "Invalid timeRange value" | |
| return ToolResponse.error( | |
| ToolErrorCodes.INVALID_INPUT, | |
| `Invalid timeRange supplied: ${errorMessage}. Provide ISO-8601 values.`, | |
| { toolName: "getSlackRelatedMessages" }, | |
| ) | |
| } |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
frontend/src/components/ChatBox.tsx (1)
896-913: FixerrorMessagetemporal-dead-zone bug in upload error handlingIn
uploadFiles, thecatchblock referenceserrorMessageinside theAbortErrorbranch before it is declared in the subsequentelsebranch. At runtime this will throw (Cannot access 'errorMessage' before initialization) whenever an upload is aborted.Refactor to compute
errorMessageonce at the top of thecatchblock:- } catch (error) { - if (error instanceof Error && error.name === "AbortError") { + } catch (error) { + const errorMessage = + error instanceof Error ? error.message : "Upload failed" + + if (error instanceof Error && error.name === "AbortError") { setSelectedFiles((prev) => prev.map((f) => f.id === selectedFile.id ? { ...f, uploading: false, uploadError: errorMessage, } : f, ), ) return null } - const errorMessage = - error instanceof Error ? error.message : "Upload failed" setSelectedFiles((prev) =>This keeps behavior the same while removing the TDZ/runtime error.
🧹 Nitpick comments (19)
server/api/chat/tools/slack/getSlackMessages.ts (1)
108-133: Fallback logic and normalization look correct.The fallback to a 72-hour window when neither scope filters nor query are provided is a good defensive measure. The normalization with
Date.parsecorrectly validates ISO-8601 strings.One minor observation: the
catchblock at line 127 swallows the error without logging it. Consider adding a debug log for troubleshooting invalid inputs.try { normalizedTimestampRange = normalizeTimestampRange(scopedTimeRange) - } catch { + } catch (err) { + Logger.debug({ err, scopedTimeRange }, "[getSlackRelatedMessages] Invalid timeRange") return ToolResponse.error(server/api/chat/jaf-adapter.ts (1)
148-148: Consider using a more robust unique ID.
Date.now()for the synthetic fragment ID could produce duplicates if multiple MCP tools execute within the same millisecond. Consider appending a random suffix or using a UUID/CUID.- const syntheticId = `${connectorId}:${toolName}:${Date.now()}` + const syntheticId = `${connectorId}:${toolName}:${Date.now()}-${Math.random().toString(36).slice(2, 8)}`server/api/chat/agents.ts (1)
145-191: Consider stronger typing forcreateMockAgentFromFormData.The function uses
anytypes for all parameters (agentPromptPayload,user,workspace). This reduces type safety and could lead to runtime errors if the shape of these objects changes.Consider defining explicit types or at least using
unknownwith runtime validation.-const createMockAgentFromFormData = ( - agentPromptPayload: any, - user: any, - workspace: any, +const createMockAgentFromFormData = ( + agentPromptPayload: Record<string, unknown>, + user: { id: number; email: string; timeZone?: string }, + workspace: { id: number; externalId: string }, email: string, ): { agentForDb: SelectAgent; agentPromptForLLM: string } => {server/api/chat/tool-schemas.ts (4)
59-62: TimestampRangeSchema should validate that endTime >= startTime.Currently, the schema doesn't validate temporal ordering. Invalid ranges like
{ startTime: "2025-12-31", endTime: "2024-01-01" }would pass validation.Consider adding a
.refine()check:export const TimestampRangeSchema = z.object({ startTime: z.string().optional().describe("ISO 8601 date string"), endTime: z.string().optional().describe("ISO 8601 date string"), }).optional().refine( (range) => { if (!range?.startTime || !range?.endTime) return true return new Date(range.startTime) <= new Date(range.endTime) }, { message: "startTime must be before or equal to endTime" } )
168-175: Inconsistency:queryis optional in SearchGlobalInputSchema but required in other search schemas.
SearchGlobalInputSchemahasquery: z.string().optional()whileSearchGmailInputSchema(line 179),SearchDriveInputSchema(line 193), etc., havequery: z.string()(required). This inconsistency may confuse consumers.Consider making this consistent or documenting the rationale for the difference.
662-677:validateToolInputthrows on unknown tools instead of returning an error.This is inconsistent with the function's return type which suggests it handles errors gracefully via
{ success: false; error: ... }. An unknown tool name results in an exception that the caller must catch separately.Consider returning an error result instead:
export function validateToolInput<T>( toolName: string, input: unknown ): { success: true; data: T } | { success: false; error: z.ZodError } { const schema = getToolSchema(toolName) if (!schema) { - throw new Error(`Tool schema not found: ${toolName}`) + const error = new z.ZodError([{ + code: "custom", + path: ["toolName"], + message: `Tool schema not found: ${toolName}`, + }]) + return { success: false, error } }
734-738: Unused loop variable warning.The
_prefix convention indicates unused variables, but the linter may still flag this. Consider using a more explicit destructuring pattern.export function getToolsByCategory(category: ToolCategory): string[] { return Object.entries(TOOL_SCHEMAS) - .filter(([_, schema]) => schema.category === category) - .map(([name, _]) => name) + .filter(([, schema]) => schema.category === category) + .map(([name]) => name) }server/api/chat/agent-schemas.ts (2)
9-9: Unused import:JAFMessage.The
JAFMessagetype is imported but not used anywhere in this file.-import type { Message as JAFMessage } from "@xynehq/jaf"
172-179: Document runtime-only nature of Set and Map fields in AgentRunContext to prevent future serialization bugs.
seenDocuments: Set<string>,turnFragments: Map<number, ...>,imagesByTurn: Map<number, ...>,enabledTools: Set<string>, andfailedTools: Map<string, ...>won't serialize withJSON.stringify(). While these fields are currently used only at runtime and not serialized, explicitly documenting them as runtime-only would prevent accidental serialization attempts if the code evolves to cache, log, or transmit this context in the future. Consider adding a comment noting these fields should never be serialized, or refactor to use arrays/plain objects if serialization becomes needed.server/config.ts (1)
90-90: MakedelegationAgenticenvironment-driven and properly typed as boolean for consistency
IMAGE_CONTEXT_CONFIGas an exported constant and in the default config is clear and reusable.For
delegationAgentic, setting it to the string literal"true"diverges from the pattern used for similar flags likeuseLegacyServiceAccountSyncanduseLegacySlackSync. While the string evaluates as truthy in theif (config.delegation_agentic)check at line 5153 ofmessage-agents.ts, the implementation is inconsistent and obscures intent. Align with existing patterns:-let delegationAgentic = "true" +const delegationAgentic = + process.env.DELEGATION_AGENTIC === "true" + ? true + : process.env.DELEGATION_AGENTIC === "false" + ? false + : true // default to enabledThis keeps behavior configurable, makes the boolean type explicit, and matches the style of adjacent config flags.
server/tests/messageAgentsFragments.test.ts (1)
89-267: Solid coverage for agent context helpers; consider tightening the dedupe assertionThis suite nicely exercises context mutation, review‑prompt composition, delegated fragments, image selection, and MCP tool synthesis. For
getRecentImagesFromContext, you might add a duplicatefileNamein eithercurrentTurnArtifacts.imagesorrecentImagesand assert it appears only once inselectedto lock in the deduplication behavior.frontend/src/components/ChatBox.tsx (1)
188-197: Options-objecthandleSendrefactor looks good; update the prop comment for clarityThe new
HandleSendOptionstype and the switch tohandleSend(options)inhandleSendMessageare wired correctly and help keep the parameter surface maintainable. One small nit: the inline comment onhandleSendstill describes the old positional arguments.Consider updating that comment only:
- handleSend: (options: HandleSendOptions) => void // Expects agentId string and optional fileIds + handleSend: (options: HandleSendOptions) => void // Expects full send context (message, attachments, model, agent/tools, etc.)No behavioral changes needed here.
Also applies to: 205-205, 2172-2181
server/api/chat/tools/utils.ts (1)
30-68: Fragment-only return type is consistent; consider markingsearchContextas intentionally unusedChanging
formatSearchToolResponseto returnMinimalAgentFragment[]and to return[]when there are no hits matches the fragment-centric pipeline and looks correct.
searchContextis now unused; to avoid lint noise and clarify intent you could either:
- Prefix it with
_:-export async function formatSearchToolResponse( - searchResults: VespaSearchResponse | null, - searchContext: { +export async function formatSearchToolResponse( + searchResults: VespaSearchResponse | null, + _searchContext: {or, if no longer needed anywhere, remove the parameter entirely and update callers.
server/api/chat/jaf-provider.ts (1)
79-98: Prefer structured logger overconsole.warnfor image-handling warningsThe new image handling pipeline (
parseImageFileName,buildLanguageModelImageParts, and attachment into the last user message) is well thought out: you validate docId/imageNumber for path traversal, enforceMAX_IMAGE_BYTES, and restrict MIME types.One inconsistency: these paths currently use
console.warnfor invalid references, oversize/unsupported images, and missing user messages, while the rest of the file usesLogger.debug/getLoggerWithChild. For better observability and log routing, consider switching these to the shared logger, e.g.:- if (!match) { - console.warn(`[JAF Provider] Invalid image reference: ${imageName}`) + if (!match) { + Logger.warn( + { imageName }, + "[JAF Provider] Invalid image reference" + )and similarly for the other
console.warnsites in the image-loading and attachment code. This will keep image-related issues visible in your existing logging stack.Also applies to: 100-183, 420-431
frontend/src/routes/_authenticated/chat.tsx (4)
673-721: IncludeselectedModelin effect dependencies for completenessThe
q-driven effect now passesselectedModelintohandleSend, butchatParams.selectedModelis not listed in the dependency array. Becauseqis the main trigger and is cleared after first use, this is unlikely to cause real bugs, but it does create a small mismatch between used values and dependencies.Consider adding
chatParams.selectedModelto the dependency array to keep the effect self-consistent:- }, [ - chatParams.q, - chatParams.sources, - chatParams.agentId, - chatParams.toolsList, - chatParams.metadata, - router, - ]) + }, [ + chatParams.q, + chatParams.sources, + chatParams.agentId, + chatParams.toolsList, + chatParams.metadata, + chatParams.selectedModel, + router, + ])
946-991: Options-basedhandleSendrefactor is sound; agent fallback logic is correctThe move to
HandleSendOptionsis clean, and theagentIdToUse = agentId ?? chatParams.agentIdfallback preserves URL-driven agent selection when the caller doesn’t explicitly override it. PassingselectedKbItems || []intostartStreamalso avoids having to special-case undefined downstream.Only minor thought: if
isFollowupis conceptually boolean everywhere, you might normalize it on the way in (const followup = !!isFollowup) before passing it tostartStream, but that’s purely optional given TypeScript coverage.
2135-2157: VirtualizedMessageshandleSendtyping now matches ChatBox but remains unusedPropagating the
handleSend: (options: HandleSendOptions) => voidsignature intoVirtualizedMessagesPropskeeps the type surface consistent withChatBox. WithinVirtualizedMessagesthe prop is still not used (follow-ups go throughchatBoxRef.current?.sendMessage), which is fine but slightly redundant.If this prop isn’t going to be used here, consider dropping it from
VirtualizedMessagesPropsto keep the interface lean; otherwise, it’s ready for future inline send patterns.Also applies to: 2167-2203
2855-2918: toolsList and metadata query parsing are generally robust; double-check toolsList shapeThe
toolsListtransformer is defensive and handles both array and JSON-string inputs, validating againsttoolsListItemSchemaand gracefully falling back toundefinedon parse errors. AddingagentId,selectedModel, andshareTokenintochatParamsalso aligns the router search state with the new agentic flows.One thing to verify: callers that push
toolsListinto the URL/search state are indeed sending either an already-validated array or a JSON-encoded array that matchestoolsListItemSchema. Otherwise,toolsListwill silently becomeundefined, and tools configured in the UI won’t be reflected in the initial send.If you want stronger guarantees, you could log/track parse failures in the
catchblocks, but the current behavior is acceptable for a user-facing query surface.server/api/chat/resource-access.ts (1)
20-46: Connector state aggregation is coherent; consider logging insidehasConnector/hasSyncJob
UserConnectorStateandcreateEmptyConnectorStatemap cleanly onto the connector/sync-job checks ingetUserConnectorState, andgetConnectorGateStatuscorrectly interprets those booleans for each app family. UsingisConnectorReadyForAppto gate onConnectorStatus(with special handling for Slack) is a good abstraction.Both
hasConnectorandhasSyncJobcurrently swallow all errors and just returnfalse. That’s reasonable for a user-facing “resource access summary” (fail closed), but it can hide operational/debugging signals if, for example, the DB or Vespa layer is misconfigured.If observability becomes an issue, you might log at debug level inside the
catchblocks (with some rate limiting) so infra problems are distinguishable from “user has no connector/sync-job”.Also applies to: 73-113, 316-391, 483-517
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
server/bun.lockbis excluded by!**/bun.lockb
📒 Files selected for processing (40)
.husky/pre-commit(1 hunks)frontend/src/components/ChatBox.tsx(2 hunks)frontend/src/components/DocumentChat.tsx(3 hunks)frontend/src/components/EnhancedReasoning.tsx(1 hunks)frontend/src/routes/_authenticated/agent.tsx(3 hunks)frontend/src/routes/_authenticated/chat.tsx(6 hunks)frontend/src/routes/_authenticated/index.tsx(2 hunks)server/.env.default(0 hunks)server/ai/agentPrompts.ts(0 hunks)server/api/agent/workflowAgentUtils.ts(5 hunks)server/api/chat/agent-schemas.ts(1 hunks)server/api/chat/agent-stop.ts(1 hunks)server/api/chat/agentPromptCreation.ts(1 hunks)server/api/chat/agentSteps.ts(0 hunks)server/api/chat/agents.ts(2 hunks)server/api/chat/chat.ts(12 hunks)server/api/chat/citation-utils.ts(1 hunks)server/api/chat/jaf-adapter.ts(3 hunks)server/api/chat/jaf-provider.ts(7 hunks)server/api/chat/jaf-stream.ts(0 hunks)server/api/chat/knowledgeBaseSelections.ts(1 hunks)server/api/chat/resource-access.ts(1 hunks)server/api/chat/runContextUtils.ts(1 hunks)server/api/chat/stream.ts(1 hunks)server/api/chat/tool-schemas.ts(1 hunks)server/api/chat/tools/global/index.ts(11 hunks)server/api/chat/tools/google/calendar.ts(3 hunks)server/api/chat/tools/google/contacts.ts(2 hunks)server/api/chat/tools/google/drive.ts(3 hunks)server/api/chat/tools/google/gmail.ts(3 hunks)server/api/chat/tools/knowledgeBase.ts(1 hunks)server/api/chat/tools/slack/getSlackMessages.ts(8 hunks)server/api/chat/tools/types.ts(1 hunks)server/api/chat/tools/utils.ts(2 hunks)server/api/chat/types.ts(1 hunks)server/api/chat/utils.ts(1 hunks)server/config.ts(2 hunks)server/package.json(1 hunks)server/tests/messageAgentsFragments.test.ts(1 hunks)server/types.ts(1 hunks)
💤 Files with no reviewable changes (4)
- server/.env.default
- server/api/chat/jaf-stream.ts
- server/api/chat/agentSteps.ts
- server/ai/agentPrompts.ts
🧰 Additional context used
🧠 Learnings (17)
📓 Common learnings
Learnt from: MayankBansal2004
Repo: xynehq/xyne PR: 757
File: server/db/schema/knowledgeBase.ts:151-183
Timestamp: 2025-09-02T07:03:18.671Z
Learning: MayankBansal2004 prefers to keep formatting changes separate from functional database improvements, focusing PRs on their stated objectives rather than mixing concerns.
Learnt from: Asrani-Aman
Repo: xynehq/xyne PR: 1163
File: server/api/chat/agents.ts:3941-3991
Timestamp: 2025-11-06T09:49:42.273Z
Learning: In server/api/chat/agents.ts dual RAG path and similar agent message flows, when `agentForDb.appIntegrations` is parsed, the outer function (agents.ts) only needs to extract high-level routing info like `agentAppEnums`. The detailed per-app selections (`selectedItems`) are intentionally extracted inside the inner RAG functions (`generateAnswerFromDualRag`, `UnderstandMessageAndAnswer` in chat.ts) by parsing the `agentPrompt` parameter. This is the established architectural pattern: outer function handles routing, inner function handles detailed selection parsing via `parseAppSelections(agentPromptData.appIntegrations)` and passes `selectedItem` to `searchVespaAgent` for per-app item scoping.
📚 Learning: 2025-06-16T11:56:22.752Z
Learnt from: oindrila-b
Repo: xynehq/xyne PR: 545
File: server/integrations/google/index.ts:137-145
Timestamp: 2025-06-16T11:56:22.752Z
Learning: In server/integrations/google/index.ts, both Logger (base logger) and loggerWithChild (factory for email-scoped child loggers) are intentionally maintained to handle different logging scenarios - one for when email context is not available and one for when it is available.
Applied to files:
server/types.tsserver/api/chat/jaf-provider.tsserver/api/chat/chat.ts
📚 Learning: 2025-11-06T09:49:42.273Z
Learnt from: Asrani-Aman
Repo: xynehq/xyne PR: 1163
File: server/api/chat/agents.ts:3941-3991
Timestamp: 2025-11-06T09:49:42.273Z
Learning: In server/api/chat/agents.ts dual RAG path and similar agent message flows, when `agentForDb.appIntegrations` is parsed, the outer function (agents.ts) only needs to extract high-level routing info like `agentAppEnums`. The detailed per-app selections (`selectedItems`) are intentionally extracted inside the inner RAG functions (`generateAnswerFromDualRag`, `UnderstandMessageAndAnswer` in chat.ts) by parsing the `agentPrompt` parameter. This is the established architectural pattern: outer function handles routing, inner function handles detailed selection parsing via `parseAppSelections(agentPromptData.appIntegrations)` and passes `selectedItem` to `searchVespaAgent` for per-app item scoping.
Applied to files:
server/api/chat/tools/google/drive.tsserver/api/chat/tools/google/calendar.tsserver/api/chat/tools/google/gmail.tsserver/api/chat/knowledgeBaseSelections.tsserver/api/chat/tools/utils.tsfrontend/src/routes/_authenticated/agent.tsxserver/api/chat/resource-access.tsserver/api/chat/jaf-adapter.tsserver/api/chat/agentPromptCreation.tsserver/api/chat/agents.tsserver/api/chat/tools/global/index.tsfrontend/src/routes/_authenticated/chat.tsxserver/api/chat/chat.tsserver/api/chat/agent-schemas.ts
📚 Learning: 2025-10-07T13:55:21.033Z
Learnt from: Ravishekhar7870
Repo: xynehq/xyne PR: 1059
File: frontend/src/routes/_authenticated/agent.tsx:3642-3692
Timestamp: 2025-10-07T13:55:21.033Z
Learning: In the agent.tsx file, folder navigation from Google Drive search results is now intended functionality - users should be able to navigate into searched folders via the GoogleDriveNavigation component.
Applied to files:
server/api/chat/tools/google/drive.tsserver/api/chat/tools/google/calendar.ts
📚 Learning: 2025-05-28T10:47:41.020Z
Learnt from: naSim087
Repo: xynehq/xyne PR: 484
File: server/integrations/google/sync.ts:222-222
Timestamp: 2025-05-28T10:47:41.020Z
Learning: The functions `handleGoogleDriveChange` and `getDriveChanges` in `server/integrations/google/sync.ts` are intentionally exported for future changes, even though they are not currently being imported by other modules.
Applied to files:
server/api/chat/tools/google/drive.ts
📚 Learning: 2025-08-11T08:44:08.412Z
Learnt from: MayankBansal2004
Repo: xynehq/xyne PR: 719
File: server/api/chat/chat.ts:4023-4029
Timestamp: 2025-08-11T08:44:08.412Z
Learning: In server/api/chat/chat.ts and similar files, MayankBansal2004 prefers not to add explicit type definitions for metadata.usage structures when the receiving array is already typed and default values ensure type safety through the || 0 pattern.
Applied to files:
server/api/chat/types.tsserver/api/chat/tools/google/contacts.tsfrontend/src/routes/_authenticated/index.tsxserver/api/chat/tools/slack/getSlackMessages.tsfrontend/src/routes/_authenticated/agent.tsxfrontend/src/components/ChatBox.tsxserver/api/chat/jaf-adapter.tsserver/api/chat/agents.tsserver/api/chat/tools/types.tsfrontend/src/routes/_authenticated/chat.tsx
📚 Learning: 2025-05-28T10:55:46.701Z
Learnt from: naSim087
Repo: xynehq/xyne PR: 484
File: server/integrations/google/gmail-worker.ts:293-294
Timestamp: 2025-05-28T10:55:46.701Z
Learning: There are two separate `parseMail` functions in the codebase: one in `server/integrations/google/gmail-worker.ts` with signature `(email, gmail, client, userEmail)` returning `{ mailData, insertedPdfCount, exist }`, and another in `server/integrations/google/gmail/index.ts` with signature `(email, gmail, userEmail, client, tracker?)` returning `{ mailData, exist }`. Each file calls its own local version correctly.
Applied to files:
server/api/chat/tools/google/contacts.tsserver/api/chat/tools/google/gmail.ts
📚 Learning: 2025-10-16T18:43:30.448Z
Learnt from: Himanshvarma
Repo: xynehq/xyne PR: 1122
File: frontend/src/utils/chatUtils.tsx:40-47
Timestamp: 2025-10-16T18:43:30.448Z
Learning: KB item citations (K[n_m] format) use 0-based indexing for the document index in frontend/src/utils/chatUtils.tsx, unlike regular citations which use 1-based indexing. When processing textToKbItemCitationIndex replacements, the raw parsed index should be used directly with citationUrls without subtracting 1.
Applied to files:
server/api/chat/citation-utils.tsserver/api/chat/chat.ts
📚 Learning: 2025-06-17T08:55:40.153Z
Learnt from: oindrila-b
Repo: xynehq/xyne PR: 557
File: server/integrations/google/gmail-worker.ts:313-317
Timestamp: 2025-06-17T08:55:40.153Z
Learning: In server/integrations/google/gmail-worker.ts, the log message intentionally shows totalMails (cumulative count) while sendStatsUpdate uses insertedMessagesInBatch (batch count). This is by design - logs show total progress for human readability while stats updates track batch progress for system functionality.
Applied to files:
server/api/chat/tools/google/gmail.tsserver/api/chat/tools/slack/getSlackMessages.ts
📚 Learning: 2025-06-10T05:40:04.427Z
Learnt from: naSim087
Repo: xynehq/xyne PR: 525
File: frontend/src/routes/_authenticated/admin/integrations/slack.tsx:141-148
Timestamp: 2025-06-10T05:40:04.427Z
Learning: In frontend/src/routes/_authenticated/admin/integrations/slack.tsx, the ConnectAction enum and related connectAction state (lines 141-148, 469-471) are intentionally kept for future development, even though they appear unused in the current implementation.
Applied to files:
frontend/src/routes/_authenticated/index.tsxfrontend/src/routes/_authenticated/chat.tsx
📚 Learning: 2025-07-10T08:02:41.059Z
Learnt from: naSim087
Repo: xynehq/xyne PR: 651
File: server/tests/extractEmailHeadersVal.test.ts:49-58
Timestamp: 2025-07-10T08:02:41.059Z
Learning: The test file `server/tests/extractEmailHeadersVal.test.ts` is primarily designed to verify email extraction from headers like to, from, cc, and bcc. Message-ID related functionality refactoring is planned for a future PR.
Applied to files:
server/tests/messageAgentsFragments.test.ts
📚 Learning: 2025-09-08T16:18:07.128Z
Learnt from: Aayushjshah
Repo: xynehq/xyne PR: 787
File: server/api/chat/mapper.ts:631-636
Timestamp: 2025-09-08T16:18:07.128Z
Learning: In server/api/chat/mapper.ts, the filter_query parameter for getSlackThreads should remain required: false (optional) in the mapper definition, even if it differs from the AgentTool definition in server/api/chat/tools.ts. This is an intentional design decision where parameter requirements can differ between the mapper and tool layers.
Applied to files:
server/api/chat/tools/slack/getSlackMessages.ts
📚 Learning: 2025-05-23T12:10:25.006Z
Learnt from: MayankBansal2004
Repo: xynehq/xyne PR: 460
File: frontend/src/components/ChatBox.tsx:230-231
Timestamp: 2025-05-23T12:10:25.006Z
Learning: The `showSourcesButton` state variable in ChatBox.tsx is deliberately being kept for future use when the backend support is ready, even though it's currently only being used in a conditional render.
Applied to files:
frontend/src/components/ChatBox.tsxfrontend/src/components/DocumentChat.tsx
📚 Learning: 2025-07-29T09:29:29.401Z
Learnt from: MayankBansal2004
Repo: xynehq/xyne PR: 691
File: server/db/sharedAgentUsage.ts:742-816
Timestamp: 2025-07-29T09:29:29.401Z
Learning: In server/db/sharedAgentUsage.ts, the getAgentFeedbackMessages function intentionally retrieves shareToken from the feedback JSON (feedbackData?.share_chat) rather than from the shared_chats table. This is a privacy design where share tokens are only stored in feedback JSON when users explicitly allow sharing, ensuring consent-based token exposure.
Applied to files:
server/api/chat/resource-access.tsserver/api/chat/agents.ts
📚 Learning: 2025-06-23T07:43:59.494Z
Learnt from: rahul1841
Repo: xynehq/xyne PR: 582
File: frontend/src/routes/_authenticated/agent.tsx:1654-1654
Timestamp: 2025-06-23T07:43:59.494Z
Learning: In the agent test interface dropdown (frontend/src/routes/_authenticated/agent.tsx), the agent selection dropdown specifically uses `allAgentsList` only and should only check `allAgentsList.length > 0` for visibility. It does not need to check other agent lists like `madeByMeAgentsList` or `sharedToMeAgentsList` because the dropdown is designed to show agents from the "all agents" tab only.
Applied to files:
server/api/chat/agents.ts
📚 Learning: 2025-10-27T07:43:40.563Z
Learnt from: junaid-shirur
Repo: xynehq/xyne PR: 1149
File: server/api/chat/tools.ts:1930-1937
Timestamp: 2025-10-27T07:43:40.563Z
Learning: In Vespa pagination (used in searchGoogleApps and related functions in server/api/chat/tools.ts), the limit parameter should be calculated as (params.limit || config.VespaPageSize) + offset, where offset is params.offset || 0. This is the correct behavior for Vespa's pagination system - the limit includes the offset value to properly fetch paginated results.
Applied to files:
server/api/chat/tools/global/index.ts
📚 Learning: 2025-11-06T11:56:48.044Z
Learnt from: Asrani-Aman
Repo: xynehq/xyne PR: 1163
File: server/api/chat/chat.ts:2310-2315
Timestamp: 2025-11-06T11:56:48.044Z
Learning: Repo: xynehq/xyne PR: 1163 — Scope clarification: Only Dual RAG path changes are in-scope; do not modify generateAnswerFromGivenContext in this PR.
Applied to files:
server/api/chat/chat.ts
🧬 Code graph analysis (22)
server/api/chat/tools/google/drive.ts (1)
server/api/chat/tools/utils.ts (1)
formatSearchToolResponse(30-68)
server/api/chat/tools/knowledgeBase.ts (5)
server/api/chat/tool-schemas.ts (2)
SearchKnowledgeBaseToolParams(205-213)SearchKnowledgeBaseInputSchema(215-247)server/api/chat/knowledgeBaseSelections.ts (2)
KnowledgeBaseSelection(33-37)buildKnowledgeBaseCollectionSelections(39-72)server/api/chat/tools/utils.ts (1)
parseAgentAppIntegrations(70-150)server/api/chat/tools/global/index.ts (1)
executeVespaSearch(246-369)server/shared/types.ts (1)
Apps(40-40)
server/api/chat/types.ts (2)
server/shared/types.ts (1)
Citation(117-117)frontend/src/components/CitationLink.tsx (1)
Citation(11-17)
server/api/chat/tools/google/contacts.ts (1)
server/api/chat/tools/utils.ts (1)
formatSearchToolResponse(30-68)
server/api/chat/tools/google/calendar.ts (1)
server/api/chat/tools/utils.ts (1)
formatSearchToolResponse(30-68)
server/api/chat/tools/google/gmail.ts (1)
server/api/chat/tools/utils.ts (1)
formatSearchToolResponse(30-68)
frontend/src/routes/_authenticated/index.tsx (1)
frontend/src/components/ChatBox.tsx (1)
HandleSendOptions(188-197)
server/api/chat/knowledgeBaseSelections.ts (5)
server/db/user.ts (1)
getUserByEmail(150-159)server/db/knowledgeBase.ts (1)
getCollectionsByOwner(52-70)server/db/schema/knowledgeBase.ts (2)
collectionItems(73-144)collections(20-70)server/search/utils.ts (1)
expandSheetIds(25-44)server/api/chat/utils.ts (2)
isAppSelectionMap(1112-1133)parseAppSelections(226-288)
server/tests/messageAgentsFragments.test.ts (6)
server/api/chat/types.ts (1)
MinimalAgentFragment(103-110)server/shared/types.ts (1)
Apps(40-40)server/api/chat/agent-schemas.ts (1)
AgentRunContext(141-216)server/api/chat/message-agents.ts (3)
afterToolExecutionHook(1408-1895)buildReviewPromptFromContext(2162-2219)buildDelegatedAgentFragments(1898-2007)server/api/chat/runContextUtils.ts (1)
getRecentImagesFromContext(9-50)server/api/chat/jaf-adapter.ts (1)
buildMCPJAFTools(66-174)
server/api/chat/tools/utils.ts (2)
server/api/chat/types.ts (1)
MinimalAgentFragment(103-110)server/shared/types.ts (1)
VespaSearchResults(60-60)
server/api/chat/runContextUtils.ts (3)
server/logger/index.ts (2)
getLoggerWithChild(192-200)Subsystem(15-15)server/api/chat/agent-schemas.ts (1)
AgentRunContext(141-216)server/config.ts (1)
IMAGE_CONTEXT_CONFIG(96-101)
frontend/src/routes/_authenticated/agent.tsx (1)
frontend/src/components/ChatBox.tsx (1)
HandleSendOptions(188-197)
frontend/src/components/ChatBox.tsx (2)
server/shared/types.ts (1)
AttachmentMetadata(748-748)frontend/src/types.ts (1)
ToolsListItem(64-67)
server/api/chat/resource-access.ts (8)
server/api/chat/tool-schemas.ts (2)
ResourceAccessItem(337-337)ResourceAccessSummary(338-338)server/shared/types.ts (2)
Apps(40-40)VespaFile(63-63)server/types.ts (1)
TxnOrClient(340-340)server/api/chat/knowledgeBaseSelections.ts (1)
buildKnowledgeBaseCollectionSelections(39-72)server/api/chat/utils.ts (2)
isAppSelectionMap(1112-1133)parseAppSelections(226-288)server/search/vespa.ts (2)
ifDocumentsExistInChatContainer(207-208)getDocumentOrNull(23-23)server/db/connector.ts (1)
getConnectorByAppAndEmailId(449-477)server/db/syncJob.ts (1)
getAppSyncJobsByEmail(47-63)
server/api/chat/jaf-adapter.ts (3)
server/api/chat/agent-schemas.ts (1)
AgentRunContext(141-216)server/api/chat/types.ts (1)
Citation(80-80)server/shared/types.ts (2)
Citation(117-117)Apps(40-40)
server/api/chat/agents.ts (3)
server/logger/index.ts (3)
getLogger(36-93)Subsystem(15-15)getLoggerWithChild(192-200)server/db/schema/agents.ts (2)
SelectAgent(170-170)SelectPublicAgent(179-179)server/shared/types.ts (1)
SelectPublicAgent(123-123)
frontend/src/components/DocumentChat.tsx (1)
frontend/src/components/ChatBox.tsx (1)
HandleSendOptions(188-197)
server/api/chat/tools/types.ts (1)
server/api/chat/agent-schemas.ts (1)
AgentRunContext(141-216)
server/api/chat/tools/global/index.ts (3)
server/api/chat/knowledgeBaseSelections.ts (2)
buildKnowledgeBaseCollectionSelections(39-72)KnowledgeBaseSelection(33-37)server/search/vespa.ts (2)
searchVespaAgent(94-127)searchVespa(27-92)server/api/chat/tools/utils.ts (1)
formatSearchToolResponse(30-68)
server/api/chat/tool-schemas.ts (2)
server/api/chat/agent-schemas.ts (4)
SubTaskSchema(330-338)ToolReviewFindingSchema(374-380)ListCustomAgentsInputSchema(309-322)RunPublicAgentInputSchema(345-360)server/shared/types.ts (1)
Apps(40-40)
frontend/src/routes/_authenticated/chat.tsx (1)
frontend/src/components/ChatBox.tsx (1)
HandleSendOptions(188-197)
server/api/chat/chat.ts (2)
server/api/chat/knowledgeBaseSelections.ts (2)
buildKnowledgeBaseCollectionSelections(39-72)PathExtractedInfo(13-17)server/api/chat/message-agents.ts (1)
MessageAgents(3179-4935)
| const stopController = activeStream.stopController | ||
| if (stopController && !stopController.signal.aborted) { | ||
| stopController.abort( | ||
| `[StopStreamingApi] Stop requested for chat ${streamKey}.`, | ||
| ) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
rg -n 'activeStreams\.set' server/api -C3Repository: xynehq/xyne
Length of output: 2762
Add stopController to all activeStreams.set calls for consistent abort support
The abort controller integration at lines 7551-7556 is well-implemented, but it's only effective for streams where stopController was attached. Currently, only message-agents.ts:3483 includes stopController in the activeStreams.set call. The other four calls—in agents.ts:414, agents.ts:1299, chat.ts:5100, and chat.ts:6620—only set { stream } without the controller, so the StopStreaming endpoint cannot abort those streams.
Pass a stopController to all activeStreams.set() calls to ensure graceful cancellation works uniformly across all producers.
🤖 Prompt for AI Agents
In server/api/chat/chat.ts around lines 7551 to 7556, the abort logic expects
activeStreams entries to include a stopController but several places create
activeStreams entries without it; update the other activeStreams.set calls
(agents.ts lines ~414 and ~1299, chat.ts lines ~5100 and ~6620) to create an
AbortController when starting a stream, attach it as stopController on the
activeStreams.set value (e.g., { stream, stopController }), and pass
stopController.signal into any APIs or producers that accept an AbortSignal so
the existing stopController.abort(...) call will correctly cancel those streams.
| const text = splitGroupedCitationsWithSpaces(textInput) | ||
| let match: RegExpExecArray | null | ||
| let imgMatch: RegExpExecArray | null | ||
| let citationsProcessed = 0 | ||
| let imageCitationsProcessed = 0 | ||
| let citationsYielded = 0 | ||
| let imageCitationsYielded = 0 | ||
|
|
||
| while ( | ||
| (match = textToCitationIndex.exec(text)) !== null || | ||
| (imgMatch = textToImageCitationIndex.exec(text)) !== null | ||
| ) { |
There was a problem hiding this comment.
Reset global regex lastIndex before scanning to avoid missed citations across calls
textToCitationIndex / textToImageCitationIndex are imported regexes and are likely global (/g) patterns. Using .exec directly on shared regex instances without resetting lastIndex means that subsequent calls to checkAndYieldCitationsForAgent can start from the previous end position and silently skip matches, especially when processing multiple messages in a session.
Add an explicit reset before the loop:
try {
+ // Reset global regex state before scanning this text
+ textToCitationIndex.lastIndex = 0
+ textToImageCitationIndex.lastIndex = 0
+
span.setAttribute("text_input_length", textInput.length)
span.setAttribute("results_count", fragments.length)This keeps behavior deterministic across invocations while preserving the current text/image citation handling.
[scratchpad_end] -->
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/api/chat/citation-utils.ts around lines 50 to 61, the loop uses
imported global regexes textToCitationIndex and textToImageCitationIndex with
.exec, which can retain a stale lastIndex and cause missed matches on subsequent
calls; reset both regexes' lastIndex to 0 immediately before the while loop (or
create fresh RegExp instances from them) so each invocation starts scanning from
the beginning of the input, ensuring deterministic citation detection across
calls.
| if (newFragments.length === 0 && formattedContent) { | ||
| const syntheticSource: Citation = { | ||
| docId: connectorId, | ||
| title: info.metadata?.name || `Connector ${connectorId}`, | ||
| url: "", | ||
| app: Apps.MCP, | ||
| entity: { | ||
| type: "mcp", | ||
| connectorId, | ||
| name: info.metadata?.name || connectorId, | ||
| } as unknown as Citation["entity"], | ||
| } | ||
| const syntheticId = `${connectorId}:${toolName}:${Date.now()}` | ||
| newFragments.push({ | ||
| id: syntheticId, | ||
| content: formattedContent, | ||
| source: syntheticSource, | ||
| confidence: 0.7, | ||
| }) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check for Apps.MCP in the codebase
rg "Apps\." --type=ts | head -20Repository: xynehq/xyne
Length of output: 1459
🏁 Script executed:
# Search for Apps enum definition
rg -B2 -A10 "enum Apps|const Apps" --type=ts -g '!node_modules'Repository: xynehq/xyne
Length of output: 37
🏁 Script executed:
# Search for Citation type definition
rg -B2 -A10 "type Citation|interface Citation" --type=ts -g '!node_modules'Repository: xynehq/xyne
Length of output: 8111
🏁 Script executed:
# Check the imports in the file under review
head -20 server/api/chat/jaf-adapter.tsRepository: xynehq/xyne
Length of output: 828
🏁 Script executed:
# Find MinimalCitationSchema definition
rg -B5 -A15 "MinimalCitationSchema" --type=ts -g '!node_modules'Repository: xynehq/xyne
Length of output: 1700
🏁 Script executed:
# Search for Apps enum definition to see what values are available
rg "export.*Apps|export const Apps|export enum Apps" --type=ts -g '!node_modules'Repository: xynehq/xyne
Length of output: 979
🏁 Script executed:
# Check if there are any references to MCP in the codebase
rg -i "mcp" --type=ts -g '!node_modules' | head -30Repository: xynehq/xyne
Length of output: 2119
🏁 Script executed:
# Find entitySchema definition
rg -B5 -A15 "entitySchema" --type=ts -g '!node_modules'Repository: xynehq/xyne
Length of output: 11039
🏁 Script executed:
# Check server/shared/types.ts for the Apps enum
rg -B2 -A50 "enum.*Apps|export.*Apps.*=" server/shared/types.ts --type=tsRepository: xynehq/xyne
Length of output: 37
🏁 Script executed:
# Look for the specific entity type definition
rg -B2 -A10 "type.*entity|entity.*type" server/api/chat/types.ts --type=tsRepository: xynehq/xyne
Length of output: 353
🏁 Script executed:
# Search for MCP entity type
rg -i "mcp.*entity|entity.*mcp" --type=ts -g '!node_modules'Repository: xynehq/xyne
Length of output: 143
🏁 Script executed:
# Check if there's any test or usage of this synthetic citation
rg "syntheticSource|syntheticId" --type=ts -g '!node_modules'Repository: xynehq/xyne
Length of output: 379
🏁 Script executed:
# Look at the test file that was mentioned earlier
cat -n server/tests/messageAgentsFragments.test.ts | head -100Repository: xynehq/xyne
Length of output: 3318
🌐 Web query:
@xyne/vespa-ts Apps enum MCP
💡 Result:
Do you mean the Apps TypeScript enum/type in the @xyne/vespa-ts package (i.e., its values or how to import/use it)?
🏁 Script executed:
# Also check if there's a package.json to see the version
cat package.json | grep -A5 -B5 "vespa-ts"Repository: xynehq/xyne
Length of output: 99
🏁 Script executed:
# Find package.json in subdirectories
find . -name "package.json" -type f | head -5Repository: xynehq/xyne
Length of output: 100
🏁 Script executed:
# Check if there's a types file or documentation for the entity
rg "type.*MCP|MCP.*type" --type=ts -g '!node_modules' | grep -i entityRepository: xynehq/xyne
Length of output: 37
🏁 Script executed:
# Look for how other apps handle synthetic citations or similar patterns
rg "synth|synthetic" --type=ts -g '!node_modules' -i | head -20Repository: xynehq/xyne
Length of output: 1870
🏁 Script executed:
# Look at the full test for MCP
rg -A30 "buildMCPJAFTools synthesizes fragments when MCP response only has text" server/tests/messageAgentsFragments.test.ts --type=tsRepository: xynehq/xyne
Length of output: 922
🏁 Script executed:
# Check what entity value is used in the actual jaf-adapter code more carefully
sed -n '136,155p' server/api/chat/jaf-adapter.ts | cat -nRepository: xynehq/xyne
Length of output: 983
Add an MCP entity schema or use a valid existing entity type for the synthetic citation.
The synthetic citation uses type: "mcp" at line 147, but entitySchema does not include an MCP entity type. The entitySchema union only validates: System, People, File, Notion, Mail, Event, MailAttachment, Chat, DataSource, WebSearch, and KnowledgeBase entities. The force cast as unknown as Citation["entity"] at line 151 bypasses Zod validation, which means this Citation will fail validation if passed through MinimalCitationSchema downstream. Either define a new MCP entity schema in server/shared/types.ts and add it to the entitySchema union, or use one of the existing entity types that fits the MCP tool response context.
| async function buildUserOwnedSelections( | ||
| email: string, | ||
| explicitSelections: string[], | ||
| ): Promise<KnowledgeBaseSelection[]> { | ||
| const [user] = await getUserByEmail(db, email) | ||
| if (!user) return [] | ||
|
|
||
| const ownedCollections = await getCollectionsByOwner(db, user.id) | ||
| if (!ownedCollections.length) return [] | ||
|
|
||
| const ownedCollectionIds = new Set( | ||
| ownedCollections.map((c) => Number(c.id)), | ||
| ) | ||
|
|
||
| const prefixedSelections = explicitSelections.length | ||
| ? await filterPrefixedIdsToOwner(explicitSelections, ownedCollectionIds, user.id) | ||
| : Array.from(ownedCollectionIds).map((id) => `cl-${id}`) | ||
|
|
||
| return convertPrefixedIdsToSelections(prefixedSelections) | ||
| } |
There was a problem hiding this comment.
Bug: treating collection IDs as numbers breaks user-owned & all-accessible KB selections
Here we’re coercing collection IDs to numbers even though the KB schema defines collections.id as a UUID/string. That leads to NaN values and prevents correct ownership filtering:
buildUserOwnedSelectionsdoes:const ownedCollectionIds = new Set(ownedCollections.map((c) => Number(c.id)))- When
explicitSelectionsis empty, it generates["cl-NaN"].
filterPrefixedIdsToOwnerexpectsownedCollectionIds: Set<number>and compares againstNumber(collectionId), so anycl-{uuid}IDs from appIntegrations will never be recognized as owned.- In practice this means:
- User‑owned scope will often yield bogus
collectionIdslike"NaN", and - All‑accessible scope (which builds on
buildUserOwnedSelections) also loses real collections.
- User‑owned scope will often yield bogus
This will cause the new KB flows to either miss valid user collections or send unusable IDs to Vespa.
You can fix this by treating IDs as strings end‑to‑end and comparing on string equality:
async function buildUserOwnedSelections(
email: string,
explicitSelections: string[],
): Promise<KnowledgeBaseSelection[]> {
const [user] = await getUserByEmail(db, email)
if (!user) return []
const ownedCollections = await getCollectionsByOwner(db, user.id)
if (!ownedCollections.length) return []
- const ownedCollectionIds = new Set(
- ownedCollections.map((c) => Number(c.id)),
- )
+ const ownedCollectionIds = new Set(
+ ownedCollections.map((c) => String(c.id)),
+ )
const prefixedSelections = explicitSelections.length
- ? await filterPrefixedIdsToOwner(explicitSelections, ownedCollectionIds, user.id)
- : Array.from(ownedCollectionIds).map((id) => `cl-${id}`)
+ ? await filterPrefixedIdsToOwner(
+ explicitSelections,
+ ownedCollectionIds,
+ user.id,
+ )
+ : Array.from(ownedCollectionIds).map((id) => `cl-${id}`)
return convertPrefixedIdsToSelections(prefixedSelections)
}
-async function filterPrefixedIdsToOwner(
- prefixedIds: string[],
- ownedCollectionIds: Set<number>,
+async function filterPrefixedIdsToOwner(
+ prefixedIds: string[],
+ ownedCollectionIds: Set<string>,
ownerId: number,
): Promise<string[]> {
@@
- } else if (id.startsWith("cl-")) {
- const collectionId = id.replace(/^cl[-_]/, "")
- const numericCollectionId = Number(collectionId)
- if (
- !Number.isNaN(numericCollectionId) &&
- ownedCollectionIds.has(numericCollectionId)
- ) {
- collectionIds.push(collectionId)
- }
+ } else if (id.startsWith("cl-")) {
+ const collectionId = id.replace(/^cl[-_]/, "")
+ if (ownedCollectionIds.has(collectionId)) {
+ collectionIds.push(collectionId)
+ }
}
})This keeps the ownership check aligned with the actual schema and ensures UserOwned and AllAccessible scopes correctly produce real collectionIds.
Also applies to: 138-172
🤖 Prompt for AI Agents
In server/api/chat/knowledgeBaseSelections.ts around lines 74-93 (and similarly
138-172), the code coerces collection IDs to numbers causing NaN and broken
ownership checks; update the function(s) to treat collection IDs as strings
end-to-end: stop using Number(c.id) when building ownedCollectionIds (use the
raw string id and a Set<string>), update any filterPrefixedIdsToOwner
usages/signature to accept Set<string> and compare IDs using string equality,
and when producing prefixed IDs use `cl-${id}` with the string id so user-owned
and all-accessible scopes emit correct collectionIds.
Description
Testing
Additional Notes
Summary by CodeRabbit
New Features
Refactor
Chores
Tests
✏️ Tip: You can customize this high-level summary in your review settings.