[DERCBOT-1609] Improving RAG System - Part 1 #1913
assouktim wants to merge 7 commits into theopenconversationkit:master
Conversation
Benvii
left a comment
Thanks for this PR, see my comments.
class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None
Add a description of what these sentences are. I assume it's the part of the chunk used to formulate the answer? Or maybe it's the sentence from the answer related to this chunk?
It's not described in the prompt template you gave on JIRA.
class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None
    reason: Optional[str] = None
Same question here, what's this reason field used for?
Reason why the chunk was not used (e.g., irrelevant, general background).
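A minimal sketch of how these descriptions could be carried on the model itself, so the fields are self-documenting (the wording of the field descriptions is an assumption drawn from this thread, not the merged code):

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class ChunkSentences(BaseModel):
    """Per-chunk details returned by the LLM alongside the answer (sketch)."""

    # Assumption: identifier of the chunk as given in the context (or a sequential number).
    chunk: Optional[str] = Field(default=None, description='Identifier of the chunk in the context')
    # Assumption: sentences of the chunk that were used to formulate the answer.
    sentences: Optional[List[str]] = Field(
        default=None, description='Sentences of the chunk used to formulate the answer'
    )
    # Reason why the chunk was not used (e.g. irrelevant, general background).
    reason: Optional[str] = Field(default=None, description='Reason why the chunk was not used')
```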
    footnotes: set[Footnote] = Field(description='Set of footnotes')


class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
How are chunks linked to their source / footnotes? As I understand it, there is no link between them, so we can't link them in this first implementation.
Chunks are linked to footnotes through their ID:
ChunkInfos.chunk matches Document.metadata["id"]. (I renamed ChunkSentences to ChunkInfos.)
See the statement that returns RAGResponse (in rag_chain.py).
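To make the link concrete, a small sketch under the assumption described above (ChunkInfos.chunk holds the same identifier as Document.metadata['id']; the helper name is hypothetical):

```python
from typing import Dict, List

from langchain_core.documents import Document


def map_chunks_to_documents(chunk_ids: List[str], documents: List[Document]) -> Dict[str, Document]:
    """Match each ChunkInfos.chunk identifier to the retrieved document whose metadata['id'] equals it."""
    docs_by_id = {doc.metadata['id']: doc for doc in documents}
    return {chunk_id: docs_by_id[chunk_id] for chunk_id in chunk_ids if chunk_id in docs_by_id}
```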
logger.debug('RAG chain - Use chat history: %s', len(message_history.messages) > 0)
logger.debug('RAG chain - Use RAGCallbackHandler for debugging : %s', debug)


callback_handlers = get_callback_handlers(request, custom_observability_handler, debug)
Isn't it easier to return here a Tuple or an object (with 2 fields, {records_callback_handler: [], observability_handler: []}) that explicitly splits the 2 kinds of handlers instantiated in get_callback_handlers, instead of having to split them back into 2 categories, RAGCallbackHandler and LangfuseCallbackHandler?
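A hedged sketch of what such a return type could look like (get_callback_handlers, RAGCallbackHandler and the observability handlers come from the diff; the CallbackHandlers container itself is hypothetical):

```python
from dataclasses import dataclass
from typing import List, Optional

from langchain_core.callbacks import BaseCallbackHandler


@dataclass
class CallbackHandlers:
    """Hypothetical container that splits the two kinds of handlers up front."""

    records_callback_handler: Optional[BaseCallbackHandler] = None  # e.g. RAGCallbackHandler when debug is enabled
    observability_handler: Optional[BaseCallbackHandler] = None  # e.g. Langfuse, Arize Phoenix, ...

    def as_list(self) -> List[BaseCallbackHandler]:
        """Flatten to the list passed to the chain invocation."""
        return [h for h in (self.records_callback_handler, self.observability_handler) if h is not None]
```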
    None
)
observability_handler = next(
    (x for x in callback_handlers if isinstance(x, LangfuseCallbackHandler)),
Mentioning Langfuse here isn't really generic if we have an Arize Phoenix handler in the future (which is used by SNCF Connect).
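One possible vendor-agnostic alternative for those two lines, assuming the only handler of our own in the list is RAGCallbackHandler (names taken from the diff, not a definitive fix):

```python
# Sketch: pick whichever handler is not the debug/records handler, so Langfuse,
# Arize Phoenix or any future observability handler is treated the same way.
observability_handler = next(
    (x for x in callback_handlers if not isinstance(x, RAGCallbackHandler)),
    None,
)
```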
]


def get_llm_answer(rag_chain_output) -> LLMAnswer:
    return LLMAnswer(**json.loads(rag_chain_output.strip().removeprefix("```json").removesuffix("```").strip()))
Is this manual parsing required? Doesn't LangChain have a parser for that kind of markdown parsing?
I think you can call this instead: https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser.parse
No, this is not the LLM's return value, but what the log handler records: it only sees a string.
We do use JsonOutputParser in the RAG chain.
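For reference, a small self-contained comparison of the two approaches on a markdown-fenced string (the sample payload is purely illustrative):

```python
import json

from langchain_core.output_parsers import JsonOutputParser

# Illustrative example of the kind of string the log handler records.
raw = '```json\n{"answer": "42", "footnotes": []}\n```'

# Manual fence stripping, as in the current diff:
manual = json.loads(raw.strip().removeprefix("```json").removesuffix("```").strip())

# Equivalent with LangChain's JsonOutputParser, which also strips markdown fences:
parsed = JsonOutputParser().parse(raw)

assert manual == parsed
```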
# Construct the RAG chain using the prompt and LLM,
# This chain will consume the documents retrieved by the retriever as input.
rag_chain = construct_rag_chain(question_answering_llm, rag_prompt)
if question_condensing_llm is not None:
Suggested change:
- if question_condensing_llm is not None:
+ # Fallback in case of missing condensing LLM setting using the answering LLM setting.
+ if question_condensing_llm is not None:
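A minimal sketch of the fallback this comment describes, reusing the answering LLM when no condensing LLM is configured (variable names follow the diff; the exact wiring is an assumption):

```python
# Fallback in case of missing condensing LLM setting: reuse the answering LLM.
condensing_llm = question_condensing_llm if question_condensing_llm is not None else question_answering_llm
```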
{
    "context": lambda x: json.dumps([
        {
            "chunk_id": doc.metadata['id'],
Great to have a chunk_id here, but I don't see where the LLM should reuse this chunk_id in its reply, as it's not used / present in the output type; ChunkSentences doesn't have an ID field.
We explain that in the prompt (a sketch of the context side follows this list):
- If explicit chunk identifiers are present in the context, use them; otherwise assign sequential numbers starting at 1.
- For each chunk object:
  - "chunk": "<chunk_identifier_or_sequential_number>"
'question': lambda inputs: inputs[
    'question'
],  # Override the user's original question with the condensed one
"context": lambda x: json.dumps([
It seems the code is duplicated here; I don't see construct_rag_chain used anywhere in the codebase... how is it different from create_rag_chain?
I see that create_rag_chain is imported in the tooling run_experiment.py script but it's unused 😬; actually it's execute_rag_chain that is used in the tooling, so construct_rag_chain should be removed.
Force-pushed from 11e255a to 6004fc4
@Benvii It may be best not to merge immediately and instead wait until the second part. This way, we can test and evaluate all the changes at once, rather than having to repeat the tests and impact all prompts multiple times.
bot/admin/web/src/app/rag/rag-settings/models/engines-configurations.ts (outdated, resolved)
Force-pushed from 4f6c911 to 153471d
In this first part of the RAG system improvements, we proceeded as follows: