[DERCBOT-1609] Improving RAG System - Part 1 #1913
assouktim wants to merge 7 commits into theopenconversationkit:master
Conversation
Benvii
left a comment
Thanks for this PR, see my comments.
class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None
Add a description of what these sentences are. I assume it's the part of the chunk used to formulate the answer? Or maybe it's the sentence from the answer related to this chunk?
It's not described in the prompt template you gave on JIRA.
class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None
    reason: Optional[str] = None
Same question here, what's this reason field used for?
Reason why the chunk was not used (e.g., irrelevant, general background).
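A minimal sketch of how these descriptions could be carried on the model itself, so the fields are self-documenting (the wording of the field descriptions is an assumption drawn from this thread, not the merged code):

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class ChunkSentences(BaseModel):
    """Per-chunk details returned by the LLM alongside the answer (sketch)."""

    # Assumption: identifier of the chunk as given in the context (or a sequential number).
    chunk: Optional[str] = Field(default=None, description='Identifier of the chunk in the context')
    # Assumption: sentences of the chunk that were used to formulate the answer.
    sentences: Optional[List[str]] = Field(
        default=None, description='Sentences of the chunk used to formulate the answer'
    )
    # Reason why the chunk was not used (e.g. irrelevant, general background).
    reason: Optional[str] = Field(default=None, description='Reason why the chunk was not used')
```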
    footnotes: set[Footnote] = Field(description='Set of footnotes')


class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
How are chunks linked to their source / footnotes? As I understand it, there is no link between them, so we can't link them in this first implementation.
Chunks are linked to footnotes through their ID:
ChunkInfos.chunk matches Document.metadata["id"]. (I renamed ChunkSentences to ChunkInfos.)
See the statement that returns RAGResponse (in rag_chain.py).
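To make the link concrete, a small sketch under the assumption described above (ChunkInfos.chunk holds the same identifier as Document.metadata['id']; the helper name is hypothetical):

```python
from typing import Dict, List

from langchain_core.documents import Document


def map_chunks_to_documents(chunk_ids: List[str], documents: List[Document]) -> Dict[str, Document]:
    """Match each ChunkInfos.chunk identifier to the retrieved document whose metadata['id'] equals it."""
    docs_by_id = {doc.metadata['id']: doc for doc in documents}
    return {chunk_id: docs_by_id[chunk_id] for chunk_id in chunk_ids if chunk_id in docs_by_id}
```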
logger.debug('RAG chain - Use chat history: %s', len(message_history.messages) > 0)
logger.debug('RAG chain - Use RAGCallbackHandler for debugging : %s', debug)


callback_handlers = get_callback_handlers(request, custom_observability_handler, debug)
Isn't it easier to return here a Tuple or an object (with 2 fields, {records_callback_handler: [], observability_handler: []}) that explicitly splits the 2 kinds of handlers instantiated in get_callback_handlers, instead of having to split them back into 2 categories, RAGCallbackHandler and LangfuseCallbackHandler?
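A hedged sketch of what such a return type could look like (get_callback_handlers, RAGCallbackHandler and the observability handlers come from the diff; the CallbackHandlers container itself is hypothetical):

```python
from dataclasses import dataclass
from typing import List, Optional

from langchain_core.callbacks import BaseCallbackHandler


@dataclass
class CallbackHandlers:
    """Hypothetical container that splits the two kinds of handlers up front."""

    records_callback_handler: Optional[BaseCallbackHandler] = None  # e.g. RAGCallbackHandler when debug is enabled
    observability_handler: Optional[BaseCallbackHandler] = None  # e.g. Langfuse, Arize Phoenix, ...

    def as_list(self) -> List[BaseCallbackHandler]:
        """Flatten to the list passed to the chain invocation."""
        return [h for h in (self.records_callback_handler, self.observability_handler) if h is not None]
```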
    None
)
observability_handler = next(
    (x for x in callback_handlers if isinstance(x, LangfuseCallbackHandler)),
Mentioning Langfuse here isn't really generic if we have an Arize Phoenix handler in the future (which is used by SNCF Connect).
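One possible vendor-agnostic alternative for those two lines, assuming the only handler of our own in the list is RAGCallbackHandler (names taken from the diff, not a definitive fix):

```python
# Sketch: pick whichever handler is not the debug/records handler, so Langfuse,
# Arize Phoenix or any future observability handler is treated the same way.
observability_handler = next(
    (x for x in callback_handlers if not isinstance(x, RAGCallbackHandler)),
    None,
)
```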
]


def get_llm_answer(rag_chain_output) -> LLMAnswer:
    return LLMAnswer(**json.loads(rag_chain_output.strip().removeprefix("```json").removesuffix("```").strip()))
Is this manual parsing required? Doesn't LangChain have a parser for that kind of markdown parsing?
I think you can call this instead: https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser.parse
No, this is not the LLM's return value, but what the log handler records: it only sees a string.
We do use JsonOutputParser in the RAG chain.
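For reference, a small self-contained comparison of the two approaches on a markdown-fenced string (the sample payload is purely illustrative):

```python
import json

from langchain_core.output_parsers import JsonOutputParser

# Illustrative example of the kind of string the log handler records.
raw = '```json\n{"answer": "42", "footnotes": []}\n```'

# Manual fence stripping, as in the current diff:
manual = json.loads(raw.strip().removeprefix("```json").removesuffix("```").strip())

# Equivalent with LangChain's JsonOutputParser, which also strips markdown fences:
parsed = JsonOutputParser().parse(raw)

assert manual == parsed
```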
# Construct the RAG chain using the prompt and LLM,
# This chain will consume the documents retrieved by the retriever as input.
rag_chain = construct_rag_chain(question_answering_llm, rag_prompt)
if question_condensing_llm is not None:
Suggested change:
- if question_condensing_llm is not None:
+ # Fallback in case of missing condensing LLM setting using the answering LLM setting.
+ if question_condensing_llm is not None:
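A minimal sketch of the fallback this comment describes, reusing the answering LLM when no condensing LLM is configured (variable names follow the diff; the exact wiring is an assumption):

```python
# Fallback in case of missing condensing LLM setting: reuse the answering LLM.
condensing_llm = question_condensing_llm if question_condensing_llm is not None else question_answering_llm
```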
{
    "context": lambda x: json.dumps([
        {
            "chunk_id": doc.metadata['id'],
Great to have a chunk_id here, but I don't see where the LLM should reuse this chunk_id in its reply, as it's not used / present in the output type; ChunkSentences doesn't have an ID field.
We explain that in the prompt (a sketch of the context side follows this list):
- If explicit chunk identifiers are present in the context, use them; otherwise assign sequential numbers starting at 1.
- For each chunk object:
  - "chunk": "<chunk_identifier_or_sequential_number>"
'question': lambda inputs: inputs[
    'question'
],  # Override the user's original question with the condensed one
"context": lambda x: json.dumps([
It seems the code is duplicated here; I don't see construct_rag_chain used anywhere in the codebase... how is it different from create_rag_chain?
I see that create_rag_chain is imported in the tooling run_experiment.py script but it's unused 😬; actually it's execute_rag_chain that is used in the tooling, so construct_rag_chain should be removed.
Force-pushed from 11e255a to 6004fc4
@Benvii It may be best not to merge immediately and instead wait until the second part. This way, we can test and evaluate all the changes at once, rather than having to repeat the tests and impact all prompts multiple times.
bot/admin/web/src/app/rag/rag-settings/models/engines-configurations.ts (outdated, resolved)
Force-pushed from 4f6c911 to 153471d
In this first part of the RAG system improvements, we proceeded as follows: