GH-5255: Fix streaming to return tokens incrementally in SemanticCach…#5391

Merged
markpollack merged 1 commit into spring-projects:main from sobychacko:gh-5255
Feb 4, 2026
@sobychacko
Contributor

…eAdvisor

Fixes: #5255

The adviseStream() method was using collectList().flatMapMany(), which collected all response chunks before returning any of them to the user, defeating the purpose of streaming.

Replace it with ChatClientMessageAggregator, which:

  • Returns tokens to the user immediately as they arrive
  • Aggregates the response in the background
  • Caches asynchronously when the stream completes
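
The difference between the two approaches can be illustrated with a plain-Java analogy. This is only a sketch and not the advisor's Reactor code: the class `StreamingCacheSketch`, its methods, and the `cacheWriter` callback are hypothetical names, and everything runs synchronously here, whereas the real change caches asynchronously once the stream completes. The returned log records the order in which the model produces tokens and the user receives them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class StreamingCacheSketch {

    // Hypothetical model stream: logs each token as it is produced, then hands it to the sink.
    static void produce(List<String> tokens, List<String> log, Consumer<String> sink) {
        for (String token : tokens) {
            log.add("model:" + token);
            sink.accept(token);
        }
    }

    // Old behavior (analogous to collectList().flatMapMany()):
    // buffer every token first, cache, and only then emit to the user.
    public static List<String> collectThenEmit(List<String> tokens, Consumer<String> cacheWriter) {
        List<String> log = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        produce(tokens, log, buffer::add);
        cacheWriter.accept(String.join("", buffer));
        for (String token : buffer) {
            log.add("user:" + token);
        }
        return log;
    }

    // New behavior (analogous to an aggregator on the side):
    // forward each token immediately, accumulate a copy, cache when the stream ends.
    public static List<String> passThrough(List<String> tokens, Consumer<String> cacheWriter) {
        List<String> log = new ArrayList<>();
        StringBuilder aggregate = new StringBuilder();
        produce(tokens, log, token -> {
            log.add("user:" + token);   // the user sees the token right away
            aggregate.append(token);    // aggregation happens alongside delivery
        });
        cacheWriter.accept(aggregate.toString());
        return log;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("Hel", "lo", "!");
        System.out.println(collectThenEmit(tokens, text -> { }));
        System.out.println(passThrough(tokens, text -> { }));
    }
}
```

In the first log every `model:` entry precedes every `user:` entry; in the second they alternate, which is the incremental delivery this PR restores.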

Add testStreamingWithAdvisor() integration test that verifies:

  • Cache miss returns multiple chunks (true streaming)
  • Cache hit returns a single chunk (Flux.just)
  • Cached content matches returned content
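
The three properties the test checks can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the actual testStreamingWithAdvisor() code: `CacheHitMissSketch`, `callModel`, and the exact-match `HashMap` are hypothetical stand-ins (the real advisor matches on semantic similarity and returns a Flux rather than a List).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CacheHitMissSketch {

    // Hypothetical exact-match cache; the real advisor looks up by semantic similarity.
    private final Map<String, String> cache = new HashMap<>();

    // Hypothetical model call that always answers in three chunks.
    private List<String> callModel(String prompt) {
        return List.of("Paris ", "is the ", "capital.");
    }

    public List<String> stream(String prompt) {
        String cached = cache.get(prompt);
        if (cached != null) {
            // Cache hit: the whole answer comes back as one chunk (like Flux.just).
            return List.of(cached);
        }
        // Cache miss: return the chunks and store the aggregated text for next time.
        List<String> chunks = callModel(prompt);
        cache.put(prompt, String.join("", chunks));
        return chunks;
    }

    public static void main(String[] args) {
        CacheHitMissSketch advisor = new CacheHitMissSketch();
        List<String> miss = advisor.stream("capital of France?");
        List<String> hit = advisor.stream("capital of France?");
        System.out.println(miss.size() + " chunks on miss, " + hit.size() + " on hit");
        System.out.println("content matches: " + String.join("", miss).equals(hit.get(0)));
    }
}
```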

Signed-off-by: Soby Chacko <soby.chacko@broadcom.com>
@sobychacko sobychacko added this to the 2.0.0-M3 milestone Feb 2, 2026
@markpollack markpollack merged commit 9785cb2 into spring-projects:main Feb 4, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

[Redis] SemanticCacheAdvisor: Streaming collects all chunks before returning to user
