
[Features] Support for Document and Collection Summaries #1095

@iziang

Description:
We propose adding support for generating summaries at both the document and collection levels, built on the existing asynchronous indexing task framework.

Feature Details:

  1. Document Summary Generation

    • Implement automatic content summarization for individual documents.
    • Treat the summary as a new type of index, managed alongside existing index types.
    • Integrate the summary generation process into the current indexing task framework.
  2. Collection Summary Generation

    • Enable collection-level summaries, which aggregate or synthesize information from document summaries within the collection.
    • Use the collection summary as the collection’s description.
    • Apply collection summaries in agent and MCP scenarios, allowing LLMs to automatically select the most relevant collection based on user queries.
  3. Task Framework Integration

    • Both document and collection summary generation should be handled by the asynchronous indexing task framework.
    • Support both manual and automatic triggers for summary generation and updates.
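
The feature details above can be sketched roughly as follows. This is a minimal illustration, not the project's actual task framework: `IndexType`, `Document`, `Collection`, `summarize`, and both `build_*` functions are hypothetical names, and `summarize` is a word-truncating placeholder standing in for an LLM call.

```python
import asyncio
from dataclasses import dataclass, field
from enum import Enum


class IndexType(Enum):
    # Hypothetical index types; SUMMARY is the new type proposed in this issue.
    VECTOR = "vector"
    FULLTEXT = "fulltext"
    SUMMARY = "summary"


@dataclass
class Document:
    doc_id: str
    content: str
    indexes: dict = field(default_factory=dict)  # IndexType -> index payload


@dataclass
class Collection:
    name: str
    documents: list
    description: str = ""


async def summarize(text: str, max_words: int = 12) -> str:
    # Placeholder for an LLM call (assumption): keep the first max_words words.
    await asyncio.sleep(0)
    return " ".join(text.split()[:max_words])


async def build_document_summary(doc: Document) -> None:
    # Store the summary next to the other indexes of the document.
    doc.indexes[IndexType.SUMMARY] = await summarize(doc.content)


async def build_collection_summary(collection: Collection) -> None:
    # Run the per-document summary tasks concurrently, then synthesize
    # the collection description from the document summaries.
    await asyncio.gather(*(build_document_summary(d) for d in collection.documents))
    joined = " ".join(d.indexes[IndexType.SUMMARY] for d in collection.documents)
    collection.description = await summarize(joined, max_words=20)
```

Calling `asyncio.run(build_collection_summary(collection))` fills in each document's `SUMMARY` index and then sets `collection.description`. In a real deployment the same two steps would presumably be scheduled as tasks by the existing indexing framework, triggered either manually or automatically.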

Benefits:

  • Better search and retrieval: users can read a summary before opening a full document or collection.
  • Improved agent and MCP performance by providing concise, machine-readable collection descriptions.
  • Flexible and scalable summary management through the existing indexing infrastructure.
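
To illustrate how a collection description could drive collection selection in agent and MCP scenarios, here is a small sketch. It swaps the LLM for a plain word-overlap score so the example stays self-contained; `tokenize` and `select_collection` are hypothetical helpers, not existing APIs.

```python
import re
from typing import Dict, Optional


def tokenize(text: str) -> set:
    # Lowercase and split into alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_collection(query: str, descriptions: Dict[str, str]) -> Optional[str]:
    """Return the collection whose description shares the most words with the query.

    Word overlap is a stand-in (assumption) for the LLM-based choice
    described in the issue. Returns None when nothing overlaps.
    """
    query_tokens = tokenize(query)
    best_name, best_score = None, 0
    for name, description in descriptions.items():
        score = len(query_tokens & tokenize(description))
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

With an LLM in the loop, the `descriptions` mapping would be passed in the prompt or exposed as tool metadata instead of being scored locally; the point is that each collection needs a concise description to choose from.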

Additional Notes:

  • Collection summaries depend on document summaries, so a collection summary should be generated only once the document summaries it draws on are available.
  • The design should ensure extensibility for future summary-related features.
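
The dependency between collection and document summaries could be enforced with a readiness check before a collection summary task is scheduled. The sketch below is one possible shape under stated assumptions: `SummaryStatus`, `can_build_collection_summary`, and the `min_ready_ratio` parameter are hypothetical, and whether all document summaries must be complete or only a fraction of them is a design decision the issue leaves open.

```python
from enum import Enum
from typing import Dict


class SummaryStatus(Enum):
    # Hypothetical per-document summary states.
    PENDING = "pending"
    COMPLETE = "complete"
    FAILED = "failed"


def can_build_collection_summary(
    doc_statuses: Dict[str, SummaryStatus],
    min_ready_ratio: float = 1.0,
) -> bool:
    """Return True when enough document summaries are available.

    doc_statuses maps document IDs to their summary status. By default
    every document summary must be COMPLETE (an assumption); lowering
    min_ready_ratio allows a partial collection summary.
    """
    if not doc_statuses:
        return False  # nothing to aggregate yet
    ready = sum(1 for s in doc_statuses.values() if s is SummaryStatus.COMPLETE)
    return ready / len(doc_statuses) >= min_ready_ratio
```

An automatic trigger could run this check whenever a document summary task finishes, while a manual trigger could call it to decide whether to proceed or report which documents are still pending.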
