[Feature Request] Add a small "knowledge retrieval failure modes" note for STaRK users (docs only) #33

@onestardao

Hi STaRK team,

Thank you for releasing STaRK. A benchmark that combines text and knowledge relations is very helpful for studying retrieval and reasoning together.

I have been working on compact failure-mode maps for retrieval and RAG systems and recently contributed a robustness-related entry to Harvard MIMS Lab’s ToolUniverse. When working with STaRK-like settings, I often see recurring issues:

  • systems retrieve the right entities but miss key relations
  • spurious relations are retrieved and used, leading to incorrect reasoning
  • evaluation focuses on surface-form answers without checking relational grounding

I would like to propose a small, documentation-only note for STaRK users.

Proposed feature

Add a short markdown page under the docs, for example:

stark_knowledge_retrieval_failure_modes.md

The page could:

  1. Describe typical failure modes in text-plus-relation retrieval.
  2. For each mode, list:
    • example symptoms in STaRK evaluations
    • likely causes (indexing, representation, scoring)
    • simple diagnostics (e.g., inspecting retrieved relation sets).
  3. Suggest a small checklist for error analysis:
    • whether the right entities were retrieved
    • whether the right relations were retrieved
    • where the reasoning step failed.
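
To make the checklist concrete, here is a minimal Python sketch of how one query could be triaged. It assumes that gold entities and gold relations are available per query and that relations can be represented as (head, relation type, tail) triples. Both are my assumptions for illustration, not something STaRK's tooling is confirmed to expose. `QueryTrace` and `diagnose` are hypothetical names, not part of STaRK.

```python
from dataclasses import dataclass

# Hypothetical representation: (head entity, relation type, tail entity).
Relation = tuple[str, str, str]


@dataclass(frozen=True)
class QueryTrace:
    """What was expected and what was retrieved for one query (assumed schema)."""
    gold_entities: frozenset[str]
    gold_relations: frozenset[Relation]
    retrieved_entities: frozenset[str]
    retrieved_relations: frozenset[Relation]
    answer_correct: bool


def diagnose(trace: QueryTrace) -> str:
    """Walk the checklist in order and report the first stage that failed."""
    # 1. Were all the right entities retrieved?
    if not trace.gold_entities <= trace.retrieved_entities:
        return "entity_miss"
    # 2. Were all the right relations retrieved?
    if not trace.gold_relations <= trace.retrieved_relations:
        return "relation_miss"
    # 3. Retrieval covered the gold set, so a wrong answer points downstream.
    if not trace.answer_correct:
        # Extra relations were retrieved; they may (or may not) have been used.
        if trace.retrieved_relations - trace.gold_relations:
            return "possible_spurious_relation"
        return "reasoning_failure"
    return "ok"


# Example: right entities, but the key relation is missing.
trace = QueryTrace(
    gold_entities=frozenset({"drug_A", "disease_B"}),
    gold_relations=frozenset({("drug_A", "treats", "disease_B")}),
    retrieved_entities=frozenset({"drug_A", "disease_B", "gene_C"}),
    retrieved_relations=frozenset({("drug_A", "targets", "gene_C")}),
    answer_correct=False,
)
print(diagnose(trace))  # relation_miss
```

The checks run in a fixed order so each query gets one label. Note that `possible_spurious_relation` only says extra relations were present, not that the system relied on them; confirming that would need inspection of the reasoning step itself.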

Motivation

  • STaRK is a good testbed for understanding retrieval and relational reasoning together.
  • A short failure-mode note would help users interpret errors with more structure and design better follow-up experiments.
  • This is documentation only and can be kept concise.

If this seems useful and in scope, I would be happy to contribute a concise draft via PR.

Thank you for considering this.
