scale the spooler component #8

@m-rau

Description

Describe the enhancement

The kodosumi spooler is the backend component that collects and stores the events of active flow executions. The current implementation stores the execution event stream in sqlite3 files, with one dedicated sqlite3 file per flow execution.
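To make the one-file-per-execution layout concrete, here is a minimal sketch using only the Python standard library. It is not the kodosumi implementation: the function names, the `events` table schema, and the `<execution_id>.sqlite3` file naming are all assumptions made for illustration.

```python
import json
import sqlite3
from pathlib import Path


def event_db_path(root: Path, execution_id: str) -> Path:
    # Hypothetical naming scheme: one sqlite3 file per flow execution.
    return root / f"{execution_id}.sqlite3"


def append_event(root: Path, execution_id: str, kind: str, payload: dict) -> None:
    """Append one event to the execution's own sqlite3 file."""
    path = event_db_path(root, execution_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS events ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                "kind TEXT NOT NULL, "
                "payload TEXT NOT NULL)"
            )
            conn.execute(
                "INSERT INTO events (kind, payload) VALUES (?, ?)",
                (kind, json.dumps(payload)),
            )
    finally:
        conn.close()


def read_events(root: Path, execution_id: str) -> list:
    """Return the execution's events as (kind, payload) tuples in insert order."""
    conn = sqlite3.connect(event_db_path(root, execution_id))
    try:
        rows = conn.execute(
            "SELECT kind, payload FROM events ORDER BY id"
        ).fetchall()
    finally:
        conn.close()
    return [(kind, json.loads(payload)) for kind, payload in rows]
```

Because each file lives on the local disk of the process that writes it, a second spooler instance on another host cannot see these events, which is the scaling limit this issue is about.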

The spooler is therefore a single point of failure (SPOF). Note that no events are lost when the spooler fails and restarts: events that the spooler has not yet gathered remain in Ray's shared object store. After a restart, the spooler resumes spooling and continues to materialise the event stream to disk (sqlite3).

The problem is that the spooler does not scale with the current design. To scale the spooler, i.e. to run multiple redundant spooler instances, a central data store is required to share the event stream across those instances.

Suggested solutions and enhancements:

  • refactor the sqlite3 implementation to plug in different storage backends (e.g. Redis, MongoDB, PostgreSQL)
  • refactor the spooler component to "assign" flow executions and their event streams to a dedicated spooler instance
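The two suggestions above could be sketched roughly as follows. This is only an illustration under stated assumptions: `EventStore`, `InMemoryStore`, `assign_spooler`, and `Spooler` are hypothetical names that do not exist in kodosumi, and the in-memory store merely stands in for a shared backend such as Redis, MongoDB, or PostgreSQL.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from typing import Protocol


class EventStore(Protocol):
    """Hypothetical storage interface each backend would implement."""

    def append(self, execution_id: str, event: str) -> None: ...

    def read(self, execution_id: str) -> list[str]: ...


class InMemoryStore:
    """Stand-in for a central store shared by all spooler instances."""

    def __init__(self) -> None:
        self._events: dict[str, list[str]] = defaultdict(list)

    def append(self, execution_id: str, event: str) -> None:
        self._events[execution_id].append(event)

    def read(self, execution_id: str) -> list[str]:
        return list(self._events.get(execution_id, []))


def assign_spooler(execution_id: str, spoolers: list[str]) -> str:
    """Deterministically map a flow execution to one spooler name."""
    if not spoolers:
        raise ValueError("at least one spooler is required")
    # sha256 is stable across processes, unlike Python's built-in hash().
    digest = hashlib.sha256(execution_id.encode("utf-8")).digest()
    return spoolers[int.from_bytes(digest[:8], "big") % len(spoolers)]


class Spooler:
    """Hypothetical spooler that only materialises executions assigned to it."""

    def __init__(self, name: str, peers: list[str], store: EventStore) -> None:
        self.name = name
        self.peers = peers
        self.store = store

    def handle(self, execution_id: str, event: str) -> bool:
        if assign_spooler(execution_id, self.peers) != self.name:
            return False  # another instance owns this execution
        self.store.append(execution_id, event)
        return True
```

With this shape, every instance that sees the same list of peers agrees on the owner of an execution without coordinating. The modulo mapping reassigns many executions whenever the set of spoolers changes; a consistent-hashing ring or a lease held in the central store would be possible alternatives if that turns out to matter.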

Alternatives considered

none

Metadata

Assignees

No one assigned

Labels

enhancement: New feature or request
