Race condition with node joins #242

@LetThereBeDwight

I've only looked at this for the ReplicatedCache adapter (which might be the only place where it really matters). We're trying to use it, and during the sync process we're seeing that syncs don't occur and new nodes don't see the other nodes. Putting a 500ms sleep in the init process of the ReplicatedCache bootstrap showed us that we did (eventually) see the new nodes.

We moved the caches down the supervisor child list, after the libcluster config/child, but without the sleep this had no effect.

We also tried explicitly starting the cache children after the initial supervisor start; again, without the sleep this had no effect. Following the same pattern, we moved the sleep out of the bootstrap and into our application, before adding the children, and the bootstrap was then able to see all the nodes and sync properly.
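For illustration, here is a minimal sketch of that last workaround, with the fixed sleep swapped for a bounded poll of `Node.list/0`. The module name `ClusterWait`, the function `wait_for_nodes/3`, and the timing values are hypothetical and are not part of Nebulex or libcluster; the only calls used are `Node.list/0`, `System.monotonic_time/1`, and `Process.sleep/1` from the Elixir standard library.

```elixir
defmodule ClusterWait do
  @moduledoc false
  # Hypothetical helper: not part of Nebulex or libcluster.

  # Polls Node.list/0 until at least `min_nodes` peers are connected,
  # or until `timeout_ms` milliseconds have elapsed.
  # Returns :ok or {:error, :timeout}.
  def wait_for_nodes(min_nodes, timeout_ms, interval_ms \\ 50) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    do_wait(min_nodes, deadline, interval_ms)
  end

  defp do_wait(min_nodes, deadline, interval_ms) do
    cond do
      # Enough peers are already visible to this node.
      length(Node.list()) >= min_nodes ->
        :ok

      # Give up once the deadline has passed.
      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      # Otherwise wait a little and check again.
      true ->
        Process.sleep(interval_ms)
        do_wait(min_nodes, deadline, interval_ms)
    end
  end
end
```

In our setup this would be called in the application after the supervisor (with the libcluster child) has started and before the cache children are added, in the same place where the sleep currently sits. It still only approximates "libcluster has finished joining", since the expected number of peers has to be guessed, and on timeout the caller has to decide whether to start the caches anyway.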

Any thoughts on how to handle this better, or on whether something can be done while bootstrapping the ReplicatedCache adapter to ensure that libcluster's join process has completed? I see that #232 tackles a possibly similar issue, but not in a way that is ideal for allowing data syncs.
