This repository was archived by the owner on Jan 13, 2021. It is now read-only.

Database Refactor - Better HA + Fault Tolerance #473

@bonedaddy

Overview

In light of the temporary impact to IPFS HTTP API directory uploads caused by a database sync issue, we need to refactor the way our database tooling works. We need better high availability (HA) and fault tolerance, so that if the incident repeats, we can automatically fail over to a working database.

Our current database system consists of three nodes, all in logical replication. This allows us to conduct manual fail-over in the event of an incident, and ensures that we have replicated copies of our databases on top of our hourly backups. However, this isn't as smooth as it could be.

While this endeavour falls on me to accomplish, it has the help wanted label, as this is an area of database administration I'm not familiar with and I would welcome community input.

End Goals

  • Multi-master replication
  • Automatic fail-over
  • Load balanced requests
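
To make the automatic fail-over goal concrete, here is a minimal Python sketch of the selection logic only: walk an ordered list of database nodes and use the first one that passes a health check. It is not tied to any of the solutions listed below. The hostnames, the `pick_node` and `tcp_healthy` names, and the use of a plain TCP connect as the health check are all assumptions for illustration; a real check would also need to look at replication state, not just whether the port answers.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Node = Tuple[str, int]  # (host, port); hypothetical node description


def tcp_healthy(node: Node, timeout: float = 1.0) -> bool:
    """Assumed health check: True if a TCP connection succeeds in time."""
    try:
        with socket.create_connection(node, timeout=timeout):
            return True
    except OSError:
        return False


def pick_node(
    nodes: Iterable[Node],
    is_healthy: Callable[[Node], bool] = tcp_healthy,
) -> Optional[Node]:
    """Return the first node that passes the health check, or None."""
    for node in nodes:
        if is_healthy(node):
            return node
    return None


if __name__ == "__main__":
    # Hypothetical hostnames for the three replicated nodes.
    candidates = [("db1.example", 5432), ("db2.example", 5432), ("db3.example", 5432)]
    print(pick_node(candidates))
```

The health check is passed in as a function so that it can later be swapped for something stricter, such as a query that verifies the node is accepting writes.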

Research

Track research notes and similar material here.

Possible Implementations

Will contain analysis (pros, cons, and so on) of the available solutions.

Standby Databases

Clusters

DRBD (Distributed Replicated Block Device)

  • Corosync + Pacemaker + DRBD

Pgpool II

Citus CE

Postgres-XL

CockroachDB

Bucardo

Links
