
[DPE-2897] Cross-region async replication #368

Closed · wants to merge 21 commits

Conversation

marceloneppel (Member) commented Jan 22, 2024

Issue

Connecting two PostgreSQL clusters through async replication in a cross-region setup is currently impossible.

Solution

Two endpoints were created to relate the Juju applications of the two PostgreSQL clusters (async-primary and async-replica). The relation exchanges information about the topology of the clusters (IP addresses and the endpoint of one of the sync_standby members of the main cluster - the cluster that will replicate its data to the other cluster, the secondary cluster) and also about the secrets from the main cluster.
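As a rough illustration (not the charm's actual API; the helper name and databag keys are assumptions), the data exchanged over the async-primary relation can be pictured as a small payload of topology plus secret contents:

```python
import json


def build_async_primary_databag(sync_standby_endpoint, member_ips, secrets):
    """Hypothetical sketch: assemble the relation data the main cluster
    shares with the secondary cluster (keys are illustrative, not the
    charm's real schema)."""
    return {
        # Endpoint of one of the sync_standby members of the main cluster.
        "endpoint": sync_standby_endpoint,
        # IP addresses of the main cluster members.
        "members": json.dumps(sorted(member_ips)),
        # Secrets (e.g. replication credentials) from the main cluster.
        "secrets": json.dumps(secrets),
    }


databag = build_async_primary_databag(
    "postgresql-k8s-1.postgresql-k8s-endpoints",
    ["10.1.0.7", "10.1.0.5"],
    {"replication-password": "..."},
)
```

When the main cluster's topology changes, the charm would republish this payload so the secondary cluster can follow the new endpoint.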

To promote one cluster to be the main cluster, run the promote-standby-cluster action, which enables replication between the clusters.

Also, if anything changes in the main cluster topology (like the sync_standby crashing), the endpoint of the main cluster is updated in the secondary cluster to keep replication working. If something happens with the standby_leader in the secondary cluster, another member takes that role, and the replication continues to work.

src/relations/async_replication.py contains all that logic, including the sharing of the IP addresses of the secondary cluster, which are needed to enable replication connections from that cluster's units to the main cluster through the pg_hba rules.
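A minimal sketch of how such pg_hba rules could be derived from the secondary cluster's unit IPs (the helper is hypothetical; the rule shape follows the pg_hba template in this PR):

```python
def replication_hba_rules(unit_ips, enable_tls=False):
    """Hypothetical helper: one pg_hba rule per secondary-cluster unit,
    allowing the replication user to connect (hostssl when TLS is on)."""
    conn_type = "hostssl" if enable_tls else "host"
    return [
        f"{conn_type} replication replication {ip}/32 md5"
        for ip in sorted(unit_ips)
    ]
```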

All units are stopped so that the cluster info can be deleted from the Patroni K8s endpoint, allowing a new cluster to start replicating from the main cluster as a standby. This requires coordinating the units so that the new cluster starts only after all units have stopped and the cluster information in the K8s endpoint is empty. That coordination logic lives in src/coordinator_ops.py.
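The coordination rule can be sketched as a pure predicate (names are assumptions, not the actual src/coordinator_ops.py API): the new standby cluster may bootstrap only once every unit reports itself stopped and Patroni's cluster info is gone from the K8s endpoint (Patroni keeps it in endpoint annotations, e.g. the initialize key):

```python
def may_bootstrap_standby(unit_states, endpoint_annotations):
    """Hypothetical sketch of the coordination check: start the new
    cluster only after all units stopped and the Patroni cluster info
    was deleted from the K8s endpoint annotations."""
    all_stopped = all(state == "stopped" for state in unit_states.values())
    info_cleared = "initialize" not in endpoint_annotations
    return all_stopped and info_cleared
```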

Integration tests are implemented at #369.
Unit tests should be implemented in a separate PR too.

phvalguima and others added 7 commits October 24, 2023 01:18
Deploy two models, each with 1x postgresql
Then, configure async replication as follows:
  $ juju switch psql-1
  $ juju offer postgresql-k8s:async-primary async-primary  # async-primary is the relation provided by the leader
  $ juju switch psql-2
  $ juju consume admin/psql-1.async-primary  # consume the primary relation
  $ juju relate postgresql-k8s:async-replica async-primary  # Both applications are now related; postgresql-k8s in model psql-2 is the standby-leader

Now, run the action:
  $ juju run -m psql-1 postgresql-k8s/0 promote-standby-cluster  # move postgresql-k8s in model psql-1 to be the leader cluster

Run the following command to check status:
  $ PATRONI_KUBERNETES_LABELS='{application: patroni, cluster-name: patroni-postgresql-k8s}' \
    PATRONI_KUBERNETES_NAMESPACE=psql-2 \  # update to model number
    PATRONI_KUBERNETES_USE_ENDPOINTS=true \
    PATRONI_NAME=postgresql-k8s-0 \
    PATRONI_REPLICATION_USERNAME=replication \
    PATRONI_SCOPE=patroni-postgresql-k8s \
    PATRONI_SUPERUSER_USERNAME=operator \
      patronictl -c /var/lib/postgresql/data/patroni.yml list

Role should be "Standby leader" and State should be "Running".
@@ -116,6 +124,10 @@ postgresql:
- {{ 'hostssl' if enable_tls else 'host' }} all all 0.0.0.0/0 md5
{%- endif %}
- {{ 'hostssl' if enable_tls else 'host' }} replication replication 127.0.0.1/32 md5
- {{ 'hostssl' if enable_tls else 'host' }} replication replication 127.0.0.6/32 md5
Just a question: why do we need to add the 127.0.0.6 IP?

Comment on lines 214 to 220
# This unit is the leader, generate a new configuration and leave.
# There is nothing to do for the leader.
for attempt in Retrying(stop=stop_after_attempt(5), wait=wait_fixed(3)):
    with attempt:
        self.container.stop(self.charm._postgresql_service)
        self.charm.update_config()
        self.container.start(self.charm._postgresql_service)
Could you improve the comment a little? It is not clear why we need to restart the primary.

# Store current data in a ZIP file, clean folder and generate configuration.
logger.info("Creating backup of pgdata folder")
self.container.exec(
    f"tar -zcf /var/lib/postgresql/data/pgdata-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.zip /var/lib/postgresql/data/pgdata".split()
Maybe a little faster:

Suggested change
f"tar -zcf /var/lib/postgresql/data/pgdata-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.zip /var/lib/postgresql/data/pgdata".split()
f"tar -JcpSf /var/lib/postgresql/data/pgdata-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.tar.xz /var/lib/postgresql/data/pgdata".split()

Comment on lines 378 to 381
primary_relation = self.model.get_relation(ASYNC_PRIMARY_RELATION)
if not primary_relation:
    event.fail("No primary relation")
    return
I need some explanation of why the relation is needed during promotion. Maybe a call would work better.

@@ -1688,7 +1688,6 @@ files = [
{file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"},
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"},

I think this is caused by python-poetry/poetry#6513

It might make sense to make this into a charm lib further down the line.

taurus-forever (Contributor) commented Feb 15, 2024

Code: LGTM. We can merge this to continue from here.

IMHO, we should revise and improve the UI/UX:

  1. Terraform compatibility:
  • create all necessary relations at once (at deployment time)
  • avoid async operations (e.g. execute A, wait for X, execute B).
  2. K.I.S.S.: switching the main cluster must be as easy as possible and must not depend on Juju (or other external tools).
  3. Fool-proofing: disaster recovery is a stressful time. The switching commands must be as simple as possible to avoid human mistakes. Ideally: one command.

Current UX:

> Deploy 2 postgresql clusters: psql-1 (main) and psql-2 (standby):

juju add-model psql-1
juju deploy postgresql-k8s --channel 14/edge/async --trust -n 3 --config profile=testing --base [email protected]
juju offer postgresql-k8s:async-primary async-primary
juju offer postgresql-k8s:database psql1database

juju add-model psql-2
juju deploy postgresql-k8s --channel 14/edge/async --trust -n 3 --config profile=testing --base [email protected]
# (skip this, historical cmd) juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --trust -n 3 --config profile=testing --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:899f5455fa4557b7060990880fb0ae1fa3b21c0ab2e72ad863cc16d0b3f25fee
juju offer postgresql-k8s:async-primary async-primary # TODO: request --model option for `juju offer`
juju offer postgresql-k8s:database psql2database

juju consume psql-2.async-primary -m psql-1
juju consume psql-1.async-primary -m psql-2

> Replicate psql-1 (main) => psql-2 (standby):
juju relate postgresql-k8s:async-replica async-primary -m psql-2
juju run postgresql-k8s/leader promote-standby-cluster -m psql-1

> Add test app:
juju add-model app
juju deploy postgresql-test-app -m app
juju consume psql-1.psql1database -m app
juju consume psql-2.psql2database -m app
juju relate postgresql-test-app:first-database psql1database
juju run postgresql-test-app/leader start-continuous-writes

> Switch psql-1 -> 2

juju remove-relation postgresql-test-app:first-database psql1database -m app

juju remove-relation postgresql-k8s async-primary      -m psql-2
juju relate postgresql-k8s:async-replica async-primary -m psql-1
juju run postgresql-k8s/leader promote-standby-cluster -m psql-2

juju relate postgresql-test-app:first-database psql2database -m app


> Restore/switch psql-2 -> 1

juju remove-relation postgresql-test-app:first-database psql2database -m app

juju remove-relation postgresql-k8s async-primary      -m psql-1
juju relate postgresql-k8s:async-replica async-primary -m psql-2
juju run postgresql-k8s/leader promote-standby-cluster -m psql-1

juju relate postgresql-test-app:first-database psql1database -m app

Significant UX issues above:

  • many relate/remove-relation calls => time + dependency on Juju at the DR moment
  • constant jumping between models => human mistakes under time pressure
  • the application's relation must be removed to use the new PRIMARY => we need pgbouncer to complete async support

Propose UI/UX (let's forget about test app and pgbouncer here for simplicity):

> Deploy 2 postgresql clusters: psql-1 (main) and psql-2 (standby):

juju add-model psql-1
juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --trust -n 3 --config profile=testing --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:899f5455fa4557b7060990880fb0ae1fa3b21c0ab2e72ad863cc16d0b3f25fee
juju offer postgresql-k8s:async-standby async-standby

juju add-model psql-2
juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --trust -n 3 --config profile=testing --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:899f5455fa4557b7060990880fb0ae1fa3b21c0ab2e72ad863cc16d0b3f25fee
juju offer postgresql-k8s:async-standby async-standby

juju consume psql-2.async-standby -m psql-1
juju consume psql-1.async-standby -m psql-2
juju relate postgresql-k8s:async-primary async-standby -m psql-1
juju relate postgresql-k8s:async-primary async-standby -m psql-2
  • we deployed two clusters, both offering themselves as standby, and they can stay in this position forever.
  • we created all relations upfront, and all commands above are terraform compatible!
> Switch psql-1 -> 2

juju run postgresql-k8s/leader demote-cluster-leader  -m psql-1
juju run postgresql-k8s/leader promote-cluster-leader -m psql-2

> Restore/switch psql-2 -> 1 (when 2 is lost completely, cannot demote)

juju run postgresql-k8s/leader promote-cluster-as-leader -m psql-1
> ERROR: another leader `psql-2.postgresql-k8s` is a known leader, demote it first!

juju run postgresql-k8s/leader promote-cluster-as-leader -m psql-1 --force-promotion
> WARNING: another leader `psql-2.postgresql-k8s` is a known leader, promoting us anyway due to '--force-promotion' ...
  • one command => one action => one result
  • clearly defines which model will be the leader: -m psql-1 => leader psql-1
  • the action name is self-explanatory, as promote-standby-cluster can be read as "promote as standby" while it actually makes a new leader here.
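The fool-protection rule proposed above (refuse promotion while another leader is known, unless forced) can be sketched as a small guard; the function name and messages are illustrative, not an implemented action:

```python
def check_promotion(known_leader, own_cluster, force=False):
    """Hypothetical guard for a promote-cluster-as-leader action:
    refuse when a different leader is still known, unless forced."""
    if known_leader and known_leader != own_cluster:
        if not force:
            return (False, f"ERROR: another leader `{known_leader}` is a known leader, demote it first!")
        return (True, f"WARNING: `{known_leader}` is a known leader, promoting anyway due to '--force-promotion'")
    return (True, "")
```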

Please check this and consider it as a future UX improvement => branch a ticket. Thanks!

taurus-forever (Contributor) commented:
@marceloneppel I have re-tested the last commit (on Juju 3.4.0) using the UX I posted above; the first switchover psql-1 => psql-2 got stuck on the psql-1 side in the reinitialising replica state (no changes over several days).

It could be a Juju 3.4 issue, but no Juju issues were noticed there. BTW, Juju 3.4 has cross-model secrets fixed => a reason to test there (the test app can be in a 3rd model, app). Can you re-check it from your side?

marceloneppel (Member, Author) commented:
> @marceloneppel I have re-tested the last commit (on Juju 3.4.0) using the UX I posted above; the first switchover psql-1 => psql-2 got stuck on the psql-1 side in the reinitialising replica state (no changes over several days).
>
> It could be a Juju 3.4 issue, but no Juju issues were noticed there. BTW, Juju 3.4 has cross-model secrets fixed => a reason to test there (the test app can be in a 3rd model, app). Can you re-check it from your side?

Hi @taurus-forever! I fixed the issue that caused a unit to become stuck in the reinitialising replica state. Can you test it again? I already rebuilt the charm based on the last commit of this PR and uploaded it into the 14/edge/async channel.

marceloneppel (Member, Author) commented:

It was superseded by #447.
