
[DPE-2897] Cross-region async replication #368

Closed · wants to merge 21 commits

Conversation

marceloneppel (Member) commented Jan 22, 2024

Issue

Connecting two PostgreSQL clusters through async replication in a cross-region setup is currently impossible.

Solution

Two endpoints were created to relate the Juju applications of the two PostgreSQL clusters (async-primary and async-replica). The relation exchanges information about the topology of the clusters (IP addresses and the endpoint of one of the sync_standby members of the main cluster - the cluster that will replicate its data to the other cluster, the secondary cluster) and also about the secrets from the main cluster.
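As a rough illustration (not the charm's actual API; the helper name and databag keys are assumptions), the data exchanged over the async-primary relation can be pictured as a small payload of topology plus secret contents:

```python
import json


def build_async_primary_databag(sync_standby_endpoint, member_ips, secrets):
    """Hypothetical sketch: assemble the relation data the main cluster
    shares with the secondary cluster (keys are illustrative, not the
    charm's real schema)."""
    return {
        # Endpoint of one of the sync_standby members of the main cluster.
        "endpoint": sync_standby_endpoint,
        # IP addresses of the main cluster members.
        "members": json.dumps(sorted(member_ips)),
        # Secrets (e.g. replication credentials) from the main cluster.
        "secrets": json.dumps(secrets),
    }


databag = build_async_primary_databag(
    "postgresql-k8s-1.postgresql-k8s-endpoints",
    ["10.1.0.7", "10.1.0.5"],
    {"replication-password": "..."},
)
```

When the main cluster's topology changes, the charm would republish this payload so the secondary cluster can follow the new endpoint.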

To promote one cluster to be the main cluster, run the promote-standby-cluster action, which enables replication between the clusters.

Also, if anything changes in the main cluster topology (like the sync_standby crashing), the endpoint of the main cluster is updated in the secondary cluster to keep replication working. If something happens with the standby_leader in the secondary cluster, another member takes that role, and the replication continues to work.

src/relations/async_replication.py contains all that logic, including the sharing of the IP addresses of the secondary cluster, which are needed to enable replication connections from that cluster's units to the main cluster through the pg_hba rules.
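A minimal sketch of how such pg_hba rules could be derived from the secondary cluster's unit IPs (the helper is hypothetical; the rule shape follows the pg_hba template in this PR):

```python
def replication_hba_rules(unit_ips, enable_tls=False):
    """Hypothetical helper: one pg_hba rule per secondary-cluster unit,
    allowing the replication user to connect (hostssl when TLS is on)."""
    conn_type = "hostssl" if enable_tls else "host"
    return [
        f"{conn_type} replication replication {ip}/32 md5"
        for ip in sorted(unit_ips)
    ]
```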

All units are stopped so that the cluster info can be deleted from the Patroni K8s endpoint, allowing a new cluster to start replicating from the main cluster as a standby. This requires coordinating the units so that the new cluster starts only after all units have stopped and the cluster information in the K8s endpoint is empty. That coordination logic lives in src/coordinator_ops.py.
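The coordination rule can be sketched as a pure predicate (names are assumptions, not the actual src/coordinator_ops.py API): the new standby cluster may bootstrap only once every unit reports itself stopped and Patroni's cluster info is gone from the K8s endpoint (Patroni keeps it in endpoint annotations, e.g. the initialize key):

```python
def may_bootstrap_standby(unit_states, endpoint_annotations):
    """Hypothetical sketch of the coordination check: start the new
    cluster only after all units stopped and the Patroni cluster info
    was deleted from the K8s endpoint annotations."""
    all_stopped = all(state == "stopped" for state in unit_states.values())
    info_cleared = "initialize" not in endpoint_annotations
    return all_stopped and info_cleared
```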

Integration tests are implemented at #369.
Unit tests should be implemented in a separate PR too.

phvalguima and others added 7 commits October 24, 2023 01:18
Deploy two models, each with 1x postgresql
Then, configure async replication as follows:
  $ juju switch psql-1
  $ juju offer postgresql-k8s:async-primary async-primary  # async-primary is the relation provided by the leader
  $ juju switch psql-2
  $ juju consume admin/psql-1.async-primary  # consume the primary relation
  $ juju relate postgresql-k8s:async-replica async-primary  # Both applications are now related; postgresql-k8s in model psql-2 is the standby-leader

Now, run the action:
  $ juju run -m psql-1 postgresql-k8s/0 promote-standby-cluster  # move postgresql-k8s in model psql-1 to be the leader cluster

Run the following command to check status:
  $ PATRONI_KUBERNETES_LABELS='{application: patroni, cluster-name: patroni-postgresql-k8s}' \
    PATRONI_KUBERNETES_NAMESPACE=psql-2 \  # update to model number
    PATRONI_KUBERNETES_USE_ENDPOINTS=true \
    PATRONI_NAME=postgresql-k8s-0 \
    PATRONI_REPLICATION_USERNAME=replication \
    PATRONI_SCOPE=patroni-postgresql-k8s \
    PATRONI_SUPERUSER_USERNAME=operator \
      patronictl -c /var/lib/postgresql/data/patroni.yml list

Role should be "Standby leader" and State should be "Running".
@@ -116,6 +124,10 @@ postgresql:
- {{ 'hostssl' if enable_tls else 'host' }} all all 0.0.0.0/0 md5
{%- endif %}
- {{ 'hostssl' if enable_tls else 'host' }} replication replication 127.0.0.1/32 md5
- {{ 'hostssl' if enable_tls else 'host' }} replication replication 127.0.0.6/32 md5
Just a question: why do we need to add the 127.0.0.6 IP?

Comment on lines 214 to 220
# This unit is the leader, generate a new configuration and leave.
# There is nothing to do for the leader.
for attempt in Retrying(stop=stop_after_attempt(5), wait=wait_fixed(3)):
    with attempt:
        self.container.stop(self.charm._postgresql_service)
        self.charm.update_config()
        self.container.start(self.charm._postgresql_service)
Could you improve the comment a little? It is not clear why we need to restart the primary.

# Store current data in a ZIP file, clean folder and generate configuration.
logger.info("Creating backup of pgdata folder")
self.container.exec(
    f"tar -zcf /var/lib/postgresql/data/pgdata-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.zip /var/lib/postgresql/data/pgdata".split()
Maybe a little faster:

Suggested change
f"tar -zcf /var/lib/postgresql/data/pgdata-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.zip /var/lib/postgresql/data/pgdata".split()
f"tar -JcpSf /var/lib/postgresql/data/pgdata-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.tar.xz /var/lib/postgresql/data/pgdata".split()

Comment on lines 378 to 381
primary_relation = self.model.get_relation(ASYNC_PRIMARY_RELATION)
if not primary_relation:
    event.fail("No primary relation")
    return
I need some explanation of why the relation is needed during promotion. Maybe a call would work better.

@@ -1688,7 +1688,6 @@ files = [
{file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"},
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"},

I think this is caused by python-poetry/poetry#6513

It might make sense to make this into a charm lib further down the line.

taurus-forever (Contributor) commented Feb 15, 2024

Code: LGTM. We can merge this to continue from here.

IMHO, we should revise and improve the UI/UX:

  1. Terraform compatibility:
  • create all necessary relations at once (at deployment time)
  • avoid async operations (e.g. execute A, wait for X, execute B).
  2. K.I.S.S.: switching the main cluster must be as easy as possible and must not depend on Juju (or other external tools).
  3. Fool-proofing: disaster recovery is a stressful time. The switching commands must be as simple as possible to avoid human mistakes. Ideally: one command.

Current UX:

> Deploy 2 postgresql clusters: psql-1 (main) and psql-2 (standby):

juju add-model psql-1
juju deploy postgresql-k8s --channel 14/edge/async --trust -n 3 --config profile=testing --base [email protected]
juju offer postgresql-k8s:async-primary async-primary
juju offer postgresql-k8s:database psql1database

juju add-model psql-2
juju deploy postgresql-k8s --channel 14/edge/async --trust -n 3 --config profile=testing --base [email protected]
# (skip this, historical cmd) juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --trust -n 3 --config profile=testing --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:899f5455fa4557b7060990880fb0ae1fa3b21c0ab2e72ad863cc16d0b3f25fee
juju offer postgresql-k8s:async-primary async-primary # TODO: request --model option for `juju offer`
juju offer postgresql-k8s:database psql2database

juju consume psql-2.async-primary -m psql-1
juju consume psql-1.async-primary -m psql-2

> Replicate psql-1 (main) => psql-2 (standby):
juju relate postgresql-k8s:async-replica async-primary -m psql-2
juju run postgresql-k8s/leader promote-standby-cluster -m psql-1

> Add test app:
juju add-model app
juju deploy postgresql-test-app -m app
juju consume psql-1.psql1database -m app
juju consume psql-2.psql2database -m app
juju relate postgresql-test-app:first-database psql1database
juju run postgresql-test-app/leader start-continuous-writes

> Switch psql-1 -> 2

juju remove-relation postgresql-test-app:first-database psql1database -m app

juju remove-relation postgresql-k8s async-primary      -m psql-2
juju relate postgresql-k8s:async-replica async-primary -m psql-1
juju run postgresql-k8s/leader promote-standby-cluster -m psql-2

juju relate postgresql-test-app:first-database psql2database -m app


> Restore/switch psql-2 -> 1

juju remove-relation postgresql-test-app:first-database psql2database -m app

juju remove-relation postgresql-k8s async-primary      -m psql-1
juju relate postgresql-k8s:async-replica async-primary -m psql-2
juju run postgresql-k8s/leader promote-standby-cluster -m psql-1

juju relate postgresql-test-app:first-database psql1database -m app

Significant UX issues above:

  • many relate/remove-relation calls => time + dependency on Juju at the DR moment
  • constant jumping between models => human mistakes under time pressure
  • the application's relation must be removed to use the new PRIMARY => we need pgbouncer to complete async support

Propose UI/UX (let's forget about test app and pgbouncer here for simplicity):

> Deploy 2 postgresql clusters: psql-1 (main) and psql-2 (standby):

juju add-model psql-1
juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --trust -n 3 --config profile=testing --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:899f5455fa4557b7060990880fb0ae1fa3b21c0ab2e72ad863cc16d0b3f25fee
juju offer postgresql-k8s:async-standby async-standby

juju add-model psql-2
juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --trust -n 3 --config profile=testing --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:899f5455fa4557b7060990880fb0ae1fa3b21c0ab2e72ad863cc16d0b3f25fee
juju offer postgresql-k8s:async-standby async-standby

juju consume psql-2.async-standby -m psql-1
juju consume psql-1.async-standby -m psql-2
juju relate postgresql-k8s:async-primary async-standby -m psql-1
juju relate postgresql-k8s:async-primary async-standby -m psql-2
  • we deployed two clusters, both offering themselves as standby, and they can stay in this position forever.
  • we created all relations upfront, and all commands above are terraform compatible!
> Switch psql-1 -> 2

juju run postgresql-k8s/leader demote-cluster-leader  -m psql-1
juju run postgresql-k8s/leader promote-cluster-leader -m psql-2

> Restore/switch psql-2 -> 1 (when 2 is lost completely, cannot demote)

juju run postgresql-k8s/leader promote-cluster-as-leader -m psql-1
> ERROR: another leader `psql-2.postgresql-k8s` is a known leader, demote it first!

juju run postgresql-k8s/leader promote-cluster-as-leader -m psql-1 --force-promotion
> WARNING: another leader `psql-2.postgresql-k8s` is a known leader, promoting us anyway due to '--force-promotion' ...
  • one command => one action => one result
  • clearly defines which model will be the leader: -m psql-1 => leader psql-1
  • the action name is self-explanatory, as promote-standby-cluster can be read as "promote as standby" while it actually makes a new leader here.
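The fool-protection rule proposed above (refuse promotion while another leader is known, unless forced) can be sketched as a small guard; the function name and messages are illustrative, not an implemented action:

```python
def check_promotion(known_leader, own_cluster, force=False):
    """Hypothetical guard for a promote-cluster-as-leader action:
    refuse when a different leader is still known, unless forced."""
    if known_leader and known_leader != own_cluster:
        if not force:
            return (False, f"ERROR: another leader `{known_leader}` is a known leader, demote it first!")
        return (True, f"WARNING: `{known_leader}` is a known leader, promoting anyway due to '--force-promotion'")
    return (True, "")
```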

Please check this and consider it as a future UX improvement => branch a ticket. Thanks!

taurus-forever (Contributor) commented:
@marceloneppel I have re-tested the last commit (on Juju 3.4.0) using the UX I posted above; the first switchover psql-1 => psql-2 got stuck on the psql-1 side in the reinitialising replica state (no changes over several days).

It could be a Juju 3.4 issue, but no Juju issues were noticed there. BTW, Juju 3.4 has cross-model secrets fixed => a reason to test there (the test app can be in a 3rd model, app). Can you re-check it from your side?

marceloneppel (Member, Author) commented:
> @marceloneppel I have re-tested the last commit (on Juju 3.4.0) using the UX I posted above; the first switchover psql-1 => psql-2 got stuck on the psql-1 side in the reinitialising replica state (no changes over several days).
>
> It could be a Juju 3.4 issue, but no Juju issues were noticed there. BTW, Juju 3.4 has cross-model secrets fixed => a reason to test there (the test app can be in a 3rd model, app). Can you re-check it from your side?

Hi @taurus-forever! I fixed the issue that caused a unit to become stuck in the reinitialising replica state. Can you test it again? I already rebuilt the charm based on the last commit of this PR and uploaded it into the 14/edge/async channel.

marceloneppel (Member, Author) commented:

It was superseded by #447.
