[DPE-2897] Cross-region async replication #447
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
…ation Signed-off-by: Marcelo Henrique Neppel <[email protected]>
I have finished the initial testing phase. Let's handle the remaining UX improvements as follow-ups.
filename = f"{POSTGRESQL_DATA_PATH}-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.tar.gz"
self.container.exec(
    f"tar -zcf {filename} {POSTGRESQL_DATA_PATH}".split()
).wait_output()
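For reference, the archive name built in the snippet above can be reproduced outside the charm. This is only a minimal sketch: the value given to POSTGRESQL_DATA_PATH and the helper name backup_filename are assumptions made for illustration and are not taken from the PR.

```python
from datetime import datetime

# Assumed value for illustration only; the charm defines its own constant.
POSTGRESQL_DATA_PATH = "/var/lib/postgresql/data/pgdata"


def backup_filename(now: datetime, data_path: str = POSTGRESQL_DATA_PATH) -> str:
    """Build a timestamped archive name the same way the diff does.

    str(datetime) looks like "2024-04-01 12:34:56[.ffffff]"; spaces and
    colons are replaced with dashes so the name is safe to use as a path.
    """
    timestamp = str(now).replace(" ", "-").replace(":", "-")
    return f"{data_path}-{timestamp}.tar.gz"


if __name__ == "__main__":
    print(backup_filename(datetime(2024, 4, 1, 12, 34, 56)))
```

Note that str(datetime.now()) includes microseconds whenever they are non-zero, so the resulting names vary in length. An explicit strftime format would be one way to make them uniform.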
Please create follow-up tickets:
- pack the archive in the background (otherwise promotion to standby will take a LOT of time if the local DB is huge)
- warn users about existing backups that can be cleaned up (free disk space topic); goss is a good match here
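To illustrate the first follow-up, here is a minimal sketch of packing a directory in the background so the caller is not blocked. It uses only the Python standard library (tarfile and threading) rather than the container exec call from the diff, and the helper name pack_in_background and the temporary paths are hypothetical. In a charm, hook processes are short-lived, so a real implementation would likely need a different mechanism than an in-process thread.

```python
import tarfile
import tempfile
import threading
from pathlib import Path


def pack_in_background(data_dir: Path, archive: Path) -> threading.Thread:
    """Start compressing data_dir into archive without blocking the caller."""

    def _pack() -> None:
        # Write a gzip-compressed tarball containing the whole directory.
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(data_dir, arcname=data_dir.name)

    worker = threading.Thread(target=_pack, daemon=True)
    worker.start()
    return worker


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a PostgreSQL data directory.
        data_dir = Path(tmp) / "pgdata"
        data_dir.mkdir()
        (data_dir / "PG_VERSION").write_text("14\n")
        archive = Path(tmp) / "pgdata-backup.tar.gz"

        worker = pack_in_background(data_dir, archive)
        # The caller is free to continue with other work here.
        worker.join()
        print(archive.exists())
```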
…ation Signed-off-by: Marcelo Henrique Neppel <[email protected]>
poetry.lock
Outdated
@@ -1645,7 +1645,6 @@ files = [
{file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"},
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"},
I think you can avoid those flyby changes by running poetry cache clear PyPI --all and then poetry lock --no-update
Thanks. It did the trick.
* Add async replication implementation
  Signed-off-by: Marcelo Henrique Neppel <[email protected]>
* Backup standby pgdata folder
  Signed-off-by: Marcelo Henrique Neppel <[email protected]>
* Improve comments and logs
  Signed-off-by: Marcelo Henrique Neppel <[email protected]>
* Remove unused constant
  Signed-off-by: Marcelo Henrique Neppel <[email protected]>
* Remove warning log call and add optional type hint
  Signed-off-by: Marcelo Henrique Neppel <[email protected]>
* Revert poetry.lock
  Signed-off-by: Marcelo Henrique Neppel <[email protected]>
* Revert poetry.lock
  Signed-off-by: Marcelo Henrique Neppel <[email protected]>
---------
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Issue
It's not possible to replicate data between regions.
Solution
Implement cross-region async replication. This PR is a rebranded and more stable version of #368.
With this PR, it's no longer necessary to remove the relation and relate again when a switchover is needed.
Also, the relation names can easily be changed to others, for example cluster-one and cluster-two, to avoid confusing users.
Important changes:
- src/relations/async_replication.py contains the logic to make one cluster the primary and the other the standby. To make the standby cluster follow the primary cluster, the candidate for the standby cluster needs to be restarted.
- _on_async_relation_changed takes care of restarting the databases of the standby cluster units so that they replicate data from the primary cluster.
- The 127.0.0.6/32 address added to the Patroni configuration file is needed to allow Envoy to make different clusters communicate when using Istio.
Password updates will be implemented in another PR, as this one is already huge.
If the standby cluster has its relation removed, it goes into read-only mode and can be promoted later to a normal cluster through the promote-cluster action.
How to deploy: https://discourse.charmhub.io/t/charmed-postgresql-k8s-deploy-async-replication/13895
How to trigger a switchover: https://discourse.charmhub.io/t/charmed-postgresql-k8s-deploy-async-replication/13895
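The lifecycle described above (a standby that loses its relation becomes read-only and can later be promoted) can be pictured as a small state model. The sketch below is purely illustrative: the ClusterMode and Cluster names are hypothetical, none of this code comes from src/relations/async_replication.py, and treating a promoted cluster as a primary is an assumption about what "normal cluster" means here.

```python
from enum import Enum


class ClusterMode(Enum):
    PRIMARY = "primary"
    STANDBY = "standby"
    READ_ONLY = "read-only"


class Cluster:
    """Toy model of the cluster modes mentioned in the PR description."""

    def __init__(self, name: str, mode: ClusterMode) -> None:
        self.name = name
        self.mode = mode

    def relation_removed(self) -> None:
        # A standby that loses its relation falls back to read-only mode.
        if self.mode is ClusterMode.STANDBY:
            self.mode = ClusterMode.READ_ONLY

    def promote(self) -> None:
        # Stand-in for the promote-cluster action (assumed to yield a primary).
        if self.mode is ClusterMode.PRIMARY:
            raise ValueError(f"{self.name} is already the primary")
        self.mode = ClusterMode.PRIMARY


if __name__ == "__main__":
    standby = Cluster("cluster-two", ClusterMode.STANDBY)
    standby.relation_removed()
    print(standby.mode.value)
    standby.promote()
    print(standby.mode.value)
```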
Additional instructions:
Integration tests: #448