Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock DB Upgrades #160

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Lock DB Upgrades #160

wants to merge 2 commits into from

Conversation

snadrus
Copy link
Contributor

@snadrus snadrus commented Aug 20, 2024

Allow max of 1 db upgrader at a time.

@snadrus snadrus requested a review from magik6k August 20, 2024 16:29
@snadrus
Copy link
Contributor Author

snadrus commented Aug 20, 2024

Well no form of db locking is going to work:
github.com/filecoin-project/curio/harmony/harmonydb.(*DB).upgrade.func1.1
/home/circleci/curio/harmony/harmonydb/harmonydb.go:256
- ERROR: ACCESS EXCLUSIVE lock mode not supported yet (SQLSTATE 0A000)

@snadrus
Copy link
Contributor Author

snadrus commented Aug 20, 2024

Is this a known problem @magik6k or a nice-to-have?
The only window of risk is on upgrade when a new schema arrives if 2 machines get there about the same time (single digit milliseconds). I think today it'll crash the "second" process (which should have a restart), then everything should be fine. I could "sleep on error and retry" to avoid that crash. WDYT?

@magik6k
Copy link
Collaborator

magik6k commented Aug 21, 2024

Is this a known problem @magik6k or a nice-to-have?

A problem that we didn't have any case that manifested in actual trouble, but I do foresee it being an issue at some point given that schema changes in YB do tend to take a while and are likely to be a large source of edge-cases given what YB does with postgres.

The only window of risk is on upgrade when a new schema arrives if 2 machines get there about the same time (single digit milliseconds).

It's usually much longer, in my deployments most migrations take 2-10s, depending on what they do

@LexLuthr
Copy link
Contributor

https://yugabytedb.tips/suppress-the-error/ advisory locks are not supported. I found out yesterday the hard way after using them for libp2p table.

@snadrus
Copy link
Contributor Author

snadrus commented Aug 21, 2024

I'd say we shelve this. If it becomes a problem, maybe the locks will arrive then and this code will be a drop-in fix.

@snadrus
Copy link
Contributor Author

snadrus commented Aug 21, 2024

One alternative is to have an upgrade 'task' that gets scheduled, deduped, claimed by those only with that version, and ran that way, but we can't start the task engine with any other task. It sounds more difficult.

@magik6k
Copy link
Collaborator

magik6k commented Oct 30, 2024

I believe I see a good way to do this:

  • Add a separate base_upgrades table
  • When we don't see an entry in the base table for a given upgrade, write the entry to the base_upgrades table
    • If that write conflicts, wait a bit (~10s) and go to step one. Either someone else has applied the update, or it failed so manual intervention is needed anyways
    • If it didn't conflict, we're now responsible for applying the schema, so do that.

@snadrus
Copy link
Contributor Author

snadrus commented Dec 5, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants