This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Remove whole table locks on push rule add/delete #16051

Merged: 11 commits into matrix-org:develop, Nov 13, 2023

Conversation

Fizzadar
Contributor

@Fizzadar Fizzadar commented Aug 2, 2023

The statements are already executed within a transaction, so a table-level lock is unnecessary.

See: #16053

Signed-off-by: Nick @ Beeper (@Fizzadar)

Pull Request Checklist

@Fizzadar Fizzadar marked this pull request as ready for review August 2, 2023 14:27
@Fizzadar Fizzadar requested a review from a team as a code owner August 2, 2023 14:27
@clokep
Member

clokep commented Aug 3, 2023

For reference, these were added in #578.

Comment on lines -399 to -454
# Lock the table since otherwise we'll have annoying races between the
# SELECT here and the UPSERT below.
self.database_engine.lock_table(txn, "push_rules")
Member

I wanted to describe the race to ensure we have a shared understanding. Given two rules, rule_A and rule_B, with priorities 0 and 1 respectively, suppose you make two requests:

  1. Add a new push rule (rule_X) after rule_A, this should make rule_B a priority of 2; rule_X priority 1; and leave rule_A at 0.
  2. Add a new push rule (rule_Y) after rule_A, this should make rule_B a priority of 3; rule_X priority 2; rule_Y priority 1; and leave rule_A at 0.

This is in a transaction (I assume running at READ COMMITTED), so what happens if these race?
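To make the race concrete, here is a minimal sketch of the add-rule-after logic being discussed, written against a stand-in push_rules table of (rule_id, priority) using psycopg2; the schema and statements are illustrative assumptions, not Synapse's actual queries.

# Minimal sketch of the unlocked add-rule logic -- illustrative schema and
# queries only, not Synapse's real push_rules table or code.
import psycopg2

def add_rule_after(conn, anchor_rule_id: str, new_rule_id: str) -> None:
    """Insert new_rule_id directly after anchor_rule_id, bumping later rules."""
    with conn, conn.cursor() as cur:  # one transaction per call
        # Read the anchor's priority (rule_A has priority 0 in the example).
        cur.execute(
            "SELECT priority FROM push_rules WHERE rule_id = %s",
            (anchor_rule_id,),
        )
        (anchor_priority,) = cur.fetchone()
        # Shift every rule above the anchor up by one to make room...
        cur.execute(
            "UPDATE push_rules SET priority = priority + 1 WHERE priority > %s",
            (anchor_priority,),
        )
        # ...then slot the new rule in directly after the anchor.
        cur.execute(
            "INSERT INTO push_rules (rule_id, priority) VALUES (%s, %s)",
            (new_rule_id, anchor_priority + 1),
        )

# If two clients call add_rule_after(conn, "rule_A", "rule_X") and
# add_rule_after(conn, "rule_A", "rule_Y") concurrently and nothing forces the
# transactions to conflict, both can read priority 0 for rule_A and both can
# try to insert their new rule at priority 1.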

Contributor Author

I believe the winner will be applied and the second transaction will be replayed using the newly updated data, per the PostgreSQL docs (I've added some breaks to make it easier to read; my brain hurts!):

UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress).

If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits .... it will attempt to apply its operation to the updated version of the row.

The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row.

If my understanding is correct, READ COMMITTED will effectively correct the issue via the replay.
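As a rough, hedged illustration of the quoted behaviour (same stand-in schema as the sketch above), the waiting happens on the priority-bump UPDATE:

# Rough illustration of READ COMMITTED behaviour on the bump UPDATE, using two
# connections (hypothetical DSN); run the two halves from separate sessions.
import psycopg2

conn1 = psycopg2.connect("dbname=test")
conn2 = psycopg2.connect("dbname=test")
cur1 = conn1.cursor()
cur2 = conn2.cursor()

# Session 1: bump the priorities, but don't commit yet.
cur1.execute("UPDATE push_rules SET priority = priority + 1 WHERE priority > 0")

# Session 2: the same UPDATE targets rows session 1 has locked, so under
# READ COMMITTED it blocks here until session 1 commits or rolls back.
# (Run it from a second process/thread -- in a single script it would hang.)
# cur2.execute("UPDATE push_rules SET priority = priority + 1 WHERE priority > 0")

conn1.commit()
# After the commit, session 2's UPDATE re-checks its WHERE clause against the
# updated rows and applies on top of them -- the "replay" described above
# covers the UPDATE, though plain SELECTs are a separate question (see the
# later discussion in this thread).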

Contributor Author
@Fizzadar Fizzadar Aug 3, 2023

That said! Synapse's default is actually one better, REPEATABLE READ, in which case things are much simpler:

UPDATE, DELETE, MERGE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the transaction start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the repeatable read transaction will wait for the first updating transaction to commit ... if the first updater commits (and actually updated or deleted the row, not just locked it) then the repeatable read transaction will be rolled back with the message ERROR: could not serialize access due to concurrent update

Which Synapse automatically retries, replaying the transaction as expected.
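For reference, the retry referred to here has roughly this shape (a generic sketch assuming psycopg2's error codes; it is not Synapse's actual retry loop):

# Generic retry-on-serialization-failure loop -- an assumed sketch, not
# Synapse's actual transaction-retry implementation.
import psycopg2
import psycopg2.errorcodes

def run_with_retries(conn, txn_fn, max_attempts: int = 5):
    for _ in range(max_attempts):
        try:
            with conn, conn.cursor() as cur:  # commits on success, rolls back on error
                return txn_fn(cur)
        except psycopg2.OperationalError as e:
            # "could not serialize access due to concurrent update" is reported
            # with SQLSTATE 40001 (serialization_failure) under REPEATABLE READ.
            if e.pgcode != psycopg2.errorcodes.SERIALIZATION_FAILURE:
                raise
            # The whole transaction was rolled back; loop round and re-run it
            # so it sees the other transaction's committed changes.
    raise RuntimeError("transaction repeatedly failed with serialization errors")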

Contributor Author

Final note - there is an issue somewhere about switching to READ COMMITTED as the default, but it seems that would also suffice here in terms of the potential race conditions.

Member

Thanks! I believe that you're correct (but my brain also hurts)!

Member

I'm not sure I agree here. As things stand I don't see why we'd necessarily replay the transactions, as we may not have updated/deleted/locked any of the rows we SELECT against (and a newly inserted row that would have matched one of our SELECTs isn't detected by Postgres except under SERIALIZABLE isolation, AIUI).

I think what you want here is to run the selects with a FOR SHARE so that they do conflict with each other?

Member

I guess most of the time the UPDATE will conflict, but if we have two requests to add a rule to the top of the push rules, those transactions should conflict but won't?

Contributor Author

Yeah that makes sense, will add FOR SHARE in 👍

Member

Wait, we probably need a FOR UPDATE instead, as we need the SELECT statements to conflict with each other and FOR SHARE won't do that: https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-ROWS
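Per the linked docs, FOR SHARE row locks don't conflict with each other (only with FOR UPDATE and with writes), whereas FOR UPDATE conflicts with both, so only FOR UPDATE makes the two SELECTs block each other. A sketch of the two forms, reusing the stand-in placeholders from the sketches above:

# FOR SHARE row locks do not conflict with each other, so two concurrent
# transactions can both run this against the same rows and carry on:
SELECT_RULES_SHARED = """
    SELECT * FROM push_rules
    WHERE user_name = %s AND priority_class = %s
    FOR SHARE
"""

# FOR UPDATE conflicts with FOR SHARE and with other FOR UPDATE locks, so the
# second transaction blocks on this SELECT until the first finishes -- the
# conflict this review comment is asking for:
SELECT_RULES_EXCLUSIVE = """
    SELECT * FROM push_rules
    WHERE user_name = %s AND priority_class = %s
    FOR UPDATE
"""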

@clokep clokep requested a review from a team August 3, 2023 17:42
Member
@erikjohnston erikjohnston left a comment

I think this needs a FOR SHARE added to the selects?

@Fizzadar
Contributor Author

I think this is now correct; there are a couple of sytest failures, but they look unrelated?

@Fizzadar Fizzadar requested a review from erikjohnston August 11, 2023 17:24
SELECT * FROM push_rules
WHERE user_name = ? and priority_class = ?
FOR SHARE
"""
Member

Might be worth doing a txn.execute for this SQL? 😆
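i.e. the SQL string in the diff above is built but apparently never run at this point; the missing call would look something like the following (variable names are guesses from the query's WHERE clause):

# Assumed shape of the missing call -- user_id / priority_class are guesses
# based on the WHERE clause in the snippet above.
txn.execute(sql, (user_id, priority_class))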

Member

But also, you probably want to do this as part of the COUNT(*) query below?
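One wrinkle with folding the lock into the COUNT(*) query: PostgreSQL rejects FOR UPDATE / FOR SHARE applied directly to an aggregate query, so the locking clause has to live in a subquery or in a separate locking SELECT run first. A hedged sketch of the subquery form (not necessarily what the PR ended up with):

# Postgres raises "FOR UPDATE is not allowed with aggregate functions" if the
# locking clause is attached to the COUNT(*) itself, so lock the rows in a
# subquery instead (or keep a separate SELECT ... FOR UPDATE before the count).
COUNT_AND_LOCK_RULES = """
    SELECT COUNT(*) FROM (
        SELECT 1 FROM push_rules
        WHERE user_name = %s AND priority_class = %s
        FOR UPDATE
    ) AS locked_rows
"""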

Contributor Author

Yep good spot - 2ec17da

Member

Still missing a txn.execute here?

Contributor Author

Gah, I missed it after rewriting it again in 2ec17da; fix: 376313e

@clokep clokep added the X-Awaiting-Changes A contributed PR which needs changes and re-review before it can be merged label Sep 25, 2023
@Fizzadar
Contributor Author

Fizzadar commented Nov 5, 2023

Took a while, finally got round to fixing this up!

Member
@erikjohnston erikjohnston left a comment

Thanks!

@Fizzadar can you sign off please?

@Fizzadar
Contributor Author

@erikjohnston updated PR comment, I think I had the wrong format for the signoff check!

@erikjohnston
Member

> @erikjohnston updated PR comment, I think I had the wrong format for the signoff check!

Oh, heh, I checked the comment and just didn't see it!

@erikjohnston
Member

@Fizzadar Oh, there appear to be merge conflicts somehow? Can you merge in develop?

@clokep clokep removed their request for review November 13, 2023 13:54
# Conflicts:
#	synapse/storage/databases/main/push_rule.py
@Fizzadar
Contributor Author

@erikjohnston merged!

@erikjohnston erikjohnston merged commit 0e36a57 into matrix-org:develop Nov 13, 2023
38 checks passed
@Fizzadar Fizzadar deleted the remove-push-rule-table-locks branch November 13, 2023 18:21