Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Yield during large v2 state res. #7735

Merged
merged 5 commits into from
Jun 24, 2020
Merged

Conversation

erikjohnston
Copy link
Member

State res v2 across large data sets can be very CPU intensive, and if
all the relevant events are in the cache the algorithm will run from
start to finish within a single reactor tick. This can result in
blocking the reactor tick for several seconds, which can have major
repercussions on other requests.

To fix this we simply add the occaisonal sleep(0) during iterations to
yield execution until the next reactor tick. The aim is to only do this
for large data sets so that we don't impact otherwise quick resolutions.

State res v2 across large data sets can be very CPU intensive, and if
all the relevant events are in the cache the algorithm will run from
start to finish within a single reactor tick. This can result in
blocking the reactor tick for several seconds, which can have major
repercussions on other requests.

To fix this we simply add the occaisonal `sleep(0)` during iterations to
yield execution until the next reactor tick. The aim is to only do this
for large data sets so that we don't impact otherwise quick resolutions.
@erikjohnston erikjohnston requested a review from a team June 23, 2020 14:42
synapse/state/v2.py Outdated Show resolved Hide resolved
synapse/state/v2.py Outdated Show resolved Hide resolved
@erikjohnston erikjohnston requested a review from a team June 23, 2020 16:57
yield _add_event_and_auth_chain_to_graph(
graph, room_id, event_id, event_map, state_res_store, auth_diff
)

# We yield occasionally when we're working with large data sets to
# ensure that we don't block the reactor loop for too long.
if idx != 0 and idx % _YIELD_AFTER_ITERATIONS == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea: instead of checking idx != 0, why not make enumerate start at 1? The first yield would be 1 iteration early, but that's probably an acceptable tradeoff for avoiding a comparison on every iteration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, didn't realise that was a thing enumerate supports

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sans the improvement mentioned above, this lgtm

@erikjohnston erikjohnston merged commit 0e0a281 into develop Jun 24, 2020
@erikjohnston erikjohnston deleted the erikj/yield_during_state_res_v2 branch June 24, 2020 17:48
babolivier pushed a commit that referenced this pull request Sep 1, 2021
* commit 'dc80a0762':
  1.16.0rc1
  Back out MSC2625 implementation (#7761)
  Additional configuration options for auto-join rooms (#7763)
  Add some metrics for inbound and outbound federation processing times (#7755)
  Explain the purpose of the "tests" conditional dependency requirement (#7751)
  Add another yield point to state res v2 (#7746)
  Move flake8 to end. Don't exit script on failure (#7738)
  Make tox actions work on Debian 10 (#7703)
  Yield during large v2 state res. (#7735)
  add org.matrix.login.jwt so that m.login.jwt can be deprecated (#7675)
  Set Content-Length for Metrics requests (#7730)
  Sync ignored table names in synapse_port_db to current database schema (#7717)
  Allow local media to be marked as safe from being quarantined. (#7718)
  Convert directory handler to async/await (#7727)
  Speed up state res v2 across large state differences. (#7725)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants