Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ledger: add callback to clear state between commitRound retries #6190

Merged
merged 10 commits into from
Dec 10, 2024

Conversation

cce
Copy link
Contributor

@cce cce commented Dec 4, 2024

Summary

Some of our unit tests use an in-memory SQLite DB, rather than file-based SQLite, to make tests run faster. This requires enabling shared-cache mode so multiple goroutines can hold connections to the same in-memory DB. However, in shared cache mode, even read operations require table-level locks.

To handle these lock errors, our dbutil.go wrapper around SQLite transactions (AtomicContext) implements retry logic, where a provided function is retried multiple times when the error is sqlite3.ErrLocked or sqlite3.ErrBusy. Our unit tests that use concurrent connections to the same in-memory SQLite DB often show many retries before successfully committing, due to contention. In regular on-disk operation, shared cache mode is not enabled, and these errors do not occur.

The catchpointtracker's commitRound() function flushes a batch of round updates to a merkle trie. Unfortunately commitRound() cannot be safely retried inside an AtomicContext, because it updates the trie's SQLite table as well as an in-memory cache. When retries occur, the DB transaction are rolled back (along with other tracker's committed), but the in-memory data is not rolled back. This extra callback allows the catchpointtracker to clear state between retries of commitRound().

Related: #5568

Test Plan

This should only impact tests that use in-memory SQLite, used for faster test performance, and make them more reliable. A new test TestCatchpointTrackerFastRoundsDBRetry was added that tries to corrupt the merkle trie was added, and is flaky (fails most of the time depending on timing/luck) without this PR.

@cce cce added the Bug-Fix label Dec 4, 2024
@cce cce requested a review from algorandskiy December 4, 2024 21:30
@gmalouf gmalouf requested review from jannotti and gmalouf December 5, 2024 18:07
Copy link
Contributor

@gmalouf gmalouf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments, looks okay overall.

Copy link

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 17.85714% with 46 lines in your changes missing coverage. Please review.

Project coverage is 51.85%. Comparing base (b7b3e5e) to head (6236a53).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
ledger/store/trackerdb/dualdriver/dualdriver.go 0.00% 16 Missing ⚠️
util/db/dbutil.go 25.00% 9 Missing ⚠️
...edger/store/trackerdb/sqlitedriver/sqlitedriver.go 0.00% 6 Missing ⚠️
ledger/tracker.go 53.84% 6 Missing ⚠️
ledger/catchpointtracker.go 0.00% 5 Missing ⚠️
...ger/store/trackerdb/pebbledbdriver/pebbledriver.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6190      +/-   ##
==========================================
- Coverage   51.88%   51.85%   -0.04%     
==========================================
  Files         639      639              
  Lines       85489    85508      +19     
==========================================
- Hits        44359    44339      -20     
- Misses      38320    38354      +34     
- Partials     2810     2815       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ledger/tracker.go Outdated Show resolved Hide resolved
@cce cce force-pushed the retry-rollback-ledger branch from 0feac4b to 49dad4c Compare December 6, 2024 02:32
@cce cce merged commit f87ae8a into algorand:master Dec 10, 2024
24 checks passed
@cce cce deleted the retry-rollback-ledger branch December 10, 2024 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants