Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-8067. [Snapshot] Revisit locks on deletedTable and deletedDirTable #4701

Merged
merged 2 commits into from
May 18, 2023

Conversation

smengcl
Copy link
Contributor

@smengcl smengcl commented May 11, 2023

What changes were proposed in this pull request?

During snapshot creation, access to deletedTable and deletedDirectoryTable would need to be synchronized with KeyDeletingTask and DirDeletingTask to avoid out-of-order access (read/write) messing up either table.

Here are the code logics that justify the locks:

createOmSnapshotCheckpoint flow, called from OMSnapshotCreateResponse#addToDBBatch

  1. Acquire getTableLock(deletedDirectoryTable) write lock, then acquiregetTableLock(deletedTable) write lock
  2. In deletedTable, remove all keys with prefix matching snapshot scope path (bucket)
  3. In deletedDirectoryTable, remove all keys with prefix matching snapshot scope path (bucket)
  4. Release getTableLock(deletedTable) write lock, then releasegetTableLock(deletedDirectoryTable) write lock

KeyDeletingTask#call flow

  1. Acquire getTableLock(deletedTable) write lock
  2. getPendingDeletionKeys(): (currently) retrieves a number of keys from active DB's deletedTable
  3. processKeyDeletes(): delete key blocks with SCM client deleteKeyBlocks(), submits PurgeKeysRequest Ratis request which then removes successfully reclaimed keys from active deletedTable
  4. Release getTableLock(deletedTable) write lock

DirDeletingTask#call flow

  1. Acquire getTableLock(deletedDirectoryTable) write lock
  2. Iterate over active deletedDirectoryTable, prepare a list of PurgePathRequests, each contains immediate children (keys and dirs) under this directory.
  3. Acquire getTableLock(deletedTable) write lock Not needed. See HDDS-8067. [Snapshot] Revisit locks on deletedTable and deletedDirTable #4701 (comment)
  4. optimizeDirDeletesAndSubmitRequest(): recurse further into sub-dirs if batch limit pathLimitPerTask isn't reached. Q: Can we refactor the same dir expansion logic? One, Two, Three @aswinshakil
  5. Submit PurgePathRequests to Ratis
  6. Release getTableLock(deletedTable) write lock
  7. Release getTableLock(deletedDirectoryTable) write lock

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8067

How was this patch tested?

  • All existing tests should pass.

@smengcl smengcl added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label May 11, 2023
Copy link
Member

@aswinshakil aswinshakil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the path @smengcl. I have one comment inline. Overall LGTM! 👍

Copy link
Member

@aswinshakil aswinshakil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the fix @smengcl

@smengcl
Copy link
Contributor Author

smengcl commented May 18, 2023

Thanks @aswinshakil for the review and suggestion.

@sokui sokui mentioned this pull request Dec 11, 2024
@swamirishi swamirishi mentioned this pull request Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
snapshot https://issues.apache.org/jira/browse/HDDS-6517
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants