Clear local logged stores if input checkpoints are empty #1713

Merged
ajothomas merged 1 commit into apache:master from ResetStoresOnNoCheckpoints
Dec 19, 2024

Conversation

@ajothomas (Contributor) commented Dec 18, 2024

Symptom:
If an external utility attempts to reset a job's state by clearing its backups and checkpoints, the local state still remains on disk. We need to clear the local RocksDB state of stateful jobs if the input checkpoints don't exist.

Changes:

  • This PR checks if the input checkpoints for a container are empty and clears the local logged stores if they are.
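
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the class `LoggedStoreCleaner`, the method `clearIfNoCheckpoints`, the use of a plain `Map` to stand in for the container's input checkpoints, and the assumption that each store lives in its own directory directly under the logged store base dir are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; names and directory layout are assumptions for illustration.
final class LoggedStoreCleaner {
  private LoggedStoreCleaner() { }

  /**
   * If no input checkpoints were read, deletes every store directory directly
   * under loggedStoreBaseDir and returns the directories that were removed.
   * If checkpoints exist, the local state is left untouched.
   */
  static List<Path> clearIfNoCheckpoints(Map<String, String> inputCheckpoints,
                                         Path loggedStoreBaseDir) {
    if (!inputCheckpoints.isEmpty() || !Files.isDirectory(loggedStoreBaseDir)) {
      return new ArrayList<>();
    }
    try {
      List<Path> storeDirs;
      try (Stream<Path> children = Files.list(loggedStoreBaseDir)) {
        storeDirs = children.filter(Files::isDirectory).sorted().collect(Collectors.toList());
      }
      for (Path storeDir : storeDirs) {
        deleteRecursively(storeDir);
      }
      return storeDirs;
    } catch (IOException e) {
      throw new UncheckedIOException("Failed to clear logged stores under " + loggedStoreBaseDir, e);
    }
  }

  private static void deleteRecursively(Path root) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(root)) {
      // Reverse order so that children are deleted before their parent directories.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path path : paths) {
      Files.delete(path);
    }
  }
}
```

Returning the list of removed directories makes it easy for a caller to log each cleared store dir, in the spirit of the log lines shown under Tests.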

Tests:

  • ./gradlew build
  • Added a unit test in TestContainerStorageManager to verify that stores are deleted from the logged store dir when the checkpoints are empty
  • Tested with a test job whose checkpoints and backups had been cleared. The container logged:
2024-12-18 17:57:55.891 [main] ContainerStorageManager [INFO] No checkpoints read. Attempting to clear logged stores.
2024-12-18 17:57:55.891 [main] ContainerStorageManager [INFO] Clearing store dir /export/content/data/samsa-yarn/logged-stores/aj-stateful-test-i100/beamStore from logged stores.
2024-12-18 17:57:55.897 [main] ContainerStorageManager [INFO] Clearing store dir /export/content/data/samsa-yarn/logged-stores/aj-stateful-test-i100/countState from logged stores.

API Changes:
None

Upgrade Instructions:
None

Usage Instructions:
None

@xinyuiscool (Contributor) left a comment

LGTM. One minor style fix.

@ajothomas force-pushed the ResetStoresOnNoCheckpoints branch 6 times, most recently from ea4ec8c to f9e367a, December 18, 2024 21:09
@ajothomas force-pushed the ResetStoresOnNoCheckpoints branch from f9e367a to 4e4509f, December 18, 2024 21:11
@ajothomas merged commit ef8bc76 into apache:master, Dec 19, 2024
1 check passed