[Feature] Implement better garbage collection of snapshots #10

Open
3 tasks
Tracked by #1
shreyas-s-rao opened this issue Nov 6, 2023 · 0 comments
Labels
kind/enhancement Enhancement, improvement, extension lifecycle/stale Nobody worked on this for 6 months (will further age)

Comments

@shreyas-s-rao
Collaborator

How to categorize this issue?

/kind enhancement

What would you like to be added:
Steward should provide a garbage collection mechanism for uploaded snapshots, so that the object storage container stays as lean as possible and storage costs are reduced. The garbage collection options offered to the user must be robust, i.e., users must be able to choose from different garbage collection strategies/policies, and hardcoding of policy values must be avoided.

Snapshot retention policies to be offered: (a snapshot set refers to a full snapshot along with the delta snapshots on top of it, up to the next full snapshot; a snapshot set is used to restore etcd, and the policies below are defined for snapshot sets rather than individual snapshots)

  1. Time-based: All snapshot sets are retained for the specified duration. Ex: retain snapshot sets from the last 12 hrs
  2. Count-based: Snapshot sets are retained up to a maximum count. Ex: retain the last 20 snapshot sets
  3. Calendar-based: Using a calendar schedule, we define the following units: Hour, Day, Week and Month.
    Hour - is minute [00 - 59]
    Day - is hour:minute [00:00 - 23:59]
    Week - is day-of-week [Monday - Sunday]
    Month - is day-of-month [1 - month-end]
    Each unit has 2 aspects: max per unit and number of previous units, written as X/Y. This means retain at most X snapshot sets per time unit, for each of the previous Y time units.
    Example:
    Hour (max per hour/number of previous hrs) - 1/5
    Day - (max per day/number of previous days) - 2/7
    Week - (max per week/number of previous weeks) - 1/3
    Month - (max per month/number of previous months) - 2/5
    Let's assume a full snapshot schedule of once every 30 mins.
    Let's also assume that garbage collection runs at 15:10 hrs UTC on 13-Sep-2023. The above schedule will then be interpreted in the following manner:
    * All snapshot sets in the current hour (15:00-15:59) will be retained, since the current hour is not considered for garbage collection.
    * For each of the previous 5 hours (10:00 - 14:59), a max of 1 snapshot set (the latest) per hour will be retained. This will retain a total of 5 snapshot sets (1 per hour, for the previous 5 hours)
    * Once the hour schedule is handled, the day schedule is considered. It starts from the day before the reference point, which is the start of the earliest hour window (10:00 on 13th Sep). The start day is therefore 12th Sep, and the day rule retains the latest 2 snapshot sets per day for the previous 7 days, from 12th Sep back to 6th Sep. This will retain a total of 14 snapshot sets (2 per day for the previous 7 days)
    * Once the day schedule is handled, the week schedule is considered. It starts from the week before, which is 28th Aug - 3rd Sep, and retains the latest 1 snapshot set per week for the previous 3 weeks, back to 14th Aug. This will retain a total of 3 snapshot sets (1 per week for the previous 3 weeks)
    * Once the week schedule is handled, the month schedule is considered. It starts from the previous month, which is July, and retains the latest 2 snapshot sets per month for the previous 5 months, from July back to March. This will retain a total of 10 snapshot sets (2 per month for the previous 5 months)
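
The calendar-based walkthrough above can be sketched in Python to make the window boundaries concrete. This is only an illustration under stated assumptions, not Steward's implementation: the `POLICY` dictionary, the function names, and the choice to identify a snapshot set by the creation time of its full snapshot are all hypothetical. Each unit starts from the period just before the earliest window of the finer unit, as the example describes.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical encoding of the X/Y notation: unit -> (max sets per unit, previous units).
# The number of previous units is assumed to be >= 1 for every unit.
POLICY = {"hour": (1, 5), "day": (2, 7), "week": (1, 3), "month": (2, 5)}


def calendar_windows(gc_time, policy):
    """Return {unit: [(start, end), ...]} as half-open windows, newest first."""
    windows = {}

    # Hours: the current hour is exempt, so count back from its start.
    ref = gc_time.replace(minute=0, second=0, microsecond=0)
    windows["hour"] = [(ref - timedelta(hours=i), ref - timedelta(hours=i - 1))
                       for i in range(1, policy["hour"][1] + 1)]

    # Days: start from the day before the earliest hour window.
    midnight = windows["hour"][-1][0].replace(hour=0)
    windows["day"] = [(midnight - timedelta(days=i), midnight - timedelta(days=i - 1))
                      for i in range(1, policy["day"][1] + 1)]

    # Weeks (Monday to Sunday): start from the week before the earliest day window.
    earliest_day = windows["day"][-1][0]
    monday = earliest_day - timedelta(days=earliest_day.weekday())
    windows["week"] = [(monday - timedelta(weeks=i), monday - timedelta(weeks=i - 1))
                       for i in range(1, policy["week"][1] + 1)]

    # Months: start from the month before the earliest week window.
    end = windows["week"][-1][0].replace(day=1)
    windows["month"] = []
    for _ in range(policy["month"][1]):
        start = (end - timedelta(days=1)).replace(day=1)
        windows["month"].append((start, end))
        end = start
    return windows


def select_retained(set_times, gc_time, policy):
    """Keep all sets in the current hour plus the latest X sets per window."""
    current_hour = gc_time.replace(minute=0, second=0, microsecond=0)
    keep = {t for t in set_times if t >= current_hour}
    for unit, unit_windows in calendar_windows(gc_time, policy).items():
        max_per_unit = policy[unit][0]
        for start, end in unit_windows:
            newest_first = sorted((t for t in set_times if start <= t < end),
                                  reverse=True)
            keep.update(newest_first[:max_per_unit])
    return keep


if __name__ == "__main__":
    gc_time = datetime(2023, 9, 13, 15, 10, tzinfo=timezone.utc)
    for unit, unit_windows in calendar_windows(gc_time, POLICY).items():
        oldest_start, newest_end = unit_windows[-1][0], unit_windows[0][1]
        print(unit, oldest_start.isoformat(), "->", newest_end.isoformat())
```

For a run at 15:10 UTC on 13-Sep-2023, the sketch produces hour windows covering 10:00 to 14:59, day windows from 6th to 12th Sep, week windows from 14th Aug to 3rd Sep, and month windows from March to July. Snapshot sets that fall between two units' windows (for example, 00:00 to 09:59 on 13th Sep) are not retained by this sketch; how such gaps should be treated is not specified above.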

Why is this needed:
Part of #1

Task List:

  • Snapshot garbage collection with time-based retention policy
  • Snapshot garbage collection with count-based retention policy
  • Snapshot garbage collection with calendar-based retention policy
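
The time-based and count-based policies are simple enough to sketch directly. The snippet below is a minimal illustration under assumptions, not Steward's API: the function names are hypothetical, and a snapshot set is represented only by the creation time of its full snapshot, so the age of a set is measured from that time.

```python
from datetime import datetime, timedelta, timezone


def retain_time_based(set_times, now, max_age):
    """Keep snapshot sets created within max_age of now (oldest first)."""
    cutoff = now - max_age
    return [t for t in sorted(set_times) if t >= cutoff]


def retain_count_based(set_times, max_count):
    """Keep the max_count most recent snapshot sets (oldest first)."""
    if max_count <= 0:
        return []
    return sorted(set_times)[-max_count:]


if __name__ == "__main__":
    now = datetime(2023, 9, 13, 15, 10, tzinfo=timezone.utc)
    first = datetime(2023, 9, 12, 15, 0, tzinfo=timezone.utc)
    # One snapshot set every 30 minutes over the preceding 24 hours.
    sets = [first + timedelta(minutes=30 * i) for i in range(49)]
    print(len(retain_time_based(sets, now, timedelta(hours=12))))
    print(len(retain_count_based(sets, 20)))
```

Everything not returned by the chosen function would be a candidate for deletion. Keeping the duration and count as parameters, rather than constants, reflects the requirement that policy values are not hardcoded.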
@shreyas-s-rao added the kind/enhancement label on Nov 6, 2023
@shreyas-s-rao changed the title from "[Feature] Implement robust garbage collection of snapshots" to "[Feature] Implement better garbage collection of snapshots" on Nov 6, 2023
@gardener-robot added the lifecycle/stale label on Jul 15, 2024