Create overview page for the suspicious replica recoverer daemon #373

Open
wants to merge 1 commit into
base: main
from

ChristophAmes

The suspicious replica recoverer daemon isn't easy to understand at first glance. Therefore, an overview page for the suspicious replica recoverer daemon will be created, which will explain the general structure of the daemon and what the goals of each step are.

@ChristophAmes ChristophAmes self-assigned this Aug 28, 2024
@cserf

cserf commented Aug 30, 2024

The text is OK. I would add a diagram describing the state machine, similar to https://github.com/rucio/documentation/blob/main/website/static/img/request_state_transition_chart.svg, even if it might be a bit complicated to fit everything into a single figure.

- **ignore**: this is the default policy. Datatypes and scopes can be explicitly set to be ignored, which highlights that a decision has purposefully been made to not perform any actions on these replicas. This is done to prevent mistakes in the future.
- **declare bad**: this dictates that any associated datatypes or scopes will be declared `BAD` by the daemon.
- **dry run**: this policy makes the daemon handle the replicas as if they were to be declared `BAD`, but at the final step, no actions are taken. This results in log messages with which it becomes possible to see how many replicas of the given datatype/scope would be declared `BAD` by the daemon.
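
As a rough illustration of how the three policies above could be resolved per replica, here is a minimal Python sketch. It is not the daemon's code: the JSON layout, the field names (`action`, `datatype`, `scope`), the example datatypes and scopes, and the helper functions are all assumptions made for this example. Only the three policy names and the `ignore` default come from the text above.

```python
import json

# Hypothetical policy file content. The field names and values are
# assumptions for illustration, not the daemon's confirmed schema.
POLICY_JSON = """
[
    {"action": "declare bad", "datatype": ["RAW"], "scope": []},
    {"action": "dry run", "datatype": [], "scope": ["user.alice"]},
    {"action": "ignore", "datatype": ["LOG"], "scope": []}
]
"""

VALID_ACTIONS = {"ignore", "declare bad", "dry run"}


def load_policy(text):
    """Parse the policy JSON and reject unknown actions."""
    entries = json.loads(text)
    for entry in entries:
        if entry["action"] not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {entry['action']!r}")
    return entries


def resolve_action(entries, datatype, scope):
    """Return the action of the first entry matching the datatype or scope.

    Anything not listed falls back to 'ignore', the default policy.
    """
    for entry in entries:
        if datatype in entry.get("datatype", []) or scope in entry.get("scope", []):
            return entry["action"]
    return "ignore"


def handle_replica(entries, name, datatype, scope):
    """Describe what would happen to one suspicious replica."""
    action = resolve_action(entries, datatype, scope)
    if action == "declare bad":
        return f"declare BAD: {name}"
    if action == "dry run":
        # Same decision path as 'declare bad', but only a log message results.
        return f"dry run, would declare BAD: {name}"
    return f"ignore: {name}"


policy = load_policy(POLICY_JSON)
print(handle_replica(policy, "file1", "RAW", "data"))
print(handle_replica(policy, "file2", "AOD", "user.alice"))
print(handle_replica(policy, "file3", "AOD", "mc"))
```

Listing `ignore` explicitly for a datatype, as in the third entry, gives the same result as the fallback; the point of writing it down is to record that the choice was deliberate.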

You should mention that the policy is a JSON file and show an example.

@voetberg
Contributor

Can you add a reference to this page here? Could be useful to also understand why replicas become suspicious in the first place and where the recoverer daemon comes in

@bari12
Member

bari12 commented Oct 22, 2024

Ping @ChristophAmes: would you mind still making these changes?

@haozturk

haozturk commented Nov 7, 2024

I spent a while reading this daemon's code and adapting it for CMS. I'd like to add to its documentation, including this flow chart [1] and how it should be configured [2], which isn't trivial. If this PR is going to be merged, I can wait and make a new PR to add my changes.

[1] https://cmsdmops.docs.cern.ch/CMSRucio/Daemon_configurations/image.png
[2] https://cmsdmops.docs.cern.ch/CMSRucio/Daemon_configurations/replica_recoverer/

@ChristophAmes ChristophAmes force-pushed the 372-Create_overview_page_for_the_suspicious_replica_recoverer_daemon branch 2 times, most recently from 326d12c to 1f1a888 Compare November 8, 2024 13:20
@ChristophAmes
Author

Sorry I haven't responded, I've been rather busy.
I've added an example for the JSON file and added a link to the overview of the replica workflow.
I think the flow chart created by @haozturk looks good, so I won't make one myself.

docs/operator/suspicious_replica_recoverer.md (4 outdated review threads, resolved)
@ChristophAmes ChristophAmes force-pushed the 372-Create_overview_page_for_the_suspicious_replica_recoverer_daemon branch from 1f1a888 to fa7ed96 Compare November 25, 2024 12:20
6 participants