Implementation of a Monitoring Daemon for storage devices in SONiC switches #433
…ge*' to include all disk types
Added to the PR.
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
sonic-stormond/scripts/stormond
Outdated
STORAGEUTIL_LOAD_ERROR = 127

log = syslogger.SysLogger(SYSLOG_IDENTIFIER)
@assrinivasan can we move this inside the daemon class?
Done in latest
if value is None:
    self.log_warning("{}:{} value = None in StateDB".format(storage_device, field))

self.statedb_storage_info_loaded = True
@assrinivasan what if the value is None? In that case we should fall back to the .json file on disk.
Fixed this in latest. Also added a None check for None values in the _load_fsio_rw_json function. In this scenario, both StateDB and the JSON file have junk values, so it will be considered an init case.
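The None check described above could look roughly like the sketch below. It is an illustration only, not the code in this PR: the helper name `load_fsio_counters`, the field names in `FSIO_FIELDS`, and the flat JSON layout are all assumptions made for the example.

```python
import json

# Hypothetical field names; the keys used by the real daemon may differ.
FSIO_FIELDS = ("latest_fsio_reads", "latest_fsio_writes")


def load_fsio_counters(path, fields=FSIO_FIELDS):
    """Return {field: int} read from a JSON file, or None if unusable.

    A missing file, malformed JSON, a missing field, or a None value all
    count as junk, so the caller can treat a None result as an init case.
    """
    try:
        with open(path) as f:
            data = json.load(f)
    except (OSError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError.
        return None

    if not isinstance(data, dict):
        return None

    counters = {}
    for field in fields:
        value = data.get(field)
        if value is None:
            return None
        try:
            counters[field] = int(value)
        except (TypeError, ValueError):
            return None
    return counters
```

Returning None for every kind of junk keeps the caller's decision simple: either a complete set of counters is available from the file, or the file is ignored.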
sonic-stormond/scripts/stormond
Outdated
if self.statedb_storage_info_loaded == False and self.fsio_json_file_loaded == True:
    self.use_fsio_json_baseline = True
    self.use_statedb_baseline = False

# If stormond is coming back up after a daemon crash, storage information would be saved in the
# STATE_DB. In that scenario, we use the STATE_DB information as the SoT and reconcile the FSIO
# reads and writes values.
elif self.statedb_storage_info_loaded == True:
    self.use_fsio_json_baseline = False
    self.use_statedb_baseline = True
@assrinivasan can you make the logic clearer, i.e., if the stats are available in STATE_DB, use those, and otherwise fall back to the .json values from the backup?
Done in latest
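One way to express the ordering requested in this thread (STATE_DB first, the .json backup as a fallback, and an init case when neither is usable) is the minimal sketch below. It does not reproduce the revised code in the PR; the function name `choose_baseline` and the dict-or-None inputs are assumptions for illustration.

```python
def choose_baseline(statedb_values, json_values):
    """Pick the source of truth for the FSIO read/write baseline.

    Each argument is a dict of counters, or None when that source was
    missing. A source containing any None value is treated as unusable.
    Returns one of "statedb", "json", or "init".
    """
    def usable(values):
        # An empty or missing dict, or one with a None entry, is junk.
        return bool(values) and all(v is not None for v in values.values())

    if usable(statedb_values):
        return "statedb"
    if usable(json_values):
        return "json"
    return "init"
```

For example, `choose_baseline({"reads": None}, {"reads": 7})` returns `"json"`, because the STATE_DB entry holds a None value, while `choose_baseline(None, None)` returns `"init"`.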
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
Description
This commit adds a monitoring daemon for Storage device attributes on a device running SONiC.
SONiC Storage Monitoring Daemon HLD
Motivation and Context
Storage devices degrade in performance over time because of factors such as cumulative disk writes, bad-block management, lack of free space, sub-optimal operating temperature, and ordinary wear and tear, all of which bear on the overall health of the disk.
The goal of the Storage Monitoring Daemon (storagemond) is to provide meaningful metrics for these issues and to enable streaming telemetry for them, so that preventative measures can be triggered when performance degrades.
How Has This Been Tested?
Manually tested on the following platforms:
7050cx3.txt
S6100.txt
SN2700.txt
Additional Information (Optional)