Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASIC internal temperature sensors support #1517

Merged
merged 3 commits into from
Jan 4, 2021

Conversation

santhosh-kt
Copy link
Contributor

@santhosh-kt santhosh-kt commented Nov 23, 2020

What I did

  1. ASICs have multiple internal thermal sensors.
  2. With the help of configuration, a poller is introduced in Orchagent that will periodically retrieve values of these sensors with the help of SAI APIs (Add SAI_SWITCH_ATTR_CURRENT_TEMP to get average temperature from sensors opencomputeproject/SAI#880).
  3. These retrieved values are being populated to the state DB (In "ASIC_TEMPERATURE_INFO" table).

Why I did it
As part of ASIC Thermal Monitoring HLD.

How I verified it
Tested out in below platforms.
asic_thermal_log.s6100.txt
asic_thermal_log.z9264.txt
asic_thermal_log.s5232.txt
asic_thermal_log.s6000.txt

Do not merge until sonic-net/sonic-swss-common#419 is committed and synced.

Configuration Sample(config_db.json)

  1. To enable the poller:
  "ASIC_SENSORS": {
          "ASIC_SENSORS_POLLER_STATUS": {
                 "admin_status": "enable"                              
          }
   }
  1. To disable the poller:
  "ASIC_SENSORS": {
          "ASIC_SENSORS_POLLER_STATUS": {
                 "admin_status": "disable"                              
          }
   }
  1. To change the poller interval(Default time interval = 60 seconds):
  "ASIC_SENSORS": {
          "ASIC_SENSORS_POLLER_STATUS": {
                 "admin_status": "enable"                              
          },
          "ASIC_SENSORS_POLLER_INTERVAL": {
                 "interval": "30"
          }
   }

Test cases and their observations

Steps Test Case scenarios Result
(1) Enabling the poller with default time interval in config-db.json and do “config reload” Poller populated the “ASIC_TEMPERATURE_INFO” table in the state DB and updated the table every default interval.
(2) Disabling the poller in config-db.json and do “config reload” Poller disabled and the thermal sensors readings in “ASIC_TEMPERATURE_INFO” table are reset to 0
(3) Enabling the poller and setting the interval to “30” in config-db.json and do “config reload” Poller populated the “ASIC_TEMPERATURE_INFO” table in the state DB and updated the table every 30 seconds.
(4) Disabling the poller by removing the configs from the config-db.json and do “config reload” Poller disabled and the thermal sensors readings in “ASIC_TEMPERATURE_INFO” table are reset to 0
(5) Repeating the steps (1), (2), (3), and (4) but with “reboot” instead of “config reload” Same results as above
(6) Repeating the steps (1), (2), (3), and (4) but with “fast-reboot” instead of “config reload” Same results as above
(7) Downgrading the SONiC with config enabling the poller Configuration got neglected since not supported in the older version
(8) Upgrading the SONiC with config enabling the poller Poller ran and updated the “ASIC_TEMPERATURE_INFO” table in the state DB and updated the table every default interval.

Sample RedisDB ASIC Sensors Table Output(S6100)

Output: 443 => 44.3 degC

root@sonic:~# redis-cli -n 6 hgetall "ASIC_TEMPERATURE_INFO"
1) "temperature_0"
2) "443"
3) "temperature_1"
4) "452"
5) "temperature_2"
6) "457"
7) "temperature_3"
8) "447"
9) "temperature_4"
10) "433"
11) "temperature_5"
12) "457"
13) "temperature_6"
14) "423"
15) "temperature_7"
16) "443"
17) "maximum_temperature"
18) "477"
19) "average_temperature"
20) "451"
root@sonic:~#

Details if related
HLD Location: https://github.com/Azure/SONiC/blob/master/doc/asic_thermal_monitoring_hld.md

@santhosh-kt
Copy link
Contributor Author

vs build failed due to missing dependency, sonic-net/sonic-swss-common#419

jleveque pushed a commit to sonic-net/sonic-swss-common that referenced this pull request Nov 25, 2020
@jleveque
Copy link
Contributor

sonic-net/sonic-swss-common#419 has merged. Submodule update PR is here: sonic-net/sonic-buildimage#6042

@santhosh-kt
Copy link
Contributor Author

retest this please

1 similar comment
@santhosh-kt
Copy link
Contributor Author

retest this please

@santhosh-kt
Copy link
Contributor Author

retest LGTM please

@santhosh-kt
Copy link
Contributor Author

retest this please

1 similar comment
@santhosh-kt
Copy link
Contributor Author

retest this please

@padmanarayana
Copy link

@jleveque @keboliu : please review

@jleveque
Copy link
Contributor

jleveque commented Dec 1, 2020

LGTM. @kcudnik to review, as well.

@jleveque jleveque requested review from kcudnik and keboliu December 1, 2020 19:50
orchagent/switchorch.cpp Outdated Show resolved Hide resolved
orchagent/switchorch.cpp Outdated Show resolved Hide resolved
orchagent/switchorch.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@prsunny prsunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add VS tests?

orchagent/switchorch.cpp Outdated Show resolved Hide resolved
@santhosh-kt
Copy link
Contributor Author

retest this please

@padmanarayana
Copy link

@kcudnik , @prsunny : can you please review the changes ?

Copy link
Contributor

@kcudnik kcudnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please wait for others to approve

@santhosh-kt
Copy link
Contributor Author

@prsunny @keboliu : Please re-review the changes,

Copy link
Collaborator

@prsunny prsunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm..@keboliu

@santhosh-kt
Copy link
Contributor Author

@keboliu Please review.

Copy link
Collaborator

@keboliu keboliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jleveque
Copy link
Contributor

jleveque commented Jan 4, 2021

@santhosh-kt: All reviewers have signed off and sonic-net/sonic-swss-common#419 has been merged and the submodule has been updated to include that commit. Are we good to merge this now?

@santhosh-kt
Copy link
Contributor Author

@jleveque : Please proceed to merge and update the submodule.

@jleveque jleveque merged commit f6c7422 into sonic-net:master Jan 4, 2021
@santhosh-kt
Copy link
Contributor Author

@jleveque : This PR is yet to get update in the submodule.

@jleveque
Copy link
Contributor

@santhosh-kt: Please feel free to open a PR to update the sonic-swss submodule in sonic-buildimage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants