[chassis][syncd][sai] Adjusting response timeout during syncd init (sonic-net#2159)

In a VOQ-based chassis where syncd uses the VOQ SAI, if there is a large
number of front panel ports, SAI takes more than 1 minute to complete
switch create initialization. As a result, the switch create request
sent by orchagent does not get a response within the default response
wait time of 1 minute, so orchagent declares a switch create failure
and crashes.

To fix this, orchagent sets the syncd response timeout to 5 minutes for
a line (voq) card and 10 minutes for a supervisor (fabric) card before
sending the switch create request, and sets it back to the default wait
time after the switch create.
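
The raise-then-restore pattern described above can also be sketched as a
scope guard. This is only an illustration, not what this commit does: the
commit sets and restores the attribute with two explicit
set_switch_attribute calls. The names kDefaultResponseTimeoutMs,
initTimeoutFor and ScopedResponseTimeout are hypothetical, the setter
callback stands in for the real SAI call, and the default of 1 minute is
assumed to be expressed in milliseconds.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Assumption: the 1 minute default from the commit message, in milliseconds.
constexpr uint64_t kDefaultResponseTimeoutMs = 60 * 1000;

// Hypothetical helper: pick the switch create timeout for a switch type,
// using the 5x (voq) and 10x (fabric) multipliers from the diff.
uint64_t initTimeoutFor(const std::string &switchType)
{
    if (switchType == "voq")
    {
        return 5 * kDefaultResponseTimeoutMs;
    }
    if (switchType == "fabric")
    {
        return 10 * kDefaultResponseTimeoutMs;
    }
    return kDefaultResponseTimeoutMs;
}

// Hypothetical scope guard: applies a temporary timeout on construction
// and puts the default back on destruction, so the restore cannot be
// skipped by an early return between the two steps.
class ScopedResponseTimeout
{
public:
    using Setter = std::function<void(uint64_t)>;

    ScopedResponseTimeout(Setter setter, uint64_t temporaryMs, uint64_t defaultMs)
        : m_setter(std::move(setter)), m_defaultMs(defaultMs)
    {
        m_setter(temporaryMs);
    }

    ~ScopedResponseTimeout()
    {
        m_setter(m_defaultMs);
    }

    ScopedResponseTimeout(const ScopedResponseTimeout &) = delete;
    ScopedResponseTimeout &operator=(const ScopedResponseTimeout &) = delete;

private:
    Setter m_setter;
    uint64_t m_defaultMs;
};
```

In orchagent the setter would wrap the set_switch_attribute call on
SAI_REDIS_SWITCH_ATTR_SYNC_OPERATION_RESPONSE_TIMEOUT, and the guard would
live in a block around create_switch. The explicit two-call form used in
the diff has the advantage of logging each step separately.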

Signed-off-by: vedganes <[email protected]>
vganesan-nokia authored Mar 1, 2022
1 parent 0a99f54 commit 6e5ed1c
Showing 1 changed file with 46 additions and 0 deletions.
46 changes: 46 additions & 0 deletions orchagent/main.cpp
@@ -574,6 +574,36 @@ int main(int argc, char **argv)
attr.value.u64 = gSwitchId;
attrs.push_back(attr);

if (gMySwitchType == "voq" || gMySwitchType == "fabric")
{
/* We set this long timeout so that orchagent waits long enough for the
 * response from syncd. This is needed because switch create takes longer
 * than the default timeout when there are many front panel ports
 * and system ports to initialize.
 */

if (gMySwitchType == "voq")
{
attr.value.u64 = (5 * SAI_REDIS_DEFAULT_SYNC_OPERATION_RESPONSE_TIMEOUT);
}
else if (gMySwitchType == "fabric")
{
attr.value.u64 = (10 * SAI_REDIS_DEFAULT_SYNC_OPERATION_RESPONSE_TIMEOUT);
}

attr.id = SAI_REDIS_SWITCH_ATTR_SYNC_OPERATION_RESPONSE_TIMEOUT;
status = sai_switch_api->set_switch_attribute(gSwitchId, &attr);

if (status != SAI_STATUS_SUCCESS)
{
SWSS_LOG_WARN("Failed to set SAI REDIS response timeout");
}
else
{
SWSS_LOG_NOTICE("SAI REDIS response timeout set successfully to %" PRIu64 " ", attr.value.u64);
}
}

status = sai_switch_api->create_switch(&gSwitchId, (uint32_t)attrs.size(), attrs.data());
if (status != SAI_STATUS_SUCCESS)
{
@@ -582,6 +612,22 @@ int main(int argc, char **argv)
}
SWSS_LOG_NOTICE("Create a switch, id:%" PRIu64, gSwitchId);

if (gMySwitchType == "voq" || gMySwitchType == "fabric")
{
/* Set syncd response timeout back to the default value */
attr.id = SAI_REDIS_SWITCH_ATTR_SYNC_OPERATION_RESPONSE_TIMEOUT;
attr.value.u64 = SAI_REDIS_DEFAULT_SYNC_OPERATION_RESPONSE_TIMEOUT;
status = sai_switch_api->set_switch_attribute(gSwitchId, &attr);

if (status != SAI_STATUS_SUCCESS)
{
SWSS_LOG_WARN("Failed to set SAI REDIS response timeout to default");
}
else
{
SWSS_LOG_NOTICE("SAI REDIS response timeout set successfully to default: %" PRIu64 " ", attr.value.u64);
}
}

if (gMySwitchType != "fabric")
{