-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic port configuration - solve lldp issues when adding/removing ports #9386
Dynamic port configuration - solve lldp issues when adding/removing ports #9386
Conversation
/azpw run |
/AzurePipelines run |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
do we know exactly why we see this |
dockers/docker-lldp/lldpmgrd
Outdated
@@ -100,31 +109,24 @@ class LldpManager(daemon_base.DaemonBase): | |||
def is_port_up(self, port_name): | |||
""" | |||
Determine if a port is up or down by looking into the oper-status for the port in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change oper-status
to netdev-oper-status
to be more clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will be changed
return port_oper_status == "up" | ||
else: | ||
return False | ||
else: | ||
# Retrieve PortInitDone entry from the Port table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we log info for debug purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case the port is not up or not exist on table we have log info for it outside the function
dockers/docker-lldp/lldpmgrd
Outdated
|
||
if port_alias == None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have such case since you check port_alias above? Even if so, let's use AND check in above to share the log_info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not needed, i will remove it
dockers/docker-lldp/lldpmgrd
Outdated
self.pending_cmds.pop(key, None) | ||
else: | ||
self.log_error("unknown operation") | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove trailing spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
dockers/docker-lldp/lldpmgrd
Outdated
if fvp and "up" in dict(fvp).get("oper_status",""): | ||
self.generate_pending_lldp_config_cmd_for_port(key, dict(fvp)) | ||
else: | ||
self.pending_cmds.pop(key, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if fvp is empty, should we do no-op?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when we have SET command fvp should always be with values.
but I will change it in a way that:
self.pending_cmds.pop(key, None)
will be executed only if oper_state is "down"
we still don't know what is the reason for this warning message, even when the host interface is up we see this message. |
/azpw run |
/AzurePipelines run |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
/azpw run |
/AzurePipelines run |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
failed checkers due to unknown reason: -- Docs: https://docs.pytest.org/en/latest/warnings.html |
/azpw run |
/AzurePipelines run |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
not sure why we see these failures on kvm tests-t0:
|
This does not make sense. In case host interface is up, lldpctl should not reporting |
/azpw run |
/AzurePipelines run |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
/azpw run Azure.sonic-buildimage |
/AzurePipelines run Azure.sonic-buildimage |
Azure Pipelines successfully started running 1 pipeline(s). |
I checked it again and I noticed that the command is failing (return code is not 0) but the error message (stderr) is an empty string and not "cannot find port Ethernet..." only after 2 or 3 retries, the return code is 0 - we see this only on several random ports, not always. so, I guess that checking the oper state of the host interface (from state db) did solve these warning messages ("cannot find port Ethernet...") |
/AzurePipelines run Azure.sonic-buildimage (Test kvmtest-t0) |
Commenter does not have sufficient privileges for PR 9386 in repo Azure/sonic-buildimage |
/azpw run Azure.sonic-buildimage (Test kvmtest-t0) |
/AzurePipelines run Azure.sonic-buildimage (Test kvmtest-t0) |
No pipelines are associated with this pull request. |
@zhenggen-xu could you please recheck last comment? |
what is the return code during the failure? Can we find out a bit more why it was the case? Sorry for insist here, to understand the exact reason and wait on that condition would be more consistent and deterministic. Retry might be simple to hide the issue but might not be reliable depending on the failure. |
the return code in these cases is 1, I don't think it is saying something about the failure.. |
@zhenggen-xu - Can you please approve this PR ? |
@zhenggen-xu - Can you please approve this PR ? |
@zhenggen-xu - can you please review and approve ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should open a issue link to here for later to understand why lldpcli would fail even when linux port was up.
Issue was opened: Endless LLDP warning messages when removing a port |
@qiluo-msft - Who can merge this PR ? Thanks. |
Add unit test? |
@abdosi - can you please review \ approve ? |
@qiluo-msft \ @abdosi - This PR was reviewed and approved by @zhenggen-xu . Can you please review as well as we would like to merge it ? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@qiluo-msft - As this PR was approved , who can merge it ? |
…orts (#9386) #### Why I did it when adding and removing ports after init stage we saw two issues: first: In several cases, after removing a port, lldpmgr is continuing to try to add a port to lldp with lldpcli command. the execution of this command is continuing to fail since the port is not existing anymore. second: after adding a port, we sometimes see this warning messgae: "Command failed 'lldpcli configure ports Ethernet18 lldp portidsubtype local etp5b': 2021-07-27T14:16:54 [WARN/lldpctl] cannot find port Ethernet18" we added these changes in order to solve it. #### How I did it port create events are taken from app db only. lldpcli command is executed only when linux port is up. when delete port event is received we remove this command from pending_cmds dictionary #### How to verify it manual tests and running lldp tests #### Description for the changelog Dynamic port configuration - solve lldp issues when adding/removing ports
@abdosi and @zhenggen-xu ; can this be ported to 202205, pls ? |
Why I did it
when adding and removing ports after init stage we saw two issues:
first:
In several cases, after removing a port, lldpmgr is continuing to try to add a port to lldp with lldpcli command. the execution of this command is continuing to fail since the port is not existing anymore.
second:
after adding a port, we sometimes see this warning messgae:
"Command failed 'lldpcli configure ports Ethernet18 lldp portidsubtype local etp5b': 2021-07-27T14:16:54 [WARN/lldpctl] cannot find port Ethernet18"
we added these changes in order to solve it.
How I did it
port create events are taken from app db only.
lldpcli command is executed only when linux port is up.
when delete port event is received we remove this command from pending_cmds dictionary
How to verify it
manual tests and running lldp tests
Which release branch to backport (provide reason below if selected)
Description for the changelog
Dynamic port configuration - solve lldp issues when adding/removing ports
A picture of a cute animal (not mandatory but encouraged)