Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[action] [PR:14084] Fix issue: wrong teamd link watch state after warm reboot #14575

Merged
merged 1 commit into from
Apr 8, 2023

Conversation

mssonicbld
Copy link
Collaborator

No description provided.

…14084)

#### Why I did it

Fix issue: wrong teamd link watch state after warm reboot due to TEAM_ATTR_PORT_CHANGED lost

The flag TEAM_ATTR_PORT_CHANGED is maintained by kernel team driver:
- a flag "changed" is maintained in struct team_port struct
- the flag is set by __team_port_change_send once relevant information is updated, including port linkup (together with speed, duplex), adding or removing
- the flag is cleared by team_nl_fill_one_port_get once the updated information has been notified to user space via RTNL

In the userspace, the change flag is maintained by libteam in struct team_port.
The team daemon calls port_priv_change_handler_func on receiving port change event.
The logic in port_priv_change_handler_func
1. creates the port if it did not exist, which triggers port add event and eventually calls lacp_port_added callback.
2. triggers port change event if team_port->changed is true, which eventually calls lw_ethtool_event_watch_port_changed to update port state for link watch ethtool.
3. removes the port if team_port->removed is removed

In lacp_port_added, it calls team_refresh to refresh ifinfo, port info, and option info from the kernel via RTNL.
In this step, port_priv_change_handler_func is called recursively.
- In the inner call, it won't get TEAM_ATTR_PORT_CHANGED flag because kernel has already notified that.
- As a result, team_port->changed flag is cleared in the libteam.
- The port change event won't be triggered from either inner or outer call of port_priv_change_handler_func.

If the port has been up when the port is being added to the team device, the "port up" information is carried in the outer call but will be lost.

In case the flag TEAM_ATTR_PORT_CHANGED is set only in the inner call, function port_priv_change_handler_func can be called in the inner call.
However, it will fail to fetch "enable" options because option_list_init has not be called.

Signed-off-by: Stephen Sun <[email protected]>

#### How I did it

Fix:
Do not call check_call_change_handlers when parsing RTNL function is called from another check_call_change_handlers recursively.

#### How to verify it

- Manually test
- Regression test
  - warm reboot
  - warm reboot sad lag
  - warm reboot sad lag member
  - warm reboot sad (partial)
@mssonicbld
Copy link
Collaborator Author

Original PR: #14084

@mssonicbld mssonicbld merged commit 95f387c into sonic-net:202211 Apr 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants