BGP PIC HLD Review #14703
In the discussions we've had, I've asked several questions about what's been proposed here; I'll try to capture them concisely.

First, it seems (if I'm following what sonic is doing) that you are concerned about just one specific situation: the case where multiple nexthops are being used by some routes, and one of those nexthops stops being useful. Sonic wants to react to that event quickly, using only the remaining viable nexthops. What exactly are the triggers for that event? Is BFD in use, for example? Or is an IGP change or a BGP TCP session failure the trigger? Or is there a failure of a local link?

It also seems that sonic is able to take the flattened list of nexthops that it learns about from FRR's FPM messages and render that list into the two-part division that its hardware supports. If you have the group of nexthops, and you learn that one of them is not useful, why not just ... react, using the hardware you have available to you?

What I don't follow is why re-designing the internals of zebra's RIB processing data structures is the right solution. What alternatives to that sort of overhaul might be viable? Is there a missing event that sonic would like to have delivered through FPM, for example?
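To make the "two-part division" mentioned above concrete, here is a minimal sketch in plain C (all names are hypothetical; this is not FRR or SONiC code) of how a flattened next-hop list, as it might arrive over FPM, could be regrouped into a first level keyed by remote PE and a second level holding the IGP paths toward each PE:

```c
/* Minimal sketch (not FRR or SONiC code): regrouping a flattened next-hop
 * list into the two-level division a hierarchical-ECMP ASIC could consume.
 * All names here are hypothetical. */
#include <stdio.h>

#define MAX_PES       4
#define MAX_IGP_PATHS 4

/* One fully resolved (flattened) entry: the BGP next hop (remote PE)
 * plus the IGP path actually used to reach it. */
struct flat_nexthop {
    const char *bgp_nexthop;   /* remote PE address */
    const char *igp_nexthop;   /* directly connected next hop toward the PE */
};

/* Two-level view: one entry per remote PE, each holding its IGP paths. */
struct pe_group {
    const char *bgp_nexthop;
    const char *igp_paths[MAX_IGP_PATHS];
    int igp_count;
};

static int find_or_add_pe(struct pe_group *groups, int *ngroups, const char *pe)
{
    for (int i = 0; i < *ngroups; i++)
        if (groups[i].bgp_nexthop == pe)
            return i;
    groups[*ngroups].bgp_nexthop = pe;
    groups[*ngroups].igp_count = 0;
    return (*ngroups)++;
}

int main(void)
{
    const char *pe1 = "2001:db8::1", *pe2 = "2001:db8::2";
    struct flat_nexthop flat[] = {
        { pe1, "fe80::a" }, { pe1, "fe80::b" },   /* two IGP paths to PE1 */
        { pe2, "fe80::c" },                       /* one IGP path to PE2  */
    };
    struct pe_group groups[MAX_PES];
    int ngroups = 0;

    /* Split the flat list: first level keyed by remote PE, second level
     * holds the IGP paths toward that PE. */
    for (unsigned i = 0; i < sizeof(flat) / sizeof(flat[0]); i++) {
        int g = find_or_add_pe(groups, &ngroups, flat[i].bgp_nexthop);
        groups[g].igp_paths[groups[g].igp_count++] = flat[i].igp_nexthop;
    }

    for (int g = 0; g < ngroups; g++) {
        printf("PE %s:\n", groups[g].bgp_nexthop);
        for (int j = 0; j < groups[g].igp_count; j++)
            printf("  via %s\n", groups[g].igp_paths[j]);
    }
    return 0;
}
```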
My concern is the limitations introduced by the design:
Is there a proposal (an extension to your current design) to have a compatible design to cover
I think you talked about a separate team that would handle the IGP failure implementation; it would be great to have the design proposed by that team. Note: the link to the proposal I made: https://docs.google.com/presentation/d/11MGkW7MmAXl-dE9G28ezNCYvvsISB1zrfQzWSAoKBBE/edit#slide=id.p
EDDIE: PIC has two enhancements described in the RFC. One is preventing BGP load-balancing updates from being triggered by IGP load-balancing updates, which is basically recursive route support. The other is achieving fast convergence when a hardware forwarding failure makes a remote BGP PE unreachable. The latter is what we mainly discuss here, and it is what drives the change to the data structure organization: when a remote BGP PE goes down, we want to minimize the traffic loss. This was required by DT back in 2006 on the Cisco GSR platform, and it is a basic SP feature now.
EDDIE: Normally it would be triggered by the IGP detecting that the remote BGP PE is down. It could also be triggered via multi-hop BFD, since the remote PE is normally far away.
EDDIE: The FPM update is triggered from zebra when the IGP detects that a remote PE is no longer reachable. With the current zebra data structures, zebra needs to trigger an update to every related NHG, which is an O(N) operation. That is the main reason we want to change this behavior in zebra to an O(1) operation.
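For illustration only, here is a small sketch (hypothetical structures, not zebra's actual data model) of why sharing one forwarding-only group across routes turns a PE-down event into a single update rather than a per-route walk:

```c
/* Illustrative sketch only (hypothetical structures, not zebra code):
 * a shared "PIC" next-hop group turns PE-down handling into a single
 * update instead of one update per route. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_PES    3
#define NUM_ROUTES 5

/* Shared forwarding-only group: the remote PEs and their liveness. */
struct pic_nhg {
    const char *pe[MAX_PES];
    bool usable[MAX_PES];
};

/* Every VPN route just points at the shared group. */
struct route {
    const char *prefix;
    struct pic_nhg *group;
};

int main(void)
{
    struct pic_nhg shared = {
        .pe     = { "PE1", "PE2", "PE3" },
        .usable = { true, true, true },
    };
    struct route routes[NUM_ROUTES] = {
        { "10.0.0.0/24", &shared }, { "10.0.1.0/24", &shared },
        { "10.0.2.0/24", &shared }, { "10.0.3.0/24", &shared },
        { "10.0.4.0/24", &shared },
    };

    /* Without sharing, a PE-down event means rewriting the next-hop list
     * of all N routes (O(N)).  With the shared group, the single flag
     * flip below is the whole update (O(1)); every route that references
     * the group converges at once. */
    shared.usable[1] = false;   /* IGP/BFD reports PE2 unreachable */

    for (int r = 0; r < NUM_ROUTES; r++) {
        printf("%s ->", routes[r].prefix);
        for (int i = 0; i < MAX_PES; i++)
            if (routes[r].group->usable[i])
                printf(" %s", routes[r].group->pe[i]);
        printf("\n");
    }
    return 0;
}
```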
EDDIE: The main reason is to let zebra provide an O(1) update via FPM instead of an O(N) update, so that downstream subscribers can handle the event properly. Note that the routing decision logic is not changed; the new pic_nhe is derived only when the routing decision is made. I don't know if any of my Cisco friends want to chime in here.
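As a hedged sketch of that derivation step (the helper names and the string-based keying below are assumptions for illustration, not the proposed FRR API), the forwarding-only entry could be derived at routing-decision time by keying on the set of remote BGP next hops, so routes that resolve to the same PE set share one entry and the decision logic itself stays untouched:

```c
/* Sketch under stated assumptions (derive_pic_key()/pic_get() are made-up
 * names, not FRR code): derive a forwarding-only "pic" entry when the
 * routing decision is made, keyed by the set of remote BGP next hops, and
 * reuse it if an identical one already exists. */
#include <stdio.h>
#include <string.h>

#define MAX_PIC 8
#define KEY_LEN 128

struct pic_entry {
    char key[KEY_LEN];   /* canonical list of remote PE next hops */
    int refcnt;
};

static struct pic_entry pic_table[MAX_PIC];
static int pic_count;

/* Build a canonical key from the selected BGP next hops (assumed sorted). */
static void derive_pic_key(const char **pes, int n, char *key, size_t len)
{
    key[0] = '\0';
    for (int i = 0; i < n; i++) {
        strncat(key, pes[i], len - strlen(key) - 1);
        strncat(key, ",", len - strlen(key) - 1);
    }
}

/* Find or create the shared forwarding-only entry for this next-hop set. */
static struct pic_entry *pic_get(const char **pes, int n)
{
    char key[KEY_LEN];

    derive_pic_key(pes, n, key, sizeof(key));
    for (int i = 0; i < pic_count; i++) {
        if (strcmp(pic_table[i].key, key) == 0) {
            pic_table[i].refcnt++;
            return &pic_table[i];
        }
    }
    strncpy(pic_table[pic_count].key, key, KEY_LEN - 1);
    pic_table[pic_count].refcnt = 1;
    return &pic_table[pic_count++];
}

int main(void)
{
    const char *set_a[] = { "PE1", "PE2" };
    const char *set_b[] = { "PE1", "PE2" };   /* another route, same PE set */

    struct pic_entry *e1 = pic_get(set_a, 2);
    struct pic_entry *e2 = pic_get(set_b, 2);

    printf("shared entry reused: %s (refcnt=%d)\n",
           e1 == e2 ? "yes" : "no", e1->refcnt);
    return 0;
}
```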
EDDIE: The current design can be applied to MPLS VPN and EVPN as well; the same approach can be used as-is, since pic_nhe is the forwarding-only portion. From a testing point of view, we will only test the SRv6 VPN case. For IGP label handling, the current approach is applicable as long as we decide to apply IGP PIC to it, but in my view the value of IGP PIC is limited.
EDDIE: The local link down event would be handled independently of PIC. A separate HLD will be raised for it.
It looks like the HLD doc above is no longer available ... did it move someplace?
Sorry, this HLD was recently moved under doc/pic because we were consolidating all PIC-related tasks. The new link is https://github.com/eddieruan-alibaba/SONiC/blob/eruan-pic/doc/pic/bgp_pic_edge.md. The PIC architecture doc is published at the same location.
This issue is stale because it has been open 180 days with no activity. Comment or remove the stale label to keep it open.
This issue will be automatically closed in the specified period unless there is further activity. |
Keep it active, |
This issue will no longer be automatically closed. |
This would be part of the SONiC Routing Working Group's Phoenix Wing Initiative; the detailed information can be found there. A code PR will be raised soon.
The Alibaba team is working with other folks in the SONiC routing WG and wants to bring BGP PIC support to FRR and SONiC. The current HLD is documented at https://github.com/eddieruan-alibaba/SONiC/blob/eruan-pic/doc/bgp_pic/bgp_pic.md and the SONiC PR tracking the discussions for this HLD is at sonic-net/SONiC#1493
Based on the suggestion from today's FRR weekly meeting, I am opening this issue to track the discussion of this HLD in FRR. I would like to reach consensus on this approach for enabling BGP PIC in FRR.
I'll add some background about this feature, since I was in the group of people who enabled it almost 20 years ago. The feature was first introduced on the Cisco GSR IOS platform around 2006/2007. The main goal was to reduce traffic loss when a remote BGP PE went down; recursive routes had been supported in Cisco OS long before this feature. The design idea was to reorganize the forwarding objects so that the number of forwarding-object updates could be reduced from O(N) to O(1), which stops the traffic loss first. This feature later became a standard RIB/FIB approach on Cisco IOS and IOX platforms. The draft was first raised in 2016 by my Cisco friends and gives a clear description of how to handle these forwarding objects to achieve PIC.