Fixed a route conflict bug. #1546
update route conflict
add a if logic
Could you explain what your algorithm is doing please? Thanks
dist/mk-docker-opts.sh
Outdated
@@ -101,3 +101,15 @@ done
if [ "$combined_opts" = true ]; then
echo "${combined_opts_key}=\"${docker_opts}\"" >>$docker_env
fi

if [ "x${FLANNEL_SUBNET}" != "x" ];then
What is x? If you want to check whether it exists, why not use
if [ -n "$FLANNEL_SUBNET" ]; then
instead?
Yes, -n is right.
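For reference, a minimal sketch comparing the two checks discussed above. The helper names is_set and is_set_legacy are made up for illustration and are not part of mk-docker-opts.sh. Note that both forms test for a non-empty string, so an unset variable and an empty one are treated the same.

```shell
# is_set VALUE: succeed when VALUE is a non-empty string (the -n form).
is_set() {
    [ -n "$1" ]
}

# is_set_legacy VALUE: the x-prefix form used in the diff; same result.
is_set_legacy() {
    [ "x$1" != "x" ]
}

# Run both checks on an empty and a non-empty value.
for value in "" "10.1.5.1/24"; do
    if is_set "$value"; then new=set; else new=empty; fi
    if is_set_legacy "$value"; then old=set; else old=empty; fi
    echo "value='${value}' -n:${new} x-prefix:${old}"
done
```

The x-prefix form is a workaround for very old test implementations; with the variable quoted, -n is sufficient and easier to read.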
dist/mk-docker-opts.sh
Outdated
if [ "x${FLANNEL_SUBNET}" != "x" ];then
dot_four=`echo ${FLANNEL_SUBNET}|cut -d. -f4|cut -d/ -f1`
dot_four=$((${dot_four}-1))
subnets_pre=`echo ${subnets}|cut -d. -f 1,2,3`
subnets does not exist, does it?
This was my way of checking whether the string is empty. Please ignore this bad expression.
This is a very special situation; I have explained how to reproduce it in the PR description. A node picks a subnet, 192.168.6.15/27, and a route for it (192.168.6.14/27 via 31.24.28.1 dev eth0) is already in the routing table before flannel starts. When flannel and docker come up, a second route is added (192.168.6.14/27 dev docker0 proto kernel scope link ..), but the earlier route (192.168.6.14/27 via 31.24.28.1 dev eth0) is not deleted. This code is meant to clean up that route before flannel finishes starting.
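To make the arithmetic in the diff concrete, here is a rough sketch of what the snippet appears to compute: the route destination is the subnet address with its last octet decremented by one. The function name route_key_for_subnet is hypothetical, the input is assumed to have the form a.b.c.d/len, and no validation is done (a last octet of 0 would give -1).

```shell
# route_key_for_subnet SUBNET: print "a.b.c.(d-1)/len" for SUBNET "a.b.c.d/len".
# Assumes a well-formed IPv4 CIDR string; performs no validation.
route_key_for_subnet() {
    addr=${1%/*}            # address part, e.g. "192.168.6.15"
    prefix_len=${1#*/}      # prefix length, e.g. "27"
    first_three=${addr%.*}  # first three octets, e.g. "192.168.6"
    last_octet=${addr##*.}  # fourth octet, e.g. "15"
    echo "${first_three}.$((last_octet - 1))/${prefix_len}"
}

route_key_for_subnet "192.168.6.15/27"   # prints 192.168.6.14/27
```

This uses shell parameter expansion instead of the echo and cut pipelines in the diff, and it reads the prefix from the same variable rather than from a separate subnets variable.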
In my opinion, flannel does not need to delete that subnet. The user needs to be careful that there are no active routes which could conflict with the flannel subnets. What if that route is important for the user and flannel deletes it?
I agree that deleting a user's routes is rude. However, in my reproduction path, this route was written by flannel. If the subnet that flannel is about to use already exists in the local routing table, we should tell the user about the failure in another way, for example by having flannel fail to start, instead of starting normally and leaving the service broken.
I like that approach of flannel logging the problem or failing if it finds that
When the route table already contains the subnet to be allocated, exit and prompt the user to handle the error
Description
Fixed a routing conflict bug.
Steps to reproduce:
(1) CentOS 7, flannel running in vxlan mode.
(2) Select two healthy nodes from the cluster as the target nodes (nodeA and nodeB).
(3) Remove the flannel daemon on nodeA and nodeB.
(4) Run "systemctl stop flanneld" on nodeB (simulates nodeB going down).
(5) Delete the network segment corresponding to nodeA from etcd (flannel on nodeA will exit).
(6) Copy nodeA's subnet.env to nodeB (simulates nodeB obtaining the subnet used by nodeA).
(7) Start flanneld on nodeB.
(8) Run "ip r s" on nodeB.
After these steps, you will see that the route to the local docker0 conflicts with the earlier route.
Reason:
While nodeB is down, it cannot learn that nodeA's subnet has been deleted from etcd, so its route is not updated. When flannel starts on nodeB, nodeB picks the subnet that was used by nodeA (the subnet is no longer in etcd). The earlier route never gets a chance to be updated.
Resolution:
When a node picks a network, we should ensure that this subnet has no conflicts in the routing table.
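A possible shape for such a check, as a sketch only: the function below reads "ip route show"-style lines on stdin and reports a conflict when the destination already has a route that is not on docker0. The name has_route_conflict and the rule that docker0 routes are not conflicts are assumptions for illustration, not the code merged in this PR.

```shell
# has_route_conflict KEY: read route lines on stdin; succeed (return 0) if a
# route whose destination equals KEY exists on a device other than docker0.
# Assumption: a route on docker0 is the expected local one, not a conflict.
has_route_conflict() {
    key=$1
    while read -r dest rest; do
        [ "$dest" = "$key" ] || continue   # skip routes for other destinations
        case " $rest " in
            *" dev docker0 "*) ;;          # expected local route, ignore
            *) return 0 ;;                 # same destination elsewhere: conflict
        esac
    done
    return 1
}

# Example with the two routes from the discussion above.
routes='192.168.6.14/27 via 31.24.28.1 dev eth0
192.168.6.14/27 dev docker0 proto kernel scope link'
if printf '%s\n' "$routes" | has_route_conflict "192.168.6.14/27"; then
    echo "route conflict for 192.168.6.14/27; refusing to start" >&2
fi
```

In a real script the input would come from "ip route show" rather than a fixed string, and the caller would exit with a non-zero status so the user can resolve the stale route by hand.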