Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use iptables-restore to preserve order with MASQ & FORWARD rules #1264

Merged

Conversation

ryarnyah
Copy link

@ryarnyah ryarnyah commented Mar 10, 2020

Use iptables-restore to preserve order with MASQ & FORWARD rules.

Description

When anything happen when delete/append iptables rules, sometimes you can have your rules out of order.This PR use only iptables-restore to guaranty order in delete/append rules.

This close #1261 & #1146

Todos

  • Tests
  • Documentation (Needed?)
  • Release note

Release Note

None required

@ryarnyah ryarnyah force-pushed the feat/add-iptables-restore-at-startup branch from 0eee12d to 190e2aa Compare March 12, 2020 11:06
@rajatchopra
Copy link
Contributor

Looks good. How can we have an end-to-end test for it?
Maybe check on iptables rules after doing a kube e2e in dist/function-test-k8s.sh?

@ryarnyah
Copy link
Author

e2e tests has been added as requested

@ryarnyah
Copy link
Author

Do you need something else to have this PR merged?

@luthermonson
Copy link
Contributor

@Oats87 take a look would you?

@luthermonson
Copy link
Contributor

@Oats87 sign off on this so we can merge it, youre the iptables ninja

@Oats87
Copy link
Member

Oats87 commented Apr 1, 2021

@ryarnyah would you mind rebasing this PR (or just fixing the merge conflict)?

@ryarnyah
Copy link
Author

ryarnyah commented Apr 3, 2021

@Oats87 i will look to rebase it this weekend if i have time ;)
Edit: Let me have a look at the upgraded code before merging please.

@ryarnyah ryarnyah force-pushed the feat/add-iptables-restore-at-startup branch from 8dfc176 to 09cafb7 Compare April 4, 2021 19:17
@ryarnyah
Copy link
Author

ryarnyah commented Apr 4, 2021

@Oats87 It's seem to be good for review now.

network/iptables_restore.go Outdated Show resolved Hide resolved
}

// ipTablesRestore internal type
type ipTablesRestore struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we don't want to have a timeout for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Oats87 I'm not sur to understood, can you give me more details ?

network/iptables_restore.go Show resolved Hide resolved
return fmt.Errorf("failed to apply partial iptables-restore %v", err)
}

log.Infof("bootstrap done")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we provide some more context in this error message? Does this have to be done at the info level?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's possible

network/iptables.go Outdated Show resolved Hide resolved
network/iptables.go Outdated Show resolved Hide resolved
network/iptables.go Outdated Show resolved Hide resolved
network/iptables.go Outdated Show resolved Hide resolved
@@ -0,0 +1,213 @@
// Copyright 2015 flannel authors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably actually belongs in coreos/go-iptables but having it here should be fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I leave it that way ?

network/iptables_restore.go Outdated Show resolved Hide resolved
@rajatchopra
Copy link
Contributor

@ryarnyah could you rebase and respond to the review comments?

@louiznk
Copy link
Contributor

louiznk commented Jan 12, 2022

@ryarnyah could you rebase and respond to the review comments?

Hello @rajatchopra if you and @ryarnyah are agree I can do this for @ryarnyah
Thanks

@luthermonson
Copy link
Contributor

@louiznk please do

@louiznk louiznk force-pushed the feat/add-iptables-restore-at-startup branch 2 times, most recently from 45222ed to b0f38e7 Compare January 13, 2022 13:17
@luthermonson
Copy link
Contributor

@louiznk any chance you can answer/fix for @Oats87 feedback? can probably wrap this up and merge if those are addressed

@louiznk
Copy link
Contributor

louiznk commented Jan 13, 2022

@louiznk any chance you can answer/fix for @Oats87 feedback? can probably wrap this up and merge if those are addressed

I just fixe some of this feedback (mark with 👌). But there is some point I'm not sure to understood.

@louiznk
Copy link
Contributor

louiznk commented Jan 14, 2022

was requested for review

Hello @luthermonson
I don't know if you have the opportunity to see the commit and question/reply for @Oats87 feedback?
Thanks a lot for your help and review.

Copy link
Contributor

@louiznk louiznk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apply requests

@manuelbuil
Copy link
Collaborator

Are you still interested in merging this? Could you rebase so that we can review it again? Thanks!

@louiznk
Copy link
Contributor

louiznk commented May 30, 2022

Are you still interested in merging this? Could you rebase so that we can review it again? Thanks!

Hello @manuelbuil ,

Thanks, I can do that this week

@louiznk louiznk force-pushed the feat/add-iptables-restore-at-startup branch from 8b689bf to 24062ab Compare May 31, 2022 15:59
@louiznk
Copy link
Contributor

louiznk commented May 31, 2022

Are you still interested in merging this? Could you rebase so that we can review it again? Thanks!

Hello @manuelbuil , the rebase is done, I fixe some trouble with this new versions (usage of comments on iptables rules for exemples)
Thanks for your review and feedback

@louiznk louiznk force-pushed the feat/add-iptables-restore-at-startup branch from 24062ab to 04d157d Compare May 31, 2022 17:43
Copy link
Collaborator

@manuelbuil manuelbuil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, some users detected a race condition in which these rules: https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L57-L66, are applied in different order. In my opinion, the simplest solution would be adding a lock here https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L30 and then check the lock around here: https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L244 and not allow any other rule to be added until the lock is released. That way the race condition will not exist anymore and the problem would be fixed, or? Using iptables-restore also fixes the problem but it requires adding ~500 lines of code, which makes me wonder if this is an overkill given that the use of iptables in flannel is minimal (if we had some sort of network policy engine using iptables, I would probably think differently).

I'd love to hear more opinions!


// NewIPTablesRestore build new IPTablesRestore for IPv4
func NewIPTablesRestore() (IPTablesRestore, error) {
return NewIPTablesRestoreWithProtocol(iptables.ProtocolIPv4)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to hardcode IPv4?

Copy link
Contributor

@louiznk louiznk Jun 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, some users detected a race condition in which these rules: https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L57-L66, are applied in different order. In my opinion, the simplest solution would be adding a lock here https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L30 and then check the lock around here: https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L244 and not allow any other rule to be added until the lock is released. That way the race condition will not exist anymore and the problem would be fixed, or? Using iptables-restore also fixes the problem but it requires adding ~500 lines of code, which makes me wonder if this is an overkill given that the use of iptables in flannel is minimal (if we had some sort of network policy engine using iptables, I would probably think differently).

I'd love to hear more opinions!

You're right in the case of flannel is alone. But I have trouble with some concurrent access to iptables from other programs (in production).
In my case it was kube-router (using for network policy).
I know there are better solutions than using flannel + kube-routeur, but unfortunately not in special my case (some external requirements).
So the main advantage for me with iptables-restore is the "transaction" that allows to preserve the order when you have some concurrents access to iptables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to hardcode IPv4?

Just for respecting the same convention as implementation based on iptables (New => ipv4 by default and NewWithProtocol => protocol in parameter). For exemple see :
https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L190 and https://github.com/flannel-io/flannel/blob/master/network/iptables.go#L206
I can change it if it's better

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would prefer to use the same functions for both ipv4 and ipv6 and change the arg accordingly. Therefore, I'd prefer NewIPTablesRestoreWithProtocol(iptables.ProtocolIPv6) and NewIPTablesRestoreWithProtocol(iptables.ProtocolIPv4) in the code. It feels to me more confusing if we use different functions for each protocol

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change done and push @manuelbuil

network/iptables_restore.go Outdated Show resolved Hide resolved
network/iptables.go Show resolved Hide resolved
network/iptables.go Show resolved Hide resolved
@louiznk louiznk force-pushed the feat/add-iptables-restore-at-startup branch from 1f59190 to f8d178c Compare June 8, 2022 13:33
@louiznk louiznk force-pushed the feat/add-iptables-restore-at-startup branch from f8d178c to 70ed47a Compare June 20, 2022 08:17
…nd review

-> Changes from the review:

- refactor protocol to proto
- fix check version for wait command available on iptables-restore
- apply "Error strings should not be capitalized"
- lower case for iptables-restore
- refactor function ApplyPartial to ApplyWithoutFlush
- remove unused function ApplyFully
- comments functions
@louiznk louiznk force-pushed the feat/add-iptables-restore-at-startup branch from 70ed47a to 121db21 Compare June 23, 2022 12:45
@manuelbuil
Copy link
Collaborator

Tested and works

@manuelbuil manuelbuil merged commit fb2ee82 into flannel-io:master Jun 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The iptables rules in the nat table POTROUTING chain are out of order
8 participants