p2p: ban bad peers #4548
Codecov Report
@@            Coverage Diff             @@
##           master    #4548      +/-   ##
==========================================
- Coverage   65.56%   65.33%   -0.23%
==========================================
  Files         229      229
  Lines       20287    20326      +39
==========================================
- Hits        13302    13281      -21
- Misses       5937     5990      +53
- Partials     1048     1055       +7
@@ -8,7 +8,7 @@ import (

	"github.com/pkg/errors"

	amino "github.com/tendermint/go-amino"
why did this change?
I am not sure. My main guess would be that my IDE did it in the background
Overall approach makes sense. 👍 We should have some tests for this, and fix some minor style nits, but let's put that off until the final review.
p2p/pex/addrbook.go
Outdated
func (a *addrBook) ReinstateBadPeers() {
	for _, ka := range a.badPeers {
		if !ka.isBanned(defaultBanTime) {
			a.mtx.Lock()
The mutex lock must be taken out at the start of the function; otherwise another goroutine can modify badPeers while you are iterating over it, which is a data race on the map and can crash the node.
p2p/pex/addrbook.go
Outdated
func (a *addrBook) MarkBad(addr *p2p.NetAddress) {
	a.RemoveAddress(addr)
	if a.addBadPeer(addr) {
		a.RemoveAddress(addr)
addBadPeer() returns false if the peer is already considered bad, in which case it will not be removed from the address book. Shouldn't we always remove it regardless, since we do not want bad peers there under any circumstances? We may also want to refresh the ban timer in that case.
Also, since both of these calls take out mutex locks separately, the data can change between them, which may cause problems. Not sure if that will be an actual problem here, but in general it's a good idea to take out mutex locks in public methods, and have internal methods that assume locks are already held such that they can be combined and composed.
I guess if it's bad but still in part of the address book then we should return true so that it still runs RemoveAddress.
For your second paragraph about the mutexes, that makes sense
Actually, I remember why I didn't add the mutexes in the public function: RemoveAddress has its own mutex.
Exactly. Which means that addBadPeer and RemoveAddress can't be called in a single critical section. If we had removeAddress(), which did not take out a lock, and RemoveAddress(), which does take out a lock and calls removeAddress() internally, then we could use removeAddress() in a critical section together with addBadPeer().
Again, not sure if it's worth spending time on in this case, since I suspect it's safe to call RemoveAddress() regardless of any state changes. But this is a general problem across the code base which we should keep in mind.
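The split described above could look roughly like the following. This is a sketch under assumptions: the book type and its string-keyed maps are hypothetical simplifications of addrBook, and the bodies of removeAddress and addBadPeer are invented for illustration. What it shows is the convention itself: exported methods take the lock, unexported helpers assume it is already held.

```go
package main

import (
	"fmt"
	"sync"
)

// book is a hypothetical, stripped-down stand-in for addrBook.
type book struct {
	mtx       sync.Mutex
	addresses map[string]bool
	badPeers  map[string]bool
}

// removeAddress assumes the caller already holds b.mtx.
func (b *book) removeAddress(id string) {
	delete(b.addresses, id)
}

// addBadPeer assumes the caller already holds b.mtx. It reports whether the
// peer was newly added to the bad list.
func (b *book) addBadPeer(id string) bool {
	if b.badPeers[id] {
		return false
	}
	b.badPeers[id] = true
	return true
}

// RemoveAddress is the exported wrapper: it takes the lock, then delegates.
func (b *book) RemoveAddress(id string) {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	b.removeAddress(id)
}

// MarkBad bans and removes the peer in one critical section. The address is
// removed even if the peer was already on the bad list.
func (b *book) MarkBad(id string) {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	b.addBadPeer(id)
	b.removeAddress(id)
}

func main() {
	b := &book{
		addresses: map[string]bool{"peer1": true},
		badPeers:  map[string]bool{},
	}
	b.MarkBad("peer1")
	fmt.Println(b.addresses["peer1"], b.badPeers["peer1"])
}
```

Calling RemoveAddress from inside MarkBad would deadlock here, because sync.Mutex is not reentrant; that is the reason the unexported variant is needed.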
ah yes I understand. I think I will change it like that anyway
p2p/pex/known_address.go
Outdated
	ka.LastBanTime = time.Now()
}

func (ka *knownAddress) isBanned(banTime time.Duration) bool {
I don't think it's necessary to pass banTime as a parameter here, maybe just use the constant directly.
I was considering whether we want to make it something that is variable, or at least configurable.
Yeah, that makes sense. But I don't think it's something that all callers should have to deal with every time they want to check if someone is banned; it's more convenient if isBanned() just takes all conditions into account automatically. I'm not sure what the best way is to inject variables into the current architecture, though. I guess this is fine for a first iteration.
One option may be to have ban() take a parameter for how long to ban, and store bannedUntil as the time when the ban ends (the max of the current and given). Then isBanned() can simply compare time.Now() against bannedUntil.
yes that could be a good idea
I don't think that's necessary in the first iteration. But it did occur to me that passing the ban time to ban() would allow us to vary the ban duration for different offences, if that's something we would need later.
Would we perhaps want to expose Ban() so other modules could use it?
Not sure. Let's consider that when there's an actual need for it.
Okay, we can leave it for later. I was talking to Tess about how each module sort of has its own way of categorising and dealing with peers, which makes things as a whole inconsistent and potentially repetitive.
For sure. That's something to consider for the p2p refactor.
Current usages of the ban list:
This makes me think: do we differentiate based on whether a peer timed out or actually misbehaved (#4415 (comment))? And do we want both a banlist and a blacklist, one to not talk to a node for a while and the other to never talk to that node again? Note that removing the address at the moment isn't enough, because we don't keep track of it and the node can simply join again (#1444).
❤️ this PR
p2p/pex/addrbook.go
Outdated
func (a *addrBook) IsBanned(addr *p2p.NetAddress) bool {
	a.mtx.Lock()
	defer a.mtx.Unlock()
a.mtx.Lock()
_, ok := a.badPeers[addr.ID]
a.mtx.Unlock()
can you explain to me the difference?
No need for defer if you don't call other functions and the code is not complex.
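To make the difference concrete, here is a small side-by-side sketch. The book type and both method names are hypothetical; the two functions return the same result and differ only in how the lock is released.

```go
package main

import (
	"fmt"
	"sync"
)

// book is a hypothetical stand-in for addrBook.
type book struct {
	mtx      sync.Mutex
	badPeers map[string]struct{}
}

// isBannedExplicit unlocks by hand. That is safe here because there is exactly
// one statement between Lock and Unlock and no early return.
func (b *book) isBannedExplicit(id string) bool {
	b.mtx.Lock()
	_, ok := b.badPeers[id]
	b.mtx.Unlock()
	return ok
}

// isBannedDeferred releases the lock on every return path, which matters more
// as a function grows branches or calls into other code.
func (b *book) isBannedDeferred(id string) bool {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	_, ok := b.badPeers[id]
	return ok
}

func main() {
	b := &book{badPeers: map[string]struct{}{"bad": {}}}
	fmt.Println(b.isBannedExplicit("bad"), b.isBannedDeferred("good"))
}
```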
p2p/pex/addrbook.go
Outdated
	bucket := a.calcNewBucket(ka.Addr, ka.Src)
	a.addToNewBucket(ka, bucket)
	delete(a.badPeers, ka.ID())
	a.Logger.Info("Reinstated Address", "addr", ka.Addr)
Suggested change:
- a.Logger.Info("Reinstated Address", "addr", ka.Addr)
+ a.Logger.Info("Reinstated address", "addr", ka.Addr)
@@ -63,3 +63,11 @@ type ErrAddrBookInvalidAddr struct {
func (err ErrAddrBookInvalidAddr) Error() string {
	return fmt.Sprintf("Cannot add invalid address %v: %v", err.Addr, err.AddrErr)
}

type ErrAddressBanned struct {
please add documentation
What about the other errors? There isn't any documentation for them either.
could you please write doc for them too?
p2p/pex/pex_reactor.go
Outdated
@@ -529,7 +538,7 @@ func (r *Reactor) dialPeer(addr *p2p.NetAddress) error {
	// failed to connect to. Then we can clean up attemptsToDial, which acts as
can you remove this comment?
I think we need both attemptsToDial and blacklist
Nice work!
p2p/pex/addrbook.go
Outdated
	// check it exists in addrbook
	ka := a.addrLookup[addr.ID]
	// check address is not already there
	if ka != nil {
Can simplify this with an early return, i.e. if ka == nil { return false }
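For illustration, an early-return version might be shaped like this. The types and the function body are hypothetical; only the addrLookup and badPeers names come from the snippets in this thread, and the return-value semantics are an assumption.

```go
package main

import "fmt"

// knownAddress and book are hypothetical reductions of the real types.
type knownAddress struct{ id string }

type book struct {
	addrLookup map[string]*knownAddress
	badPeers   map[string]*knownAddress
}

// addBadPeer puts a known address on the bad list. It returns false when the
// address is not in the book or is already considered bad. The caller is
// assumed to hold whatever lock protects the maps.
func (b *book) addBadPeer(id string) bool {
	ka := b.addrLookup[id]
	if ka == nil {
		return false // not in the address book: nothing to ban
	}
	if _, alreadyBad := b.badPeers[id]; alreadyBad {
		return false
	}
	b.badPeers[id] = ka
	return true
}

func main() {
	b := &book{
		addrLookup: map[string]*knownAddress{"p": {id: "p"}},
		badPeers:   map[string]*knownAddress{},
	}
	fmt.Println(b.addBadPeer("p"), b.addBadPeer("p"), b.addBadPeer("missing"))
}
```

The guard clauses keep the main path unindented, which is the simplification the comment asks for.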
p2p/pex/known_address.go
Outdated
@@ -54,6 +55,14 @@ func (ka *knownAddress) markGood() {
	ka.LastSuccess = now
}

func (ka *knownAddress) ban(banTime time.Duration) {
	ka.LastBanTime = time.Now().Add(banTime)
I think this should only extend the ban, not shorten it. For example, if someone banned the address until 16:00, and then someone else wants to ban them until 13:00, they should remain banned until 16:00. This will matter when we start having different ban periods for various offences.
@@ -8,7 +8,7 @@ import (

	"github.com/pkg/errors"

-	amino "github.com/tendermint/go-amino"
+	"github.com/tendermint/go-amino"
Should keep the old formatting. It's considered good form to use an explicit name if the package name differs from the URL name (amino vs go-amino), and to use a blank line between external and internal dependencies.
For some reason my IDE doesn't see the difference between amino and go-amino, but I will make the change regardless.
p2p/pex/errors.go
Outdated
@@ -63,3 +63,12 @@ type ErrAddrBookInvalidAddr struct {
func (err ErrAddrBookInvalidAddr) Error() string {
	return fmt.Sprintf("Cannot add invalid address %v: %v", err.Addr, err.AddrErr)
}

// Err is thrown when the address is banned and therefore cannot be used
Go comments should start with the name of what they're describing, i.e. "ErrAddressBanned is thrown ...". I thought the linter would catch this, but we may have disabled it.
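As a sketch of the convention, the error type could be documented as below. The Addr field is simplified to a string and the message text is made up for this example; neither is taken from the PR.

```go
package main

import "fmt"

// ErrAddressBanned is returned when an address is banned and therefore cannot
// be used. The Addr field is a hypothetical simplification (a plain string).
type ErrAddressBanned struct {
	Addr string
}

// Error implements the error interface.
func (err ErrAddressBanned) Error() string {
	return fmt.Sprintf("address %v is currently banned", err.Addr)
}

func main() {
	var err error = ErrAddressBanned{Addr: "1.2.3.4:26656"}
	fmt.Println(err)
}
```

Starting each doc comment with the identifier it describes lets godoc and linters associate the comment with the declaration.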
Where's the changelog pending entry?
Also, tendermint/spec needs to be updated.
p2p: Update Changelog with ban list PR - #4548
Closes: #1614
Description
This is just an MVP implementation of a blacklist that bans peers for a ban time (default is 1 day). I do feel that this is potentially a bandage to be replaced when the entire p2p module gets redone. Below is an illustration of the implementation.
Reinstated peers are added to the "new" (unvetted) bucket.
Thoughts:
The ReinstateBadPeers() function is run by the synchronous function ensurePeers() every 30 seconds if NeedMoreAddrs() fails. The alternative would be to have isBanned() checks within the parts of the code that want to use the address.
It might be a good idea to also keep a count of the number of bans so that repeat offenders can serve longer bans or be banned forever.