Time Disputes #742

Open
eskimor opened this issue Jan 5, 2023 · 17 comments
Labels
I6-meta A specific issue for grouping tasks or bugs of a specific category.

Comments

@eskimor
Member

eskimor commented Jan 5, 2023

Latest design here.

Basic Idea/Design Considerations

New objectives:

  1. It is not a security threat for approval checkers to not vote for a very long
    time, because it is taking long - approval voting is secure regardless.
  2. We do need some timeout for liveness only, without a dispute we would never
    finalize.
  3. Backers are special, they are the only validators that are forced to commit
    to a <2s execution.
  4. Approval checkers will send timing information, if it took them significantly longer than
    the backing timeout then this is put on chain.
  5. We charge backers via era points for that extra time. The charging will be
    exponential and reaches the cost of raising a concluding valid dispute. Thus
    we can actually have very long approval voting timeouts.
  6. Voting invalid validators will not get charged for wasting everybody's time,
    if that bill has already been paid by the backers via 5. (Getting honest
    validators slashed is thus no longer possible.) Basically validators voting
    invalid only need to get charged whatever the backers were not charged yet.
  7. On disputes concluding invalid, backers always get slashed 100%, voting valid
    approval checkers (and dispute participants) only if there is no (small
    enough) time report of approval checkers. This way there is an incentive to
    actually validate (and not just always vote valid, which would be the case if we only ever slashed backers).

Questions?

  1. Would it be possible for malicious approval checkers to get honest backers
    charged? Assuming enough honest approval checkers, this should not be
    possible.

Threats/Complications:

Getting those approval votes (time) on chain:

  1. How many? Only tranche 0? Is that secure? What if validators are DoSed?
  2. Cherry picking?
  3. If we allow for a very long approval voting timeout, we might run into issues with state pruning. Assume an attacker tries to get a bad candidate in and crafts it so that validation takes a very long time on honest checkers. If the attacker sees that the assigned approval checkers are not under their control, they reveal a better fork (not including that candidate), and that chain gets finalized. By the time validation of the candidate on the abandoned fork has timed out, that fork's state has already been discarded (too far behind the finalized block height), which might prevent a dispute from succeeding. In fact, nodes might not even participate at all, as they have not "seen" the candidate backed/included. Both should be fixable though, by not cleaning up too eagerly on finalization.

TL;DR: By charging backers for excess time in proportion to the amount, we can afford very long approval checking timeouts. This way natural fluctuations in load/performance should no longer realistically cause a dispute, and even if that timeout is triggered by malicious backers, they would end up paying the bill. We would slash backers on disputes concluding invalid, but also other nodes voting valid if the timing reported by approval checkers is low enough that it can be deemed unlikely that the nodes who voted invalid did so because of the timeout. This way approval checkers keep having an incentive to actually do the validation and not just always vote valid (which would be risk free if we only ever slashed backers).

@BradleyOlson64
Contributor

It occurs to me that time disputes are somewhat likely to be inconclusive, as honest validators may be split so that neither side achieves the necessary 2/3. This likelihood increases if some number of malicious validators withhold their votes in disputes.

Keeping many disputes alive through their entire timeout period seems like a potential DOS vector. Perhaps the large number of simultaneous legitimate disputes could be leveraged to spam consistent dispute statements from malicious validators.

Those dispute statements would at least bypass spam slots since they reference disputes for included candidates. Perhaps the total message volume produced this way would still be insignificant?

This seems less dangerous if backers are still charged era points for validation time approvers take beyond the backing timeout regardless of whether disputes conclude. As long as the cost to malicious backers is sufficiently high they can't spam time disputes like I described.

Is my thinking on the right track here?

@burdges

burdges commented Mar 1, 2023

We might've some misconceptions here, so maybe you want to chat with @eskimor and/or me, but roughly speaking..

We should call these time overruns, not time disputes, because the word "disputes" causes confusion: time overruns are not disputes, do not take dispute slots, do not trigger validators to do extra work, and do not need 2/3rd of anything.

We just need some mechanism by which we compute the median execution time declared by the approval votes. Anyone not declaring a time de facto votes 2s.
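
A minimal sketch of the median rule described above, assuming reports are collected as a mapping from validator to declared seconds. The names `approvers`, `declared_times` and `DEFAULT_TIME_S` are illustrative only and not taken from any implementation:

```python
from statistics import median

# Assumption: a vote without a declared time counts as the 2s backing timeout.
DEFAULT_TIME_S = 2.0

def median_execution_time(approvers, declared_times):
    """Median over all approval votes; missing declarations count as 2s."""
    times = [declared_times.get(v, DEFAULT_TIME_S) for v in approvers]
    return median(times)

approvers = ["a", "b", "c", "d", "e"]
declared = {"a": 9.0, "b": 2.5}  # c, d and e declared nothing
print(median_execution_time(approvers, declared))  # 2.0
```

With only two of five votes declaring a time, the implicit 2s votes dominate and the median stays at 2s.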

We do not rerun the approval logic on-chain so any approval vote counts in the median, which sucks since whales could cheat, but if they do so then we need governance to manually slash them. In future, we could likely move this whole penalties system off-chain with the off-chain rewards system, but I've not thought much about doing so yet, but it'll hopefully solve the whales issue.

We might "bill" backers' nominators' stake, not just take era points, if we foresee time overrun costs could exceed what era points pay, but doing so should not count as a slash. It really depends upon how much parathread blocks cost.

We do also have real time disputes in which we claim a block is invalid for taking way way too long. We've three possibilities here:

  1. Invalid, so 100% slash pays for everything
  2. Valid, but time overrun fees cover everything
  3. Valid, but time overrun fees insufficient

Assuming 2/3rd honest, we judge 3 to be a serious code bug, which requires a bug fix & host upgrade, not slashing. In particular, our validators who'd raise time dispute were all replaced as no-shows already, so approvals being secure means we de facto escalated already in case 1, due to all honest nodes being no-shows, or else we approved in case 1 or 2, so likely this dispute comes after the block was already finalized, and governance sorts it out anyways.

@Overkillus
Contributor

With regards to the question posted by @eskimor:

Question?
"Would it be possible for malicious approval checkers to get honest backers
charged? Assuming enough honest approval checkers, this should not be
possible."

One potential way of exploiting this I see is a
Time Report Inflation:

Assuming we use an average of time reports:

  1. Honest backer validates and backs a block
  2. Some dishonest (on av. 33%) approval-checkers inflate the time reports with artificially high numbers, but not high enough to invalidate the block and trigger a real dispute (does not go above the approval-checkers' very high timeout)
  3. Inflated time reports result in a slightly, but noticeably inflated averages
  4. Inflated average bills the time overrun fees to the honest backer
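
To illustrate how easily an average moves, here is a small sketch with made-up numbers: 40 checkers, 13 of them reporting an inflated 30s while honest checkers report 1.5s. All figures are assumptions chosen only to show the effect:

```python
from statistics import mean, median

honest = [1.5] * 27      # assumed honest execution time in seconds
inflated = [30.0] * 13   # assumed inflated reports, roughly 1/3 of 40

reports = honest + inflated
print(mean(reports))     # pulled far above the 2s backing timeout
print(median(reports))   # stays at the honest value
```

A third of the reports is enough to drag the mean well past the backing timeout, while the middle value of the sorted reports is untouched.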

Assuming we use a median as pointed by @burdges (better but still seems exploitable):

  1. Honest backer validates and backs a block
  2. Some dishonest approval-checkers inflate the time reports with artificially high numbers, but not high enough to invalidate the block and trigger a real dispute (does not go above the approval-checkers' very high timeout)
  3. To notice a difference in the median >=50% of the assigned approval checkers need to be malicious. Assuming the average of 40 approval-checkers and the maximum of 1/3 malicious validators overall the probability that more than or equal to 50% of them are malicious is given by the formula:
    $\displaystyle\sum_{n=20}^{40} {40 \choose n} \left(\frac{1}{3}\right)^n \left(\frac{2}{3}\right)^{40-n} \approx 2.14\%$
  4. Inflated median bills the time overrun fees to the honest backer in 2.14% of cases while costing nothing for the dishonest approval-checkers
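
The binomial tail above can be checked numerically. This is only a sketch of the arithmetic, with the function name and defaults chosen for illustration:

```python
from math import comb

def prob_malicious_majority(n_checkers=40, p_malicious=1 / 3):
    """P(at least half of the assigned checkers are malicious)."""
    threshold = (n_checkers + 1) // 2  # 20 for 40 checkers
    return sum(
        comb(n_checkers, k)
        * p_malicious**k
        * (1 - p_malicious) ** (n_checkers - k)
        for k in range(threshold, n_checkers + 1)
    )

print(f"{prob_malicious_majority():.2%}")  # about 2.14%, as stated above
```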

Is there anything that protects us from that?


Real time disputes issue:

  1. The approval-checkers timeout can be arbitrarily large
  2. Some dishonest backer backs a block stating it takes <2s, but it actually takes on average exactly the approval-checkers timeout value
  3. Due to probabilistic execution time (running the validation takes a different amount of time on different machines etc) it will split the (honest) validators group in half (I think this is what @BradleyOlson64 mentions)
  4. This will result in an inconclusive dispute
  5. However a dishonest backer could back such a block that e.g. probabilistically 25% of validators judge as taking less time and 75% judge as taking more time than the approval-checkers timeout period. This would conclude the dispute and the 25% minority would get slashed despite acting honestly.

Also @burdges you mention "We've four possibilities here:" but only list 3 options.

@burdges

burdges commented Mar 2, 2023

Yes, adversaries could unjustly bill people anytime they've enough approval checkers for a block (and wish to target the backers). We'd need to fix this in governance, aka victims must convince others to run the block, report faster times, and eventually pass some refund motion. I've previously mentioned a percentile higher than median here too, which reduces risks but requires correspondingly higher fees, not really sure there.

We do not slash incorrect voters 100% in that case, only the backers. I suggested a tiered slashing for invalid blocks, so 100% for backers, 10% for approval checkers, and 1% for voters. We didn't choose that option I think, but now I've forgotten why.

@BradleyOlson64
Contributor

Thanks to @Overkillus for clarifying what I meant. I was indeed thinking of attacks exploiting disputes caused by extreme time overruns, which you described better.

@burdges Mind explaining the reasoning that brings you to this judgement?
“Assuming 2/3rd honest, we judge 3 to be a serious code bug, which requires a bug fix & host upgrade, not slashing.”

@Overkillus
Contributor

@burdges
With regards to governance being used as an answer to the Time Report Inflation Attack:
Assuming we use the median, as stated above the probability is 2.14% per each parablock that goes through the approval-checking process. Assuming all current parachains make candidates (42) and that each one is approval-checked after being included in the relay chain block we naively get 42 random sets of approval-checkers per relay-chain block time (6s).

The chance that we had more than 50% honest approvers for all 42 parablocks is:
(1 - 0.0214)^42 ≈ 0.40 = 40%

And during an hour we have 25200 attempts instead of 42, which practically guarantees (the probability that it does not happen is on the order of 1e-237) that at least once there will be an approval-checker group with more than 50% malicious actors, making the attack a possibility.
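
For reference, the per-block and per-hour figures can be reproduced as follows. The sketch takes the 2.14% per-candidate probability, 42 candidates per relay chain block and a 6s block time from the text above, and works in log10 for the hourly figure to avoid underflow:

```python
from math import log10

p_attack = 0.0214            # per-candidate probability from above
per_block = 42               # candidates per relay chain block
blocks_per_hour = 3600 // 6  # 6s relay chain block time

p_clean_block = (1 - p_attack) ** per_block
attempts_per_hour = per_block * blocks_per_hour
log10_p_clean_hour = attempts_per_hour * log10(1 - p_attack)

print(round(p_clean_block, 2))    # 0.4
print(attempts_per_hour)          # 25200
print(round(log10_p_clean_hour))  # -237
```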

If the attack is basically guaranteed to occur every hour how can we reasonably say that governance can manually handle that? If the situation was expected to occur once every 10 years maybe you could say the governance can step in and make an adjustment, but at this point those are too frequent for manual intervention.


With regards to the slashing response in Real Time Disputes:
"We do not slash incorrect voters 100% in that case, only the backers." How much do we slash incorrect voters? Nothing?

Also what justifies the slashing of the backer in that case?:

  1. The approval-checkers timeout can be arbitrarily large
  2. Some dishonest backer backs a block stating it takes <2s, but it actually takes on average a bit lower than the approval-checkers timeout value. To be precise 75% of the time it takes less than the approval-checkers timeout.
  3. When it comes to the approval-checker phase some honest approval-checkers will be unlucky and land in the 25% group that needs more time than the approval-checker timeout. They will raise a dispute (based around the timeout).
  4. All validators participate in the dispute but 75% of them support the backer, thus the backer wins the dispute.
  5. The honest nodes were essentially tricked into raising a dispute and will pay the dispute cost of raising a false-positive (concluding valid dispute; an unsuccessful dispute).

In the above example it seems the backer shouldn't get slashed and in the change proposed by @eskimor the backer will only be fined based on the time overrun reports. It should also mean that the total fine should cover the costs that normally would be applied to the 25% honest validators that were tricked into raising the dispute. Is that correct?

The edge case in that regard is such a block that splits the validator set in 33.3% voting invalid and 66.7% voting valid (including the backer). Based on the approval-checkers timeout constant and the expected execution time variance one would need to calculate what execution time splits the community in such a way. Then calibrate the expected time overrun fees for that particular execution time to be higher than the costs normally incurred by the 33.3% group of validators voting invalid in a dispute that concludes valid (unsuccessful dispute).

In the analogous case where the backer picks a block such that 75% of the time it takes MORE than the approval-checkers timeout, the backer should be directly slashed, as he will lose the dispute when it comes to it.

@burdges

burdges commented Mar 3, 2023

@BradleyOlson64 We need higher time overrun fees if 3 ever happens, but yes more a parameterization error than a code bug, but we fix it if it happens, and it's still our mistake so no penalties.

@Overkillus We're not limited to refunding in governance. We should be able to manually slash from governance too, which sounds very appropriate for the griefing attack you describe. We've other flavors where 1/3 targets a small-ish number of validators, avoids harming themselves, avoids bystanders, etc., which all make the attack less unrealistic than pure griefing but also offer less frequent attack windows. We do not much care though because governance refunds and governance slashing remain the overall solution.

We've many places where exceptional behavior falls upon governance, and we'll add more this year by removing some complexity from slashing, so really we need some "polkadot constitution" document that says how our design expects governance to react to various exceptional situations.

We need code to enforce soundness, safety, and liveness each within some reasonable parameters, deal with fast attacks, etc. We never thought code could cover every case correctly though because afaik all other peer-to-peer networks require human intervention sometimes.

We won't slash the backer in your case: At some point someone escalates by saying the block runs longer than the approval checker timeout, which actually does not matter much because.. We've already de facto escalated long before this happens however, since every 12s or soon 24s we'll have a new batch of no-shows, which includes everyone honest. In any case, we now have 2/3rds who claim the block runs insanely slow, even if they claim valid, so we're already charging the dishonest backer enough to pay for the dispute, so either valid or invalid results sound fine.

We do however punish one honest node here for raising the time dispute. We've discussed this no longer being slashing per se, but merely fees like the time overrun fees the backer pays or related. We could, and likely should, reduce this fee by whatever the backers pay, which requires yet more careful balancing of course.

Anyways..

Yes, there are a bunch of easy implementation mistakes, like say ignoring the invalid votes when computing the median, not having the backers pay enough, etc., so yes we should spell all this out as much as possible.

In my mind, we've basically one really tricky question: Can or should the off-chain rewards protocol handle time overruns? We'll discuss this in future I think.

@eskimor
Member Author

eskimor commented Mar 3, 2023

We do however punish one honest node here for raising the time dispute. We've discussed this no longer being slashing per se, but merely fees like the time overrun fees the backer pays or related. We could, and likely should, reduce this fee by whatever the backers pay, which requires yet more careful balancing of course.

Yes this is what I would suggest. If we know that the execution time was at the high end, the voting invalid validators will pay nothing. (The backers already paid)

Some dishonest (on av. 33%) approval-checkers inflate the time reports with artificially high numbers, but not high enough to invalidate the block and trigger a real dispute (does not go above the approval-checkers' very high timeout)

One thing to consider here is that the no-show timeout should be much lower than the worst case execution timeout. Therefore providing such large numbers does mean escalation and we would be getting much more checkers.

Consequences:

  1. This means we need timing information not only on tranche 0 approvals but on all.
  2. The attacker does not actually need to take as long as it claims to. But we can enforce that: If a validator reports a time longer than the no-show timeout, we escalate as if it were a no-show.

With 2) adversaries' effectiveness should be greatly reduced. But I agree with @Overkillus - potential frequency of such an attack matters. We should do some proper worst case calculations and design it so, that resorting to governance/human intervention is feasible. E.g. if we assume execution time of up to 12 seconds will never/rarely cause a no-show, we could simply not charge anything up to that time (no escalation - everything is fine). Therefore with approval voters reporting times less than those 12 seconds, no harm is done - if they go above, we cover them as if they were a no-show and thus get more time values in.
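
A rough sketch of the rule proposed here, assuming a single threshold that serves both as the no-show timeout and as the point below which nothing is charged. The constant and function name are hypothetical and the 12s value is only the example figure from the paragraph above:

```python
# Hypothetical parameter, for illustration only.
NO_SHOW_TIMEOUT_S = 12.0

def handle_time_report(reported_s):
    """Classify a reported execution time from an approval vote."""
    if reported_s > NO_SHOW_TIMEOUT_S:
        # Treated like a no-show: more checkers get assigned and
        # contribute additional time values.
        return "escalate"
    # At or below the threshold no harm is done and nobody is charged.
    return "no_charge"

print(handle_time_report(3.0))   # no_charge
print(handle_time_report(20.0))  # escalate
```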

For frequencies that are acceptable: I don't think we need to go to 10 years, any time frame that allows humans to react will do. E.g. once per week should be totally acceptable:

  1. Frequency should be low enough that backers will never get disabled (assuming we only disable on a 100% slash) - even if we deduct costs via slashing. Hence there is no security threat.
  2. Because of 1 and because governance action must be anticipated, there is very little incentive for such an attack. You don't earn anything, you just make others lose money, which they will get refunded. At the same time you at the very least damage your reputation and might even be up for punishment via governance. So a lot of risk, with little/no value.

As long as those 2 are maintained, potential frequency of such an attack matters not too much - although it is a good idea to limit it for defense in depth. As long as 1 is maintained the security of the network is not at risk, therefore this might also be one of the few occasions where it would be acceptable to loosen our byzantine assumptions:

  1. No real threat
  2. No real incentive for the attacker
  3. High risk for the attacker

-> Highly unlikely to find 1/3 of validators trying to do this. Anyhow, I actually don't think this will be necessary.

Due to probabilistic execution time (running the validation takes a different amount of time on different machines etc) it will split the (honest) validators group in half (I think this is what @BradleyOlson64 mentions)

As long as f+1 validators voted valid we can assume it is fine (at least one honest validator voted in favor of the candidate). Who has to pay what can then be determined by reported timing information.

However a dishonest backer could back such a block that e.g. probabilistically 25% of validators judge as taking less time and 75% judge as taking more time than the approval-checkers timeout period. This would conclude the dispute and the 25% minority would get slashed despite acting honestly.

This would conclude the dispute as "candidate invalid". We do not currently slash approval voters, so this would only slash backers which have not been honest (they obviously ignored the backing timeout).

In the above example it seems the backer shouldn't get slashed and in the change proposed by @eskimor the backer will only be fined based on the time overrun reports. It should also mean that the total fine should cover the costs that normally would be applied to the 25% honest validators that were tricked into raising the dispute. Is that correct?

Yes.

In the analogous case where the backer picks a block such that 75% of the time it takes MORE than the approval-checkers timeout, the backer should be directly slashed, as he will lose the dispute when it comes to it.

Correct.

Thanks @Overkillus ! Very useful input.

@burdges

burdges commented Mar 3, 2023

The attacker does not actually need to take as long as it claims to. But we can enforce that: If a validator reports a time longer than the no-show timeout, we escalate as if it were a no-show.

Interesting, we'd ensure that reporting time overruns creates more checkers who perhaps contradict your report. Ain't so simple however because our approvals counter loop actually un-counts no-shows once they finally voted. We could complicate its logic of course, like by counting and not un-counting your declared no-shows, but..

We do make various compromises all over the place, but we do prioritize comparatively higher priority design like soundness over comparatively lower priority design like correct billing. I'm thus hesitant to even slightly complicate the approvals counter unless it's really the right solution, as it's soundness code.

Also, if an off-chain solution works in future then we could likely count the minimum of message arrival time and the declared time, which likely fixes this cleanly without touching the approvals counter. That's many future ifs but it's reason not to do this now.

Anyways, do we really need this?

Case 1. An honest backer makes a block that runs under 2s but haters report it ran slow. I still think governance could handle this case all by itself. In other words, the human backer should report the haters by sharing the PoV and the approval signatures with someone in governance, who then runs the block and reports under 2s. After this, more humans in governance run the block, and then finally they vote to refund the backer and slash the haters. Yes, your suggestion automates this somewhat, and maybe it is simple enough to justify, but again maybe not if off-chain ever works.

Case 2. A dishonest backer makes a borderline block. Can adversarial nodes inflating runtimes help the adversary? It perhaps increases their own fees, but afaik has minimal other effect.

We could delay doing this until after we discover if punishments fit into the off-chain rewards system?

once per week should be totally acceptable

Once per minute is kinda acceptable if governance eventually manually slashes the haters. Yeah, they can make us look bad by bringing everything to a halt, but only like once, and then they'll wind up gone for good. Yes, longer is better however.

Again the real problem here is that anybody can vote, not just the approval checkers, which happens because we're doing this on-chain instead of doing it off-chain.

I should do a proper write up for the new slashing and punishments system soon, so maybe the off-chain rewards system makes a good companion write up.

As long as f+1 validators voted valid we can assume it is fine (at least one honest validator voted in favor of the candidate). Who has to pay what can then be determined by reported timing information.

We do however abandon a disputed fork that never achieves 2f+1 though, right?

@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/polkadot-dispute-storm-the-postmortem/2550/1

@Sophia-Gold
Contributor

I've been adding a time overruns section to the implementer's guide and found it helpful to try to model what the actual charge would be. One way to think about it is in terms of an unlikely worst case scenario: a dispute that concludes valid, but where 33% of approval checkers voted invalid due to reaching the execution timeout.

Assuming a 0.1% slash for disputes where the supermajority concludes the candidate is valid, 1000 paravalidators, and equal amounts bonded between validators; then the maximum amount collectively slashed from approval checkers disputing a valid candidate is equal to 33% of the backer's bond. We can ask how long the median should be in order for us to assume the approval checkers who voted invalid actually timed out and therefore shouldn't be slashed.
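
The worst-case figure follows from simple arithmetic, sketched here under the same assumptions as the paragraph above (1000 paravalidators, equal bonds, 33% voting invalid, 0.1% slash each); the variable names are illustrative:

```python
validators = 1000
invalid_share = 0.33     # fraction voting invalid in the worst case
slash_per_voter = 0.001  # 0.1% of each voter's bond

# With equal bonds, express the total in units of one bond.
total_slash_in_bonds = validators * invalid_share * slash_per_voter
print(round(total_slash_in_bonds, 2))  # 0.33
```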

A few questions I'm unclear about:

  • Do we want to use an exact median or something higher to reduce the frequency of it being controlled by malicious validators?
  • Do we include execution times from invalid votes? If yes, then in this worst case the median is in the 75th percentile of the supermajority.
  • Do we start charging at 2s or at a point where it risks a no-show?

Other than that we could use something like:

time_overrun = median_execution_time - backing_timeout
maximum_overrun = approval_checking_timeout - backing_timeout
charge_to_backer = bond * (time_overrun / maximum_overrun)^2

Then a median overrun of 58% of the maximum overrun (a median execution time of 7.8s with a 12s timeout, or 24s with a 40s timeout) will result in the backer being charged 33% and the timed out approval checkers not being slashed at all. On the other hand, a median overrun of 10% of the maximum (3s with a 12s timeout, or 5.8s with a 40s timeout) would result in a charge to the backer of 1%.
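
The quadratic formula above can be written out as a small function. This is a sketch only: clamping the result to the range between zero and the full bond is an added assumption, and the default `bond=1.0` simply expresses the charge as a fraction of the bond:

```python
def charge_to_backer(median_s, backing_timeout_s, approval_timeout_s, bond=1.0):
    """Quadratic charge; clamping to [0, bond] is an assumption."""
    time_overrun = max(0.0, median_s - backing_timeout_s)
    maximum_overrun = approval_timeout_s - backing_timeout_s
    ratio = min(1.0, time_overrun / maximum_overrun)
    return bond * ratio**2

# Median of 7.8s with a 2s backing and 12s approval checking timeout.
print(round(charge_to_backer(7.8, 2.0, 12.0), 2))  # 0.34
# Median of 3s with the same timeouts.
print(round(charge_to_backer(3.0, 2.0, 12.0), 2))  # 0.01
```

The first call lands at roughly a third of the bond and the second at 1%, in line with the figures quoted above.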

@eskimor
Member Author

eskimor commented May 11, 2023

Some results of our latest discussion:

Slashing Logic

Slashing resolution on concluding valid disputes is going to be changed based on time information:

Use of previous logic

If time report variations are within reasonable limits from the calculated median (of all reported votes), then the slashing logic is the same as it was before: The validators voting invalid will get slashed some amount that is supposed to make up for the wasted effort of the network.

Time information based slashes

On the flip side, if there are time reports that differ a lot from the median, who voted invalid becomes irrelevant; instead we charge whoever deviates too much from the median. Invalid votes should also contain time information (we should not just assume max time); the explanation is in the section "Backer raising a dispute".

This means, if the median is rather low, voters (valid or invalid) with high values will get slashed - they are either unacceptably slow or have been dishonest with their values.

If the median is on the high end, then we will assume backers have been messing with us and they get slashed (together with all approval checkers also reporting such low times).

If the median is somewhere in the middle, it can happen that we will end up slashing people on both ends. The likely explanation for such a situation would be that there are people on both ends trying to mess with us at the same time.
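
The two branches of this slashing logic might be sketched as below. The multiplicative `tolerance` bound and the data layout are hypothetical, since the thread leaves the exact thresholds open:

```python
from statistics import median

def resolve_valid_dispute(votes, tolerance=3.0):
    """votes: validator -> (voted_valid, reported_seconds).

    `tolerance` is a hypothetical multiplicative bound on how far a
    report may sit from the median before it counts as an outlier.
    Returns the validators to charge.
    """
    med = median(t for _, t in votes.values())
    outliers = sorted(
        v for v, (_, t) in votes.items()
        if t > med * tolerance or t < med / tolerance
    )
    if outliers:
        # Time information based slashes: the vote itself is irrelevant.
        return outliers
    # Previous logic: charge those who voted invalid.
    return sorted(v for v, (valid, _) in votes.items() if not valid)

# High median: the backer "bk" reported 2s and is charged, while "c",
# who voted invalid with a time close to the median, is not.
votes = {"bk": (True, 2.0), "a": (True, 20.0), "b": (True, 22.0),
         "c": (False, 25.0), "d": (True, 21.0)}
print(resolve_valid_dispute(votes))  # ['bk']
```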

Interaction with charging backers without a dispute

In case of an actual dispute, any charging of backers due to approval votes will be dropped/replaced with the actual dispute resolution as described above. In a dispute we get time information from everyone, so with our 2/3 honest assumption we get a reliable median, which supersedes the one derived from the tranche 0 approval checkers.

Backer raising a dispute

Solution to the above problem that malicious approval checkers could be messing with backers, reporting blown up times to get backers to pay based on tranche 0 time information: what a backer can do to defend himself is raise a dispute!

If the backer notices approval votes with reported times that result in a median that would result in him having to pay, he can equivocate and send an additional explicit invalid vote, with the same time as the backing timeout (or the actual time it took to validate the candidate). Then, assuming the dispute resolves for the candidate (which should be the case if the backer is honest), the above slashing logic kicks in and the backer will not have to pay anything; instead the approval voters who tried messing with the backer will. The equivocation does not matter either, because we are not slashing based on valid/invalid.

For this to work values have to be picked carefully: It should not be possible to have a backer charged (a significant amount/at all), but at the same time have dispute resolution result in case one, where there are no votes deviating too much from the median. So thresholds for charging have to be in sync with this, which seems to be only logical, but worth mentioning anyways.

Summary: With this simple mechanism a 2% chance is totally acceptable, because it would no longer be risk free for the attackers, in fact having to pay the bill is virtually guaranteed. At least together with the fact that there is not even a direct incentive to do the attack in the first place, it is hard to imagine that with this in place people would even try. Which is disputes serving their purpose: "Being there for them not ever being needed to run, because of their mere existence.". And if they tried anyways, they would pay the bill.

Open Questions

Exact numbers. E.g. we assume "normal" fluctuations in time to be maxing out at around 6, this results in something like an accepted sqrt(6) for expected deviations from the median (or not?) - in any case it should be smaller. If reports are within that window, nobody should be charged based on time information and dispute resolution would be option 1.

@burdges

burdges commented May 13, 2023

We do want the backer who disputes to set some "valid" flag in his dispute. In fact, we'll want them to quote a vote with a very different time. It'll help debug the system obviously, but also.

Any dispute should leave the backer on-the-hook for whatever fees other voters do not pay. If not, the backer can dispute themselves, but since everyone agrees on the time, then everybody checks but nobody pays. We'll know someone pays if the backer must quote some vote with a very different time.

I think backers should not raise these time disputes just because a few nodes have very different times, but only do so when they risk being charged, so maybe needed_approvals/2 votes with bad time information. Also, these time disputes by backers need not block finality and could even be raised after finalization.
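
One way to read this trigger condition, as a sketch only: the backer counts the approval votes whose reported time exceeds some charging threshold and disputes once that count reaches `needed_approvals / 2`. The function name and the `charge_threshold_s` parameter are hypothetical:

```python
def backer_should_dispute(reported_times, needed_approvals, charge_threshold_s):
    """Dispute only when enough reports put the backer at risk of a charge."""
    bad = sum(1 for t in reported_times if t > charge_threshold_s)
    return bad >= needed_approvals / 2

# A few outliers are not worth a dispute; half of the needed approvals are.
print(backer_should_dispute([1.5] * 25 + [30.0] * 5, 30, 6.0))   # False
print(backer_should_dispute([1.5] * 10 + [30.0] * 20, 30, 6.0))  # True
```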

@eskimor
Member Author

eskimor commented May 15, 2023

Yes, that's what I wrote:

If the backer notices approval votes with reported times that result in a median that would result in him having to pay, he can equivocate and send an additional explicit invalid vote

The backer would only dispute if the median would result in him getting charged.

I also don't understand how a dispute can be raised without a slash:

There are two options:

  1. Either there are significant deviations from the median, then those deviations will be slashed.
  2. There are none, in that case all voting invalid would get slashed - which includes the equivocating backer.

Hence if there is no attack on the backer, raising a dispute will get the backer slashed. Same as for any other unjustified dispute.
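
The two options can be sketched as follows. This is only an illustration: what counts as a "significant" deviation is not fixed in this thread, so the `max_factor` tolerance, the function name and the data shapes are assumptions.

```python
from statistics import median


def resolve_time_dispute(reported_ms, invalid_voters, max_factor=3.0):
    """Return the set of validators that would get slashed.

    reported_ms: dict mapping validator -> reported execution time (ms).
    invalid_voters: validators that voted invalid, including an
        equivocating backer.
    max_factor: hypothetical tolerance; a report deviates significantly if
        it is more than max_factor times above or below the median.
    """
    med = median(reported_ms.values())
    deviating = {
        v for v, t in reported_ms.items()
        if t > med * max_factor or t < med / max_factor
    }
    if deviating:
        # Option 1: significant deviations exist, so those get slashed.
        return deviating
    # Option 2: no deviations, so everyone voting invalid gets slashed.
    return set(invalid_voters)
```

A backer who disputes without any deviating report ends up in the second branch and is slashed like any other unjustified disputer.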

@Sophia-Gold
Contributor

Use of previous logic

If time report variations are within reasonable limits from the calculated median (of all reported votes), then the slashing logic is the same as it was before: The validators voting invalid will get slashed some amount that is supposed to make up for the wasted effort of the network.

What we discussed is actually changing this to 1% of the minimum bond of the backer and all approval checkers, split between everyone who votes against a valid candidate and the backer according to the time overrun curve. So in this scenario, where there's no overrun charge, it's 1% split between all the dishonest approval checkers vs. the 0.1% each we've previously suggested.
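
A rough sketch of that split, assuming the backer's share is given as a fraction between 0 and 1 taken from the time overrun curve and that the remainder is divided equally among the validators who voted invalid. The equal division and all names here are assumptions, not something settled in this thread.

```python
def split_dispute_charge(min_bond, backer_overrun_fraction, invalid_voters):
    """Split 1% of the minimum bond between the backer and invalid voters.

    backer_overrun_fraction: hypothetical value in [0, 1] read off the time
        overrun curve; 0 means there is no overrun charge.
    Returns (backer_charge, {voter: charge}).
    """
    pool = min_bond / 100  # 1% of the minimum bond
    backer_charge = pool * backer_overrun_fraction
    voter_charges = {}
    if invalid_voters:
        # Assumption: the remainder is shared equally among invalid voters.
        per_voter = (pool - backer_charge) / len(invalid_voters)
        voter_charges = {v: per_voter for v in invalid_voters}
    return backer_charge, voter_charges
```

For a minimum bond of 10000 and no overrun, four dishonest approval checkers would each carry 25, i.e. 0.25% rather than 0.1%.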

@Sophia-Gold
Contributor

Exact numbers. E.g. we assume "normal" fluctuations in time to max out at around 6; this results in something like an accepted sqrt(6) for expected deviations from the median (or not?). In any case it should be smaller. If reports are within that window, nobody should be charged based on time information and dispute resolution would be option 1.

I'm not sure using the standard deviation is correct. In my draft of the guide section I'm proposing just 3x: start charging at 3x the backing timeout, max out at 1/3 of the approval checking timeout, and detect inflation when needed_approvals/2 is 3x the median of the entire validator set.
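
To make the proposed thresholds concrete, here is a sketch of a charge curve that is zero up to 3x the backing timeout and reaches the full charge at 1/3 of the approval checking timeout. The exponential shape follows the issue description, but the exact formula, the `steepness` parameter and the timeout values in the usage note are assumptions.

```python
import math


def backer_charge_fraction(median_ms, backing_timeout_ms, approval_timeout_ms, steepness=4.0):
    """Fraction (0..1) of the full charge the backer pays for a given median.

    Charging starts at 3x the backing timeout and maxes out at 1/3 of the
    approval checking timeout. The exponential interpolation in between
    and the steepness value are hypothetical.
    """
    start = 3 * backing_timeout_ms
    cap = approval_timeout_ms / 3
    if cap <= start:
        raise ValueError("approval timeout too short for the 3x / one-third rule")
    if median_ms <= start:
        return 0.0
    if median_ms >= cap:
        return 1.0
    x = (median_ms - start) / (cap - start)  # position in the window, 0..1
    return math.expm1(steepness * x) / math.expm1(steepness)
```

With a 2 s backing timeout and a hypothetical 60 s approval checking timeout, charging would start at 6 s and reach the full amount at 20 s. The check on `cap <= start` shows why the rule only makes sense with a long approval checking timeout.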

@the-right-joyce the-right-joyce added the I6-meta A specific issue for grouping tasks or bugs of a specific category. label Oct 3, 2023
@tdimitrov tdimitrov mentioned this issue Oct 20, 2023
4 tasks
This was referenced Nov 22, 2023
alexggh added a commit that referenced this issue Dec 13, 2023
Initial implementation for the plan discussed here: #701
Built on top of #1178
v0: paritytech/polkadot#7554,

## Overall idea

When approval-voting checks a candidate and is ready to advertise the
approval, defer it in a per-relay chain block until we either have
MAX_APPROVAL_COALESCE_COUNT candidates to sign or a candidate has stayed
MAX_APPROVALS_COALESCE_TICKS in the queue, in both cases we sign what
candidates we have available.

This should allow us to reduce the number of approvals messages we have
to create/send/verify. The parameters are configurable, so we should
find some values that balance:

- Security of the network: Delaying broadcasting of an approval
shouldn't put the finality at risk and to make sure that never happens
we won't delay sending a vote if we are past 2/3 from the no-show time.
- Scalability of the network: MAX_APPROVAL_COALESCE_COUNT = 1 &
MAX_APPROVALS_COALESCE_TICKS = 0 is what we have now, and we know from
the measurements we did on versi that it bottlenecks
approval-distribution/approval-voting when we significantly increase
the number of validators and parachains.
- Block storage: In case of disputes we have to import these votes on
chain and that increases the necessary storage by
MAX_APPROVAL_COALESCE_COUNT * CandidateHash per vote. Given that
disputes are not the normal way of the network functioning and we will
limit MAX_APPROVAL_COALESCE_COUNT to single digits, this
should be good enough. Alternatively, we could try to create a better
way to store this on-chain through indirection, if that's needed.

## Other fixes:
- Fixed the fact that we were sending random assignments to
non-validators, that was wrong because those won't do anything with it
and they won't gossip it either because they do not have a grid topology
set, so we would waste the random assignments.
- Added metrics to be able to debug potential no-shows and
mis-processing of approvals/assignments.

## TODO:
- [x] Get feedback, that this is moving in the right direction. @ordian
@sandreim @eskimor @burdges, let me know what you think.
- [x] More and more testing.
- [x]  Test in versi.
- [x] Make MAX_APPROVAL_COALESCE_COUNT &
MAX_APPROVAL_COALESCE_WAIT_MILLIS a parachain host configuration.
- [x] Make sure the backwards compatibility works correctly
- [x] Make sure this direction is compatible with other streams of work:
#635 &
#742
- [x] Final versi burn-in before merging

---------

Signed-off-by: Alexandru Gheorghe <[email protected]>
@Overkillus
Contributor

While still relevant, this is very low priority because the new disabling strategy mitigates a lot of the risks (combined with governance refunds in case of attacks). PVM should also heavily reduce the risk further.
