From 1a26f32aaabf4bb9cf65d736a5e11234a1fca134 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Thu, 6 Dec 2018 23:05:11 +0800 Subject: [PATCH 01/23] in progress: --- docs/spec/staking/tombstone.md | 38 ++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) create mode 100644 docs/spec/staking/tombstone.md diff --git a/docs/spec/staking/tombstone.md b/docs/spec/staking/tombstone.md new file mode 100644 index 000000000000..8d32c8545bd9 --- /dev/null +++ b/docs/spec/staking/tombstone.md @@ -0,0 +1,38 @@ +# Staking Tombstone + +## Abstract + +In the current implementation of the `stake` module, when the consensus engine informs the state machine of a validator's consensus fault, the validator is partially slashed, and put into a "jail period", a period of time in which they are not allowed to rejoin the validator set. However, because of the nature of consensus faults and ABCI, there can be a delay between an infraction occuring, and evidence of the infraction reaching the state machine (this is one of the primary reasons for the existence of the unbonding period). + +Part of the design of the `stake` module is that delegators should be slashed for the infractions that happened during blocks that they were delegated to the offending validator, however, they should not be slashed for infractions that their voting power did not contribute to. + +Thus, if the sequence of events is: +1. Validator A commits Infraction 1 +2. Delegator X delegates to Validator A +3. Evidence for Infraction A reaches state machine +Delegator X should not be slashed. + +However, in the current system, because delegator can cause + +In order to think about these cases, we should list the "events" that we have to consider, and then how to deal with each of the cases that happen. + + + +Given the 4 following events: +1 - Validator A Infraction +2 - Delegator X Bonds to Validator A +3 - Validator A Evidence + +For this reason, if a validator's + + + + +<------------------------------------------------------------> + A B C D E + + + + + +Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, the current jail system is adequate. \ No newline at end of file From 346bf9c4100edfc2d66a55d0cd745b312a8573e5 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Fri, 14 Dec 2018 00:26:35 +0800 Subject: [PATCH 02/23] in progress --- docs/spec/staking/tombstone.md | 56 ++++++++++++++++++++++++---------- 1 file changed, 40 insertions(+), 16 deletions(-) diff --git a/docs/spec/staking/tombstone.md b/docs/spec/staking/tombstone.md index 8d32c8545bd9..acc26e2c3b64 100644 --- a/docs/spec/staking/tombstone.md +++ b/docs/spec/staking/tombstone.md @@ -2,34 +2,58 @@ ## Abstract -In the current implementation of the `stake` module, when the consensus engine informs the state machine of a validator's consensus fault, the validator is partially slashed, and put into a "jail period", a period of time in which they are not allowed to rejoin the validator set. However, because of the nature of consensus faults and ABCI, there can be a delay between an infraction occuring, and evidence of the infraction reaching the state machine (this is one of the primary reasons for the existence of the unbonding period). +In the current implementation of the `slashing` module, when the consensus engine informs the state machine of a validator's consensus fault, the validator is partially slashed, and put into a "jail period", a period of time in which they are not allowed to rejoin the validator set. However, because of the nature of consensus faults and ABCI, there can be a delay between an infraction occuring, and evidence of the infraction reaching the state machine (this is one of the primary reasons for the existence of the unbonding period). -Part of the design of the `stake` module is that delegators should be slashed for the infractions that happened during blocks that they were delegated to the offending validator, however, they should not be slashed for infractions that their voting power did not contribute to. +In the current system design, once a validator is put in the jail for a consensus fault, after the `JailPeriod` they are allowed to send a transaction to `unjail` themselves, and thus rejoin the validator set. -Thus, if the sequence of events is: -1. Validator A commits Infraction 1 -2. Delegator X delegates to Validator A -3. Evidence for Infraction A reaches state machine -Delegator X should not be slashed. +One of the "design desires" of `slashing` module is that if multiple infractions occur before evidence is executed (and a validator is put in jail), they should only be punished for single worst infraction, but not cumulatively. For example, if the sequence of events is: +1. Validator A commits Infraction 1 (worth 30% slash) +2. Validator A commits Infraction 2 (worth 40% slash) +3. Validator A commits Infraction 3 (worth 35% slash) +4. Evidence for Infraction 1 reaches state machine (and validator is put in jail) +5. Evidence for Infraction 2 reaches state machine +6. Evidence for Infraction 3 reaches state machine +Only Infraction 2 should have it's slash take effect, as it is the highest. This is done, so that in the case of the compromise of a validator's consensus key, they will only be punished once, even if the hacker double-signs many blocks. Because, the unjailing has to be done with the validator's account key, they have a chance to resecure their consensus key, and then signal that they are ready using their account key. We call this period during which we track only the max infraction, the "slashing period". + +Once, a validator rejoins by unjailing themselves, we begin a new slashing period; if they commit a new infraction after unjailing, it gets slashed cumulatively on top of the worst infraction from the previous slashing period. + +However, while infractions are grouped based off of the slashing periods, because evidence can be submitted up to an `unbondingPeriod` after the infraction, we still have to allow for evidence to be submitted for previous slashing periods. For example, if the sequence of events is: +1. Validator A commits Infraction 1 (worth 30% slash) +2. Validator A commits Infraction 2 (worth 40% slash) +3. Evidence for Infraction 1 reaches state machine (and Validator A is put in jail) +4. Validator A unjails +We are now in a new slashing period, however we still have to keep the door open for the previous infraction, as the evidence for Infraction 2 may still come in. As the number of slashing periods increase, it creates more complexity as we have to keep track of the highest infraction amount for every single slashing period. -However, in the current system, because delegator can cause +> Note: Currently, according to the `slashing` module spec, a new slashing period is created everytime a validator is unbonded then rebonded. This should probably be changed to jailed/unjailed, as in the current system, let's say I compromised the key of the rank 100 validator, I could bond my own validator into and out of the validator set many times, in order to create as many slashing periods I want for the validator. Then I can create infractions for each of the slashing periods I created for the validator, allowing me to get them multiply slashed. I'm not sure if this is how it is implemented in the code, or is just a mistake in the spec. For the remainder of this, I will assume that we only start a new slashing period when a validator gets unjailed. -In order to think about these cases, we should list the "events" that we have to consider, and then how to deal with each of the cases that happen. +The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. +Things get even messier, once we begin to add in delegations. +If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period. And at the point that the JailPeriod is equivalent to the UnbondingPeriod, we might as well just call JailPeriod -Given the 4 following events: -1 - Validator A Infraction -2 - Delegator X Bonds to Validator A -3 - Validator A Evidence +We can get rid of the hooks between staking and slashing. -For this reason, if a validator's + -<------------------------------------------------------------> - A B C D E From 3fe1688b821c0393a0bf47cd596452baacce8331 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Fri, 14 Dec 2018 00:54:08 +0800 Subject: [PATCH 03/23] in progress --- docs/spec/staking/tombstone.md | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/docs/spec/staking/tombstone.md b/docs/spec/staking/tombstone.md index acc26e2c3b64..28abbc4938d5 100644 --- a/docs/spec/staking/tombstone.md +++ b/docs/spec/staking/tombstone.md @@ -26,16 +26,15 @@ We are now in a new slashing period, however we still have to keep the door open > Note: Currently, according to the `slashing` module spec, a new slashing period is created everytime a validator is unbonded then rebonded. This should probably be changed to jailed/unjailed, as in the current system, let's say I compromised the key of the rank 100 validator, I could bond my own validator into and out of the validator set many times, in order to create as many slashing periods I want for the validator. Then I can create infractions for each of the slashing periods I created for the validator, allowing me to get them multiply slashed. I'm not sure if this is how it is implemented in the code, or is just a mistake in the spec. For the remainder of this, I will assume that we only start a new slashing period when a validator gets unjailed. -The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. +The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period (i.e not have to track slashing periods). -Things get even messier, once we begin to add in delegations. - -If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period. And at the point that the JailPeriod is equivalent to the UnbondingPeriod, we might as well just call JailPeriod - -We can get rid of the hooks between staking and slashing. +Things get even messier, once we begin to add in delegations. TODO +Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults, are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose that instead of being put in a "jailed state" after evidence for a consensus safety fault, validators are instead put into a "tombstone state", which means the validator is kicked out of the validator set and not allowed to rejoin. All of the stake that was delegated to it is put into an unbonding period. The validator operator can create a new validator if they would like, preferably with a new consensus key (do we need to enforce this? No rational validator should reuse the same compromised key lol), but they have to "reearn" their delegations back. +Doing this tombstone system and getting rid of the slashing period tracking, will make the `slashing` module way simpler, especially because we can remove the hooks between the `stake` and `slashing` modules. +Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, no slashing period tracking is needed. Also, a liveness bug probably isn't so egregious that it mandates force unbonding all delegations, and so the current jail system is adequate. - - - -Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, the current jail system is adequate. \ No newline at end of file From 3bb6de92d2f0f012276b3cfe5b090a21311bb645 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Mon, 17 Dec 2018 21:43:02 +0800 Subject: [PATCH 04/23] in progress --- docs/spec/{staking => slashing}/tombstone.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename docs/spec/{staking => slashing}/tombstone.md (98%) diff --git a/docs/spec/staking/tombstone.md b/docs/spec/slashing/tombstone.md similarity index 98% rename from docs/spec/staking/tombstone.md rename to docs/spec/slashing/tombstone.md index 28abbc4938d5..a80ca11ee997 100644 --- a/docs/spec/staking/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -13,6 +13,7 @@ One of the "design desires" of `slashing` module is that if multiple infractions 4. Evidence for Infraction 1 reaches state machine (and validator is put in jail) 5. Evidence for Infraction 2 reaches state machine 6. Evidence for Infraction 3 reaches state machine + Only Infraction 2 should have it's slash take effect, as it is the highest. This is done, so that in the case of the compromise of a validator's consensus key, they will only be punished once, even if the hacker double-signs many blocks. Because, the unjailing has to be done with the validator's account key, they have a chance to resecure their consensus key, and then signal that they are ready using their account key. We call this period during which we track only the max infraction, the "slashing period". Once, a validator rejoins by unjailing themselves, we begin a new slashing period; if they commit a new infraction after unjailing, it gets slashed cumulatively on top of the worst infraction from the previous slashing period. @@ -22,14 +23,13 @@ However, while infractions are grouped based off of the slashing periods, becaus 2. Validator A commits Infraction 2 (worth 40% slash) 3. Evidence for Infraction 1 reaches state machine (and Validator A is put in jail) 4. Validator A unjails + We are now in a new slashing period, however we still have to keep the door open for the previous infraction, as the evidence for Infraction 2 may still come in. As the number of slashing periods increase, it creates more complexity as we have to keep track of the highest infraction amount for every single slashing period. > Note: Currently, according to the `slashing` module spec, a new slashing period is created everytime a validator is unbonded then rebonded. This should probably be changed to jailed/unjailed, as in the current system, let's say I compromised the key of the rank 100 validator, I could bond my own validator into and out of the validator set many times, in order to create as many slashing periods I want for the validator. Then I can create infractions for each of the slashing periods I created for the validator, allowing me to get them multiply slashed. I'm not sure if this is how it is implemented in the code, or is just a mistake in the spec. For the remainder of this, I will assume that we only start a new slashing period when a validator gets unjailed. The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period (i.e not have to track slashing periods). -Things get even messier, once we begin to add in delegations. TODO - Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults, are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose that instead of being put in a "jailed state" after evidence for a consensus safety fault, validators are instead put into a "tombstone state", which means the validator is kicked out of the validator set and not allowed to rejoin. All of the stake that was delegated to it is put into an unbonding period. The validator operator can create a new validator if they would like, preferably with a new consensus key (do we need to enforce this? No rational validator should reuse the same compromised key lol), but they have to "reearn" their delegations back. Doing this tombstone system and getting rid of the slashing period tracking, will make the `slashing` module way simpler, especially because we can remove the hooks between the `stake` and `slashing` modules. From 96c074c22b9d5fd938b52ab6b080194fb5f25e54 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Tue, 18 Dec 2018 00:45:05 +0800 Subject: [PATCH 05/23] asdf --- docs/spec/slashing/tombstone.md | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index a80ca11ee997..b7146e7b6125 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -16,6 +16,8 @@ One of the "design desires" of `slashing` module is that if multiple infractions Only Infraction 2 should have it's slash take effect, as it is the highest. This is done, so that in the case of the compromise of a validator's consensus key, they will only be punished once, even if the hacker double-signs many blocks. Because, the unjailing has to be done with the validator's account key, they have a chance to resecure their consensus key, and then signal that they are ready using their account key. We call this period during which we track only the max infraction, the "slashing period". +> Note: Currently, from what I can tell in the code, while the stake actively delegated to a validator follows the Max slash rule, but stake that is unbonding or redelegating away from a validator gets slashed cumulatively for consecutive faults. I think this may be a bug that needs to be fixed independently of whether the tombstone is accepted or not. + Once, a validator rejoins by unjailing themselves, we begin a new slashing period; if they commit a new infraction after unjailing, it gets slashed cumulatively on top of the worst infraction from the previous slashing period. However, while infractions are grouped based off of the slashing periods, because evidence can be submitted up to an `unbondingPeriod` after the infraction, we still have to allow for evidence to be submitted for previous slashing periods. For example, if the sequence of events is: @@ -34,7 +36,25 @@ Currently, in the jail period implementation, once a validator unjails, all of t Doing this tombstone system and getting rid of the slashing period tracking, will make the `slashing` module way simpler, especially because we can remove the hooks between the `stake` and `slashing` modules. -Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, no slashing period tracking is needed. Also, a liveness bug probably isn't so egregious that it mandates force unbonding all delegations, and so the current jail system is adequate. +> Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, no slashing period tracking is needed. Also, a liveness bug probably isn't so egregious that it mandates force unbonding all delegations, and so the current jail system is adequate. + + +## Further improvements / Related proposals: + +### Single slashing amount + +Another optimization that can be made is that if we assume that all ABCI faults for Tendermint consensus are slashed at the same level, we don't have to keep track of "max slash". Once an ABCI fault happens once, we don't have to worry about comparing potential future ones to find the max. + +I believe current Tendermint ABCI faults are primarily: +- Unjustified precommits (double signs) +- Signing a precommit when you're in unbonding phase (can be used to trick light clients) + +Do we want to punish these two faults at different levels? If not, we can enact the above change. Note: This change may make sense for current Tendermint consensus, but maybe not for a different consensus algorithm or future versions of Tendermint that may want to punish at different levels (for example, partial slashing). + +### Store infractions in state instead of iterating over unbonds/redelegations + +Currently, every time evidence of a new fault comes in, we currently iterate over all of the unbonds/redelegations away from a validator to see if the slash affects them or not. If it does, we decrease the "balance" of the `ubd` or `red`. However, as the number of unbonds or redelegations can be very high, this might be very expensive. Instead, we can store evidences for all infractions that happened in the last `Unbonding Period` in state, and then whenever a ubd or red hits maturity, it can check if it needs to be slashed by checking it against the last infraction that happened before they started unbonding/redelegating away. Because we only need to store the infractions from the last unbon + - - - From 5b9aa01fa029472d81596b0dff77124f57a6e66d Mon Sep 17 00:00:00 2001 From: Alexander Bezobchuk Date: Wed, 19 Dec 2018 15:01:15 +0800 Subject: [PATCH 06/23] Update docs/spec/slashing/tombstone.md Co-Authored-By: sunnya97 --- docs/spec/slashing/tombstone.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index b7146e7b6125..267e06041cc5 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -14,7 +14,7 @@ One of the "design desires" of `slashing` module is that if multiple infractions 5. Evidence for Infraction 2 reaches state machine 6. Evidence for Infraction 3 reaches state machine -Only Infraction 2 should have it's slash take effect, as it is the highest. This is done, so that in the case of the compromise of a validator's consensus key, they will only be punished once, even if the hacker double-signs many blocks. Because, the unjailing has to be done with the validator's account key, they have a chance to resecure their consensus key, and then signal that they are ready using their account key. We call this period during which we track only the max infraction, the "slashing period". +Only Infraction 2 should have it's slash take effect, as it is the highest. This is done, so that in the case of the compromise of a validator's consensus key, they will only be punished once, even if the hacker double-signs many blocks. Because, the unjailing has to be done with the validator's operator key, they have a chance to re-secure their consensus key, and then signal that they are ready using their operator key. We call this period during which we track only the max infraction, the "slashing period". > Note: Currently, from what I can tell in the code, while the stake actively delegated to a validator follows the Max slash rule, but stake that is unbonding or redelegating away from a validator gets slashed cumulatively for consecutive faults. I think this may be a bug that needs to be fixed independently of whether the tombstone is accepted or not. From 04c2e6ebbb311da19824c1e798b00f4c8be6e2ec Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Wed, 19 Dec 2018 16:06:45 +0800 Subject: [PATCH 07/23] fixed incorrect note --- docs/spec/slashing/tombstone.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index 267e06041cc5..8c0d211b476c 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -16,8 +16,6 @@ One of the "design desires" of `slashing` module is that if multiple infractions Only Infraction 2 should have it's slash take effect, as it is the highest. This is done, so that in the case of the compromise of a validator's consensus key, they will only be punished once, even if the hacker double-signs many blocks. Because, the unjailing has to be done with the validator's operator key, they have a chance to re-secure their consensus key, and then signal that they are ready using their operator key. We call this period during which we track only the max infraction, the "slashing period". -> Note: Currently, from what I can tell in the code, while the stake actively delegated to a validator follows the Max slash rule, but stake that is unbonding or redelegating away from a validator gets slashed cumulatively for consecutive faults. I think this may be a bug that needs to be fixed independently of whether the tombstone is accepted or not. - Once, a validator rejoins by unjailing themselves, we begin a new slashing period; if they commit a new infraction after unjailing, it gets slashed cumulatively on top of the worst infraction from the previous slashing period. However, while infractions are grouped based off of the slashing periods, because evidence can be submitted up to an `unbondingPeriod` after the infraction, we still have to allow for evidence to be submitted for previous slashing periods. For example, if the sequence of events is: From 4faf5fb09d9e4080b3eda9a22ab65f78fc402b1a Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Mon, 31 Dec 2018 19:36:17 -0800 Subject: [PATCH 08/23] comments --- docs/spec/slashing/tombstone.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index 8c0d211b476c..f37e7864cc90 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -43,11 +43,11 @@ Doing this tombstone system and getting rid of the slashing period tracking, wil Another optimization that can be made is that if we assume that all ABCI faults for Tendermint consensus are slashed at the same level, we don't have to keep track of "max slash". Once an ABCI fault happens once, we don't have to worry about comparing potential future ones to find the max. -I believe current Tendermint ABCI faults are primarily: +I believe current planned Tendermint ABCI faults are primarily: - Unjustified precommits (double signs) - Signing a precommit when you're in unbonding phase (can be used to trick light clients) -Do we want to punish these two faults at different levels? If not, we can enact the above change. Note: This change may make sense for current Tendermint consensus, but maybe not for a different consensus algorithm or future versions of Tendermint that may want to punish at different levels (for example, partial slashing). +At the moment, the second one is not implemented, but needs to be implemented soon, in order to make light client bisection safe. Do we want to punish these two faults at different levels? If not, we can enact the above change. Note: This change may make sense for current Tendermint consensus, but maybe not for a different consensus algorithm or future versions of Tendermint that may want to punish at different levels (for example, partial slashing). ### Store infractions in state instead of iterating over unbonds/redelegations From 3864df33b5562bb2430cc8c77b5b9247e6f3591b Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Wed, 2 Jan 2019 10:33:18 -0800 Subject: [PATCH 09/23] asdf --- docs/spec/slashing/tombstone.md | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index f37e7864cc90..1278f43dc0d1 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -26,16 +26,17 @@ However, while infractions are grouped based off of the slashing periods, becaus We are now in a new slashing period, however we still have to keep the door open for the previous infraction, as the evidence for Infraction 2 may still come in. As the number of slashing periods increase, it creates more complexity as we have to keep track of the highest infraction amount for every single slashing period. -> Note: Currently, according to the `slashing` module spec, a new slashing period is created everytime a validator is unbonded then rebonded. This should probably be changed to jailed/unjailed, as in the current system, let's say I compromised the key of the rank 100 validator, I could bond my own validator into and out of the validator set many times, in order to create as many slashing periods I want for the validator. Then I can create infractions for each of the slashing periods I created for the validator, allowing me to get them multiply slashed. I'm not sure if this is how it is implemented in the code, or is just a mistake in the spec. For the remainder of this, I will assume that we only start a new slashing period when a validator gets unjailed. +> Note: Currently, according to the `slashing` module spec, a new slashing period is created everytime a validator is unbonded then rebonded. This should probably be changed to jailed/unjailed. See issue #3205 for further details. For the remainder of this, I will assume that we only start a new slashing period when a validator gets unjailed. The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period (i.e not have to track slashing periods). Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults, are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose that instead of being put in a "jailed state" after evidence for a consensus safety fault, validators are instead put into a "tombstone state", which means the validator is kicked out of the validator set and not allowed to rejoin. All of the stake that was delegated to it is put into an unbonding period. The validator operator can create a new validator if they would like, preferably with a new consensus key (do we need to enforce this? No rational validator should reuse the same compromised key lol), but they have to "reearn" their delegations back. -Doing this tombstone system and getting rid of the slashing period tracking, will make the `slashing` module way simpler, especially because we can remove the hooks between the `stake` and `slashing` modules. +Given that in the current state machine, an unbonding validator cannot "rebond" and re-enter the validator set, the easiest way to accomplish the tombstone, is to just put the validator into the unbonding state. This way, the validator cannot re-enter the validator set. The validator operator can rejoin the validator set using a new Operator account, or wait for the unbonding to finish, and rebond with the same Operator Key. Either way, however, their old delegations do not carry over. -> Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, no slashing period tracking is needed. Also, a liveness bug probably isn't so egregious that it mandates force unbonding all delegations, and so the current jail system is adequate. +Doing this tombstone system and getting rid of the slashing period tracking, will make the `slashing` module way simpler, especially because we can remove all of the hooks defined in the `slashing` module. +> Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, no slashing period tracking is needed. Also, a liveness bug probably isn't so egregious that it mandates force unbonding all delegations, and so the current jail system is adequate. ## Further improvements / Related proposals: @@ -51,7 +52,11 @@ At the moment, the second one is not implemented, but needs to be implemented so ### Store infractions in state instead of iterating over unbonds/redelegations -Currently, every time evidence of a new fault comes in, we currently iterate over all of the unbonds/redelegations away from a validator to see if the slash affects them or not. If it does, we decrease the "balance" of the `ubd` or `red`. However, as the number of unbonds or redelegations can be very high, this might be very expensive. Instead, we can store evidences for all infractions that happened in the last `Unbonding Period` in state, and then whenever a ubd or red hits maturity, it can check if it needs to be slashed by checking it against the last infraction that happened before they started unbonding/redelegating away. Because we only need to store the infractions from the last unbon +Pending discussion on #3206. + + + + - - - - - +Given that these faults are both attributable byzantine faults, we will likely want to slash them equally, and thus we can enact the above change. Note: This change may make sense for current Tendermint consensus, but maybe not for a different consensus algorithm or future versions of Tendermint that may want to punish at different levels (for example, partial slashing). From 25afcae0f8414afe775cbad22fc8dfc02603b864 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Thu, 24 Jan 2019 15:46:44 -0800 Subject: [PATCH 17/23] addressed comments --- docs/spec/slashing/tombstone.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index b56347a55e21..d9cf9e0e66e4 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -4,7 +4,7 @@ In the current implementation of the `slashing` module, when the consensus engine informs the state machine of a validator's consensus fault, the validator is partially slashed, and put into a "jail period", a period of time in which they are not allowed to rejoin the validator set. However, because of the nature of consensus faults and ABCI, there can be a delay between an infraction occuring, and evidence of the infraction reaching the state machine (this is one of the primary reasons for the existence of the unbonding period). -> Note: The tombstone concept, only applies to byzantine faults reported over ABCI. For slashable offenses tracked by the state machine (such as liveness faults), as there is not a delay between infraction and slashing, no slashing period tracking is needed. A validator is immediately put into jail period, and they cannot commit another liveness fault until they unjail. +> Note: The tombstone concept, only applies to faults that have a delay between the infraction occuring and evidence reaching the state machine. For example, evidence of a validator double signing may take a while to reach the state machine due to unpredictable evidence gossip layer delays and the ability of validators to selectively reveal double-signatures (e.g. to infrequently-online light clients). Liveness slashing, on the other hand, is detected immediately as soon as the infraction occurs, and therefore no slashing period is needed. A validator is immediately put into jail period, and they cannot commit another liveness fault until they unjail. In the future, there may be other types of byznatine faults that have delays (for example, submitting evidence of an invalid proposal as a tranasction). When implemented, it will have to be decided whether these future types of byzantine faults will result in a tombstoning (and if not, the slash amounts will not be capped by a slashing period). In the current system design, once a validator is put in the jail for a consensus fault, after the `JailPeriod` they are allowed to send a transaction to `unjail` themselves, and thus rejoin the validator set. From a9f2c67c52d0ebeceba0b3a6917ffecbdb7335bd Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Mon, 28 Jan 2019 08:23:11 -0800 Subject: [PATCH 18/23] fix misspelling --- docs/spec/slashing/tombstone.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index d9cf9e0e66e4..5cf21a30c032 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -4,7 +4,7 @@ In the current implementation of the `slashing` module, when the consensus engine informs the state machine of a validator's consensus fault, the validator is partially slashed, and put into a "jail period", a period of time in which they are not allowed to rejoin the validator set. However, because of the nature of consensus faults and ABCI, there can be a delay between an infraction occuring, and evidence of the infraction reaching the state machine (this is one of the primary reasons for the existence of the unbonding period). -> Note: The tombstone concept, only applies to faults that have a delay between the infraction occuring and evidence reaching the state machine. For example, evidence of a validator double signing may take a while to reach the state machine due to unpredictable evidence gossip layer delays and the ability of validators to selectively reveal double-signatures (e.g. to infrequently-online light clients). Liveness slashing, on the other hand, is detected immediately as soon as the infraction occurs, and therefore no slashing period is needed. A validator is immediately put into jail period, and they cannot commit another liveness fault until they unjail. In the future, there may be other types of byznatine faults that have delays (for example, submitting evidence of an invalid proposal as a tranasction). When implemented, it will have to be decided whether these future types of byzantine faults will result in a tombstoning (and if not, the slash amounts will not be capped by a slashing period). +> Note: The tombstone concept, only applies to faults that have a delay between the infraction occuring and evidence reaching the state machine. For example, evidence of a validator double signing may take a while to reach the state machine due to unpredictable evidence gossip layer delays and the ability of validators to selectively reveal double-signatures (e.g. to infrequently-online light clients). Liveness slashing, on the other hand, is detected immediately as soon as the infraction occurs, and therefore no slashing period is needed. A validator is immediately put into jail period, and they cannot commit another liveness fault until they unjail. In the future, there may be other types of byzantine faults that have delays (for example, submitting evidence of an invalid proposal as a transaction). When implemented, it will have to be decided whether these future types of byzantine faults will result in a tombstoning (and if not, the slash amounts will not be capped by a slashing period). In the current system design, once a validator is put in the jail for a consensus fault, after the `JailPeriod` they are allowed to send a transaction to `unjail` themselves, and thus rejoin the validator set. From 331d84a6425101c54686f777f0ea234c45b8b061 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Mon, 28 Jan 2019 10:12:26 -0800 Subject: [PATCH 19/23] removed devtools-stamp --- devtools-stamp | 0 1 file changed, 0 insertions(+), 0 deletions(-) delete mode 100644 devtools-stamp diff --git a/devtools-stamp b/devtools-stamp deleted file mode 100644 index e69de29bb2d1..000000000000 From fba66602628c1de2ee35acddf92fdfdde864f2fc Mon Sep 17 00:00:00 2001 From: Christopher Goes Date: Mon, 28 Jan 2019 17:54:59 -0800 Subject: [PATCH 20/23] Apply @cwgoes suggestions from code review Co-Authored-By: sunnya97 --- docs/spec/slashing/tombstone.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index 5cf21a30c032..694f4e96c7f5 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -32,15 +32,15 @@ We are now in a new slashing period, however we still have to keep the door open The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period (i.e not have to track slashing periods). -Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose setting the "jail time" for a validator who commits a consensus safety fault, to `infite` (i.e. a tombstone state). This essentially kicks the validator out of the validator set and does not allow them to re-enter the validator set. All of their delegators (including the operator themselves) have to either unbond or redelegate away. The validator operator can create a new validator if they would like, with a new operator key and consensus key, but they have to "re-earn" their delegations back. To put the validator in the tombstone state, we set `DoubleSignJailEndTime` to `time.Unix(253402300800)`, the MAX time supported by Amino. +Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose setting the "jail time" for a validator who commits a consensus safety fault, to `infinite` (i.e. a tombstone state). This essentially kicks the validator out of the validator set and does not allow them to re-enter the validator set. All of their delegators (including the operator themselves) have to either unbond or redelegate away. The validator operator can create a new validator if they would like, with a new operator key and consensus key, but they have to "re-earn" their delegations back. To put the validator in the tombstone state, we set `DoubleSignJailEndTime` to `time.Unix(253402300800)`, the MAX time supported by Amino. -By implementing the tombstone system and getting rid of the slashing period tracking, will make the `slashing` module way simpler, especially because we can remove all of the hooks defined in the `slashing` module consumed by the `staking` module (the `slashing` module still consumes hooks defined in `staking`). +Implementing the tombstone system and getting rid of the slashing period tracking will make the `slashing` module way simpler, especially because we can remove all of the hooks defined in the `slashing` module consumed by the `staking` module (the `slashing` module still consumes hooks defined in `staking`). ## Further improvements / Related proposals: ### Single slashing amount -Another optimization that can be made is that if we assume that all ABCI faults for Tendermint consensus are slashed at the same level, we don't have to keep track of "max slash". Once an ABCI fault happens once, we don't have to worry about comparing potential future ones to find the max. +Another optimization that can be made is that if we assume that all ABCI faults for Tendermint consensus are slashed at the same level, we don't have to keep track of "max slash". Once an ABCI fault happens, we don't have to worry about comparing potential future ones to find the max. Currently the only Tendermint ABCI fault is: - Unjustified precommits (double signs) From 4fa183db96f6498542cdfadddd514f6b5902a695 Mon Sep 17 00:00:00 2001 From: Sunny Aggarwal Date: Mon, 28 Jan 2019 17:57:25 -0800 Subject: [PATCH 21/23] removed future improvements --- docs/spec/slashing/tombstone.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index 694f4e96c7f5..6112a7aadbf4 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -36,8 +36,6 @@ Currently, in the jail period implementation, once a validator unjails, all of t Implementing the tombstone system and getting rid of the slashing period tracking will make the `slashing` module way simpler, especially because we can remove all of the hooks defined in the `slashing` module consumed by the `staking` module (the `slashing` module still consumes hooks defined in `staking`). -## Further improvements / Related proposals: - ### Single slashing amount Another optimization that can be made is that if we assume that all ABCI faults for Tendermint consensus are slashed at the same level, we don't have to keep track of "max slash". Once an ABCI fault happens, we don't have to worry about comparing potential future ones to find the max. From 153190c2a360a8e8cdb0ab796ed118e54fa20360 Mon Sep 17 00:00:00 2001 From: Christopher Goes Date: Mon, 28 Jan 2019 18:04:47 -0800 Subject: [PATCH 22/23] Apply @cwgoes suggestions from code review Co-Authored-By: sunnya97 --- docs/spec/slashing/tombstone.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index 6112a7aadbf4..99a9b78b393f 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -32,7 +32,7 @@ We are now in a new slashing period, however we still have to keep the door open The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period (i.e not have to track slashing periods). -Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose setting the "jail time" for a validator who commits a consensus safety fault, to `infinite` (i.e. a tombstone state). This essentially kicks the validator out of the validator set and does not allow them to re-enter the validator set. All of their delegators (including the operator themselves) have to either unbond or redelegate away. The validator operator can create a new validator if they would like, with a new operator key and consensus key, but they have to "re-earn" their delegations back. To put the validator in the tombstone state, we set `DoubleSignJailEndTime` to `time.Unix(253402300800)`, the MAX time supported by Amino. +Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose setting the "jail time" for a validator who commits a consensus safety fault, to `infinite` (i.e. a tombstone state). This essentially kicks the validator out of the validator set and does not allow them to re-enter the validator set. All of their delegators (including the operator themselves) have to either unbond or redelegate away. The validator operator can create a new validator if they would like, with a new operator key and consensus key, but they have to "re-earn" their delegations back. To put the validator in the tombstone state, we set `DoubleSignJailEndTime` to `time.Unix(253402300800)`, the maximum time supported by Amino. Implementing the tombstone system and getting rid of the slashing period tracking will make the `slashing` module way simpler, especially because we can remove all of the hooks defined in the `slashing` module consumed by the `staking` module (the `slashing` module still consumes hooks defined in `staking`). @@ -43,7 +43,7 @@ Another optimization that can be made is that if we assume that all ABCI faults Currently the only Tendermint ABCI fault is: - Unjustified precommits (double signs) -In it currently planned to include the following fault in the near future: +It is currently planned to include the following fault in the near future: - Signing a precommit when you're in unbonding phase (needed to make light client bisection safe) Given that these faults are both attributable byzantine faults, we will likely want to slash them equally, and thus we can enact the above change. Note: This change may make sense for current Tendermint consensus, but maybe not for a different consensus algorithm or future versions of Tendermint that may want to punish at different levels (for example, partial slashing). From 78424da8a549371ce9c94d430e033128247bcb94 Mon Sep 17 00:00:00 2001 From: Aleksandr Bezobchuk Date: Tue, 29 Jan 2019 12:21:22 -0500 Subject: [PATCH 23/23] typo fixes and minor markdown restructure --- docs/spec/slashing/tombstone.md | 114 ++++++++++++++++++++++++++------ 1 file changed, 93 insertions(+), 21 deletions(-) diff --git a/docs/spec/slashing/tombstone.md b/docs/spec/slashing/tombstone.md index 99a9b78b393f..d6ba47d5f62f 100644 --- a/docs/spec/slashing/tombstone.md +++ b/docs/spec/slashing/tombstone.md @@ -2,48 +2,120 @@ ## Abstract -In the current implementation of the `slashing` module, when the consensus engine informs the state machine of a validator's consensus fault, the validator is partially slashed, and put into a "jail period", a period of time in which they are not allowed to rejoin the validator set. However, because of the nature of consensus faults and ABCI, there can be a delay between an infraction occuring, and evidence of the infraction reaching the state machine (this is one of the primary reasons for the existence of the unbonding period). +In the current implementation of the `slashing` module, when the consensus engine +informs the state machine of a validator's consensus fault, the validator is +partially slashed, and put into a "jail period", a period of time in which they +are not allowed to rejoin the validator set. However, because of the nature of +consensus faults and ABCI, there can be a delay between an infraction occurring, +and evidence of the infraction reaching the state machine (this is one of the +primary reasons for the existence of the unbonding period). + +> Note: The tombstone concept, only applies to faults that have a delay between +the infraction occurring and evidence reaching the state machine. For example, +evidence of a validator double signing may take a while to reach the state machine +due to unpredictable evidence gossip layer delays and the ability of validators to +selectively reveal double-signatures (e.g. to infrequently-online light clients). +Liveness slashing, on the other hand, is detected immediately as soon as the +infraction occurs, and therefore no slashing period is needed. A validator is +immediately put into jail period, and they cannot commit another liveness fault +until they unjail. In the future, there may be other types of byzantine faults +that have delays (for example, submitting evidence of an invalid proposal as a transaction). +When implemented, it will have to be decided whether these future types of +byzantine faults will result in a tombstoning (and if not, the slash amounts +will not be capped by a slashing period). + +In the current system design, once a validator is put in the jail for a consensus +fault, after the `JailPeriod` they are allowed to send a transaction to `unjail` +themselves, and thus rejoin the validator set. + +One of the "design desires" of the `slashing` module is that if multiple +infractions occur before evidence is executed (and a validator is put in jail), +they should only be punished for single worst infraction, but not cumulatively. +For example, if the sequence of events is: -> Note: The tombstone concept, only applies to faults that have a delay between the infraction occuring and evidence reaching the state machine. For example, evidence of a validator double signing may take a while to reach the state machine due to unpredictable evidence gossip layer delays and the ability of validators to selectively reveal double-signatures (e.g. to infrequently-online light clients). Liveness slashing, on the other hand, is detected immediately as soon as the infraction occurs, and therefore no slashing period is needed. A validator is immediately put into jail period, and they cannot commit another liveness fault until they unjail. In the future, there may be other types of byzantine faults that have delays (for example, submitting evidence of an invalid proposal as a transaction). When implemented, it will have to be decided whether these future types of byzantine faults will result in a tombstoning (and if not, the slash amounts will not be capped by a slashing period). - -In the current system design, once a validator is put in the jail for a consensus fault, after the `JailPeriod` they are allowed to send a transaction to `unjail` themselves, and thus rejoin the validator set. - -One of the "design desires" of the `slashing` module is that if multiple infractions occur before evidence is executed (and a validator is put in jail), they should only be punished for single worst infraction, but not cumulatively. For example, if the sequence of events is: 1. Validator A commits Infraction 1 (worth 30% slash) 2. Validator A commits Infraction 2 (worth 40% slash) 3. Validator A commits Infraction 3 (worth 35% slash) 4. Evidence for Infraction 1 reaches state machine (and validator is put in jail) 5. Evidence for Infraction 2 reaches state machine 6. Evidence for Infraction 3 reaches state machine - -Only Infraction 2 should have its slash take effect, as it is the highest. This is done, so that in the case of the compromise of a validator's consensus key, they will only be punished once, even if the hacker double-signs many blocks. Because, the unjailing has to be done with the validator's operator key, they have a chance to re-secure their consensus key, and then signal that they are ready using their operator key. We call this period during which we track only the max infraction, the "slashing period". -Once, a validator rejoins by unjailing themselves, we begin a new slashing period; if they commit a new infraction after unjailing, it gets slashed cumulatively on top of the worst infraction from the previous slashing period. +Only Infraction 2 should have its slash take effect, as it is the highest. This +is done, so that in the case of the compromise of a validator's consensus key, +they will only be punished once, even if the hacker double-signs many blocks. +Because, the unjailing has to be done with the validator's operator key, they +have a chance to re-secure their consensus key, and then signal that they are +ready using their operator key. We call this period during which we track only +the max infraction, the "slashing period". + +Once, a validator rejoins by unjailing themselves, we begin a new slashing period; +if they commit a new infraction after unjailing, it gets slashed cumulatively on +top of the worst infraction from the previous slashing period. + +However, while infractions are grouped based off of the slashing periods, because +evidence can be submitted up to an `unbondingPeriod` after the infraction, we +still have to allow for evidence to be submitted for previous slashing periods. +For example, if the sequence of events is: -However, while infractions are grouped based off of the slashing periods, because evidence can be submitted up to an `unbondingPeriod` after the infraction, we still have to allow for evidence to be submitted for previous slashing periods. For example, if the sequence of events is: 1. Validator A commits Infraction 1 (worth 30% slash) 2. Validator A commits Infraction 2 (worth 40% slash) 3. Evidence for Infraction 1 reaches state machine (and Validator A is put in jail) 4. Validator A unjails - -We are now in a new slashing period, however we still have to keep the door open for the previous infraction, as the evidence for Infraction 2 may still come in. As the number of slashing periods increase, it creates more complexity as we have to keep track of the highest infraction amount for every single slashing period. - -> Note: Currently, according to the `slashing` module spec, a new slashing period is created every time a validator is unbonded then rebonded. This should probably be changed to jailed/unjailed. See issue [#3205](https://github.com/cosmos/cosmos-sdk/issues/3205) for further details. For the remainder of this, I will assume that we only start a new slashing period when a validator gets unjailed. -The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks and 2 days, respectively. This means there could potentially be up to 11 slashing periods concurrently being tracked per validator. If we set the `JailPeriod >= UnbondingPeriod`, we only have to track 1 slashing period (i.e not have to track slashing periods). - -Currently, in the jail period implementation, once a validator unjails, all of their delegators who are delegated to them (haven't unbonded / redelegated away), stay with them. Given that consensus safety faults are so egregious (way more so than liveness faults), it is probably prudent to have delegators not "auto-rebond" to the validator. Thus, we propose setting the "jail time" for a validator who commits a consensus safety fault, to `infinite` (i.e. a tombstone state). This essentially kicks the validator out of the validator set and does not allow them to re-enter the validator set. All of their delegators (including the operator themselves) have to either unbond or redelegate away. The validator operator can create a new validator if they would like, with a new operator key and consensus key, but they have to "re-earn" their delegations back. To put the validator in the tombstone state, we set `DoubleSignJailEndTime` to `time.Unix(253402300800)`, the maximum time supported by Amino. - -Implementing the tombstone system and getting rid of the slashing period tracking will make the `slashing` module way simpler, especially because we can remove all of the hooks defined in the `slashing` module consumed by the `staking` module (the `slashing` module still consumes hooks defined in `staking`). +We are now in a new slashing period, however we still have to keep the door open +for the previous infraction, as the evidence for Infraction 2 may still come in. +As the number of slashing periods increase, it creates more complexity as we have +to keep track of the highest infraction amount for every single slashing period. + +> Note: Currently, according to the `slashing` module spec, a new slashing period +is created every time a validator is unbonded then rebonded. This should probably +be changed to jailed/unjailed. See issue [#3205](https://github.com/cosmos/cosmos-sdk/issues/3205) +for further details. For the remainder of this, I will assume that we only start +a new slashing period when a validator gets unjailed. + +The maximum number of slashing periods is the `len(UnbondingPeriod) / len(JailPeriod)`. +The current defaults in Gaia for the `UnbondingPeriod` and `JailPeriod` are 3 weeks +and 2 days, respectively. This means there could potentially be up to 11 slashing +periods concurrently being tracked per validator. If we set the `JailPeriod >= UnbondingPeriod`, +we only have to track 1 slashing period (i.e not have to track slashing periods). + +Currently, in the jail period implementation, once a validator unjails, all of +their delegators who are delegated to them (haven't unbonded / redelegated away), +stay with them. Given that consensus safety faults are so egregious +(way more so than liveness faults), it is probably prudent to have delegators not +"auto-rebond" to the validator. Thus, we propose setting the "jail time" for a +validator who commits a consensus safety fault, to `infinite` (i.e. a tombstone state). +This essentially kicks the validator out of the validator set and does not allow +them to re-enter the validator set. All of their delegators (including the operator themselves) +have to either unbond or redelegate away. The validator operator can create a new +validator if they would like, with a new operator key and consensus key, but they +have to "re-earn" their delegations back. To put the validator in the tombstone +state, we set `DoubleSignJailEndTime` to `time.Unix(253402300800)`, the maximum +time supported by Amino. + +Implementing the tombstone system and getting rid of the slashing period tracking +will make the `slashing` module way simpler, especially because we can remove all +of the hooks defined in the `slashing` module consumed by the `staking` module +(the `slashing` module still consumes hooks defined in `staking`). ### Single slashing amount -Another optimization that can be made is that if we assume that all ABCI faults for Tendermint consensus are slashed at the same level, we don't have to keep track of "max slash". Once an ABCI fault happens, we don't have to worry about comparing potential future ones to find the max. +Another optimization that can be made is that if we assume that all ABCI faults +for Tendermint consensus are slashed at the same level, we don't have to keep +track of "max slash". Once an ABCI fault happens, we don't have to worry about +comparing potential future ones to find the max. Currently the only Tendermint ABCI fault is: + - Unjustified precommits (double signs) It is currently planned to include the following fault in the near future: + - Signing a precommit when you're in unbonding phase (needed to make light client bisection safe) -Given that these faults are both attributable byzantine faults, we will likely want to slash them equally, and thus we can enact the above change. Note: This change may make sense for current Tendermint consensus, but maybe not for a different consensus algorithm or future versions of Tendermint that may want to punish at different levels (for example, partial slashing). +Given that these faults are both attributable byzantine faults, we will likely +want to slash them equally, and thus we can enact the above change. + +> Note: This change may make sense for current Tendermint consensus, but maybe +not for a different consensus algorithm or future versions of Tendermint that +may want to punish at different levels (for example, partial slashing).