Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Epoch hooks, panic recovery #1308

Merged
merged 5 commits into from
Apr 21, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* [#1107](https://github.com/osmosis-labs/osmosis/pull/1107) Update to wasmvm v0.24.0, re-enabling building on M1 macs!

### Minor improvements & Bug Fixes
ValarDragon marked this conversation as resolved.
Show resolved Hide resolved

* [#1308](https://github.com/osmosis-labs/osmosis/pull/1308) Make panics inside of epochs no longer chain halt by default.
* [#1286](https://github.com/osmosis-labs/osmosis/pull/1286) Fix release build scripts.
* [#1203](https://github.com/osmosis-labs/osmosis/pull/1203) cleanup Makefile and ci workflows
* [#1177](https://github.com/osmosis-labs/osmosis/pull/1177) upgrade to go 1.18
Expand Down
26 changes: 24 additions & 2 deletions osmoutils/cache_ctx.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@ package osmoutils
import (
"errors"
"fmt"
"runtime"
"runtime/debug"

sdk "github.com/cosmos/cosmos-sdk/types"
)
Expand All @@ -14,8 +16,8 @@ import (
func ApplyFuncIfNoError(ctx sdk.Context, f func(ctx sdk.Context) error) (err error) {
// Add a panic safeguard
defer func() {
if recovErr := recover(); recovErr != nil {
fmt.Println(recovErr)
if recoveryError := recover(); recoveryError != nil {
PrintPanicRecoveryError(ctx, recoveryError)
err = errors.New("panic occurred during execution")
}
}()
Expand All @@ -30,3 +32,23 @@ func ApplyFuncIfNoError(ctx sdk.Context, f func(ctx sdk.Context) error) (err err
}
return err
}

// PrintPanicRecoveryError error logs the recoveryError, along with the stacktrace, if it can be parsed.
// If not emits them to stdout.
func PrintPanicRecoveryError(ctx sdk.Context, recoveryError interface{}) {
errStackTrace := string(debug.Stack())
switch e := recoveryError.(type) {
case string:
ctx.Logger().Error("Recovering from (string) panic: " + e)
case runtime.Error:
ctx.Logger().Error("recovered (runtime.Error) panic: " + e.Error())
case error:
ctx.Logger().Error("recovered (error) panic: " + e.Error())
default:
ctx.Logger().Error("recovered (default) panic. Could not capture logs in ctx, see stdout")
fmt.Println("Recovering from panic ", recoveryError)
debug.PrintStack()
return
}
ctx.Logger().Error("stack trace: " + errStackTrace)
}
9 changes: 8 additions & 1 deletion x/epochs/spec/05_hooks.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,4 +16,11 @@ order: 5

On hook receiver function of other modules, they need to filter `epochIdentifier` and only do executions for only specific epochIdentifier.
Filtering epochIdentifier could be in `Params` of other modules so that they can be modified by governance.
Governance can change epoch from `week` to `day` as their need.
Governance can change epoch from `week` to `day` as their need.

## Panic isolation

If a given epoch hook panics, its state update is reverted, but we keep proceeding through the remaining hooks.
This allows more advanced epoch logic to be used, without concern over state machine halting, or halting subsequent modules.

This does mean that if there is behavior you expect from a prior epoch hook, and that epoch hook reverted, your hook may also have an issue. So do keep in mind "what if a prior hook didn't get executed" in the safety checks you consider for a new epoch hook.
22 changes: 20 additions & 2 deletions x/epochs/types/hooks.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@ package types

import (
sdk "github.com/cosmos/cosmos-sdk/types"

"github.com/osmosis-labs/osmosis/v7/osmoutils"
)

type EpochHooks interface {
Expand All @@ -23,13 +25,29 @@ func NewMultiEpochHooks(hooks ...EpochHooks) MultiEpochHooks {
// AfterEpochEnd is called when epoch is going to be ended, epochNumber is the number of epoch that is ending.
func (h MultiEpochHooks) AfterEpochEnd(ctx sdk.Context, epochIdentifier string, epochNumber int64) {
for i := range h {
h[i].AfterEpochEnd(ctx, epochIdentifier, epochNumber)
panicCatchingEpochHook(ctx, h[i].AfterEpochEnd, epochIdentifier, epochNumber)
}
}

// BeforeEpochStart is called when epoch is going to be started, epochNumber is the number of epoch that is starting.
func (h MultiEpochHooks) BeforeEpochStart(ctx sdk.Context, epochIdentifier string, epochNumber int64) {
for i := range h {
h[i].BeforeEpochStart(ctx, epochIdentifier, epochNumber)
panicCatchingEpochHook(ctx, h[i].BeforeEpochStart, epochIdentifier, epochNumber)
}
}

func panicCatchingEpochHook(
ctx sdk.Context,
hookFn func(ctx sdk.Context, epochIdentifier string, epochNumber int64),
epochIdentifier string,
epochNumber int64,
) {
defer func() {
if recovErr := recover(); recovErr != nil {
osmoutils.PrintPanicRecoveryError(ctx, recovErr)
}
}()
cacheCtx, write := ctx.CacheContext()
hookFn(cacheCtx, epochIdentifier, epochNumber)
write()
}
96 changes: 96 additions & 0 deletions x/epochs/types/hooks_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,96 @@
package types_test

import (
"testing"

sdk "github.com/cosmos/cosmos-sdk/types"
"github.com/stretchr/testify/suite"

"github.com/osmosis-labs/osmosis/v7/app/apptesting"
"github.com/osmosis-labs/osmosis/v7/x/epochs/types"
)

type KeeperTestSuite struct {
apptesting.KeeperTestHelper

queryClient types.QueryClient
}

func TestKeeperTestSuite(t *testing.T) {
suite.Run(t, new(KeeperTestSuite))
}

func (suite *KeeperTestSuite) SetupTest() {
suite.Setup()

suite.queryClient = types.NewQueryClient(suite.QueryHelper)
}

// dummyEpochHook is a struct satisfying the epoch hook interface,
// that maintains a counter for how many times its been succesfully called,
// and a boolean for whether it should panic during its execution.
type dummyEpochHook struct {
successCounter int
shouldPanic bool
}

func (hook *dummyEpochHook) AfterEpochEnd(ctx sdk.Context, epochIdentifier string, epochNumber int64) {
if hook.shouldPanic {
panic("dummyEpochHook is panicking")
}
hook.successCounter += 1
}

func (hook *dummyEpochHook) BeforeEpochStart(ctx sdk.Context, epochIdentifier string, epochNumber int64) {
if hook.shouldPanic {
panic("dummyEpochHook is panicking")
}
hook.successCounter += 1
}

func (hook *dummyEpochHook) Clone() *dummyEpochHook {
newHook := dummyEpochHook{shouldPanic: hook.shouldPanic, successCounter: hook.successCounter}
return &newHook
}

var _ types.EpochHooks = &dummyEpochHook{}

func (suite *KeeperTestSuite) TestHooksPanicRecovery() {
panicHook := dummyEpochHook{shouldPanic: true}
noPanicHook := dummyEpochHook{shouldPanic: false}
simpleHooks := []dummyEpochHook{panicHook, noPanicHook}

tests := []struct {
hooks []dummyEpochHook
expectedCounterValues []int
}{
{[]dummyEpochHook{noPanicHook}, []int{1}},
{simpleHooks, []int{0, 1}},
}

for tcIndex, tc := range tests {
for epochActionSelector := 0; epochActionSelector < 2; epochActionSelector++ {
suite.SetupTest()
hookRefs := []types.EpochHooks{}

for _, hook := range tc.hooks {
hookRefs = append(hookRefs, hook.Clone())
}

hooks := types.NewMultiEpochHooks(hookRefs...)
suite.NotPanics(func() {
if epochActionSelector == 0 {
hooks.BeforeEpochStart(suite.Ctx, "id", 0)
} else if epochActionSelector == 1 {
hooks.AfterEpochEnd(suite.Ctx, "id", 0)
}
})

for i := 0; i < len(hooks); i++ {
epochHook := hookRefs[i].(*dummyEpochHook)
suite.Require().Equal(tc.expectedCounterValues[i], epochHook.successCounter, "test case index %d", tcIndex)
}
}

}
}