Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't perform software upgrade #8885

Closed
4 tasks
colin-axner opened this issue Mar 15, 2021 · 6 comments
Closed
4 tasks

Can't perform software upgrade #8885

colin-axner opened this issue Mar 15, 2021 · 6 comments
Labels

Comments

@colin-axner
Copy link
Contributor

colin-axner commented Mar 15, 2021

Summary of Bug

Failing to do a standard SoftwareUpgrade with x/upgrade. The stores appear to lack migration

Version

Upgrading from gaia v4.1.0 to SDK commit 24ed401c81e881ee85e629da824bd407fe0b6b6d using branch colin/ibc-testing on gaia

Steps to Reproduce

  1. Create a chain with v4.1.0
  2. Create passing software upgrade proposal
  3. Stop node on panic from upgrade
  4. Install colin/ibc-testing branch with an additional no-op upgrade handler
  5. restart node

I get this error with the above steps:

"level":"error","module":"x/distribution","time":"2021-03-15T13:01:43Z","message":"WARNING: Attempt to allocate proposer rewards to unknown proposer cosmosvalcons1wmw49cu3y38ktp9l7vj895ngsld0stv6773xpj. This should happen only if the proposer unbonded completely within a single block, which generally should not happen except in exceptional circumstances (or fuzz testing). We recommend you investigate immediately."}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x150439a]

goroutine 1 [running]:
github.com/cosmos/cosmos-sdk/x/distribution/keeper.Keeper.AllocateTokensToValidator(0x21b8320, 0xc000f5bb40, 0x21f03a0, 0xc000f5a2d0, 0x21f03a0, 0xc000f5a2d0, 0xc000d66068, 0x21b8320, 0xc000f5bb70, 0x21b83a0, ...)
    github.com/cosmos/[email protected]/x/distribution/keeper/allocation.go:105 +0x3a
github.com/cosmos/cosmos-sdk/x/distribution/keeper.Keeper.AllocateTokens(0x21b8320, 0xc000f5bb40, 0x21f03a0, 0xc000f5a2d0, 0x21f03a0, 0xc000f5a2d

but I also noticed if I swap the binary without x/upgrade, I also get some other random error related to nil values.

I suspect I need to run some in-place migrations or something. Upgrading to the same version I am using is fine (as expected).

Here is my no-op:

app.UpgradeKeeper.SetUpgradeHandler("non-ibc-upgrade", func(ctx sdk.Context, plan upgradetypes.Plan) {})

Questions

  • Is this expected behaviour for the next release?
  • If so, what additional changes to the gaia app.go do I need to make? I know this would likely be covered by migration docs, but I'd love to continue testing upgrades

cc @AmauryM @aaronc


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@alessio alessio added the T:Bug label Mar 16, 2021
@alexanderbez
Copy link
Contributor

@AmauryM, @ethanfrey stated you might have some thoughts here? I'm not sure this is necessarily a bug, but perhaps an invalid upgrade?

if you move from 0.41 to 0.42 you need to migrate all the modules.

@ethanfrey
Copy link
Contributor

ethanfrey commented Mar 16, 2021

There were heavy changes to the on-disk data format in 0.42. Notably, all addresses changed encoding to support variable lengths.

@AmauryM added some inline migrations for the various modules in #8485 and #8504

You must make sure to execute all of these migrations when doing the upgrade.

@ethanfrey
Copy link
Contributor

The "bug" here is most likely missing documentation. (But maybe actually failing code as well).
Looking at https://github.com/cosmos/cosmos-sdk/issues?q=is%3Aopen+is%3Aissue+label%3Adocs I think there is plenty left to do still from he 0.40.0/0.41.0 releases.

Maybe the core team can do a docs-only sprint to trim down this backlog. No one likes writing docs, but it is hard for all users without them,

@amaury1093
Copy link
Contributor

The last piece implementing ADR-041 (#8743) has not yet been merged in master, so x/upgrade not working properly yet is expected.

The current plan is:

@ethanfrey
Copy link
Contributor

ethanfrey commented Mar 16, 2021

As a side note, I think I spend too much time reading discord, so I know all this even when not developing it.

However, it does seem the only way to stay on top of the documentation and needed changes for upgrades.

@colin-axner
Copy link
Contributor Author

thanks for the response @AmauryM. Sorry I jumped the gun a little bit. I'll wait until the pr is merged and some basic docs are up and then rerun my test scenarios as well. Please cc me on a pr which adds docs

Closing this issue since I can just track #8743

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants