Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK Errors UX #14070

Closed
alexanderbez opened this issue Nov 29, 2022 · 25 comments
Closed

SDK Errors UX #14070

alexanderbez opened this issue Nov 29, 2022 · 25 comments
Labels
C:errors T: Dev UX UX for SDK developers (i.e. how to call our code)

Comments

@alexanderbez
Copy link
Contributor

alexanderbez commented Nov 29, 2022

Context

Error registration in modules has proven to be pretty cumbersome, e.g. needing to avoid certain integers.

Proposal

There are two proposals floating around:

  1. Use gRPC error codes for all errors in modules.
  2. Allow modules to bypass registration completely and simply define a codespace and their own errors as needed.

Specifically, a module would define it's own codespace (this can even be simply inferred from the store key). The question is, does a module define it's own errors or does it just simply use the gRPC ones?

/cc @tac0turtle @aaronc

@alexanderbez alexanderbez added T: Dev UX UX for SDK developers (i.e. how to call our code) C:errors labels Nov 29, 2022
@ethanfrey
Copy link
Contributor

Error registration in modules has proven to be pretty cumbersome, e.g. needing to avoid certain integers.

This is kind of needed to preserve a unique mapping of (codespace, code) which was supposed be meaningful in the app. As long as we maintain this restriction, I don't see that much we can simplify. However...

The whole error code / code space idea was a design from tendermint that pre-dated me and the cosmos sdk. Tendermint wanted unique error codes for all failures. In practice, no one ever uses them and the produce a lot of burden on app developers. People heavily rely on the error messages, but the error codes can be broken into 3 groups:

  • out of gas error
  • other panic
  • application error

The first two are handled in ante handlers. We could simply let the application return normal Go errors (errors or github.com/pkg/errors) and the ante handler assigns them the same error code (eg 501) and a fixed code space. We just use the string message in the logs to return to the users as we do now.

This would be a consensus breaking change but I challenge you to find one client that checks the error codes beyond code == 0. We could even do this as part of 0.47 and simplify a lot. Keep the sdk/errors package as is and people can use it still (non-breaking) but any new module can just use simpler errors and refactor existing ones there slowly.

@ethanfrey
Copy link
Contributor

Specifically, a module would define it's own codespace

I think this is how it is done now. Or at least wasmd does this since ages. Actually, staking does the same.

Yes, we have to skip 0 and 1 and then add a few lines to create a list of all error codes. It is a bit annoying but I don't see much simplification as long as we want meaningful code ids. As I said above, I think there was a lot of demand on providing these abc error codes and no usage. Happy to see them go.

@fedekunze
Copy link
Collaborator

This is kind of needed to preserve a unique mapping of (codespace, code) which was supposed be meaningful in the app.

I'm unaware of any application that uses the codespace or inspects it for anything. I'd propose removing them altogether since users only care about the error message

@aaronc
Copy link
Member

aaronc commented Nov 29, 2022

I don't think we should remove the ability to register errors because that would break too much code, but we can point users away from it.

I think providing a good set of common errors in cosmossdk.io/errors and/or supporting out of the box the gRPC status codes are good approaches. Maybe we do both.

The gRPC status codes are already in common usage when people implement query handlers so if they're reusable elsewhere it's one API to learn. And they're also pretty well thought out IMHO (for instance RESOURCE_EXHAUSTED can be used for gas).

But having reusable SDK specific errors would also be good. I just think they should be more well thought out and minimal than all the ones currently registered in types/errors.

@ValarDragon
Copy link
Contributor

I insist on all new Osmosis code that we never register errors, because it is an enormous overhead / barrier to safe patch releases.

I advocate for removing the sdk error registration entirely for now, and only revisit it if someone is going to make a plan to ensure clients can actually use them (and gather what they really care about being merkelized)

@alexanderbez
Copy link
Contributor Author

alexanderbez commented Nov 29, 2022

I insist on all new Osmosis code that we never register errors, because it is an enormous overhead / barrier to safe patch releases.

This <3

To re-iterate, no one really uses codespaces or codes for that matter...maybe they use codespaces, I'm not sure. But codes are really meaningless. So that's why I proposed we remove registration and just rely on good messaging and clear codespaces as we do now. Removing the unnecessary registration process.

@aaronc
Copy link
Member

aaronc commented Nov 29, 2022

just rely on good messaging and clear codespaces as we do now. Removing the unnecessary registration process.

What do codespaces get us?

@tac0turtle
Copy link
Member

Knowing where the error came from. Like which module or package. Otherwise a invalid logic could mean 20 different things

@minxylynx
Copy link

I use codespaces! There are certain codes/spaces combos that denote no fee was taken. These are used in the earlier set of blocks, but still useful if theres a very specific scenario.

@ethanfrey
Copy link
Contributor

There are certain codes/spaces combos that denote no fee was taken.

I assume these are from sdk/auth module? Can you list the ones you care about?

I think if we have even a dozen cases people care about, we can keep those, but in general when writing new modules, they just return generic error that has codespace somehow (how to do that in least invasive way) but not a specific code per error.

@tac0turtle
Copy link
Member

What I'm getting from this discussion is we want code spaces, which could be inferred from the module name, if there is a need for more granularity the application developer can register another code space (staking/delegation, etc..).

Error codes are not used fully because there is not a design flow we are aware that uses them. We should remove error codes and simply return a default number.

@aaronc
Copy link
Member

aaronc commented Nov 30, 2022

What will be the API for just using the code space?

@adizere
Copy link
Contributor

adizere commented Nov 30, 2022

From a relayer perspective (i.e., a client for the SDK/app) we typically match on error messages. Examples:

In general, dealing with the above error messages is very brittle and feels ad-hoc. In an ideal world, messages would be typed and encoded in a .proto file so we can consume/match on them in some automated way instead of string matching. But I guess a tradeoff of that approach is code evolving slower.

@alexanderbez
Copy link
Contributor Author

  • Typed errors!
  • Proper codespaces (already used)
  • Remove need for registration
  • Remove the need for error codes?

@ethanfrey
Copy link
Contributor

ethanfrey commented Nov 30, 2022

@adizere Why didn't you use the existing error codes?

There are codes for fee issues and for sequence mismatch.

If you don't currently use codes to detect this, it is a bit of a proof that they don't help client.

Here is a full list

@minxylynx
Copy link

@ethanfrey The ones I watch for are sdk/8 and sdk/32

@tac0turtle
Copy link
Member

@adizere Why didn't you use the existing error codes?

There are codes for fee issues and for sequence mismatch.

If you don't currently use codes to detect this, it is a bit of a proof that they don't help client.

Here is a full list

heavily agree here, you should not be matching strings for errors.

@adizere
Copy link
Contributor

adizere commented Dec 1, 2022

@adizere Why didn't you use the existing error codes?

Two reasons:

  1. We tried using error codes and had at least in one case bite us back, so we eliminated matching on the error code

    • the specific case is documented here, see the comment:

    Gaia v6.0.1 (SDK 0.44.5) returns codeInvalidArgument, whereas gaia v6.0.4
    // (SDK 0.44.6, and potentially others) returns code Unknown.
    // Workaround by matching strictly on the status message.

  2. The team has not had much SDK experience (this was especially the case 1-2 years ago, but still applies today), so using a piece of input from SDK (error codes) in a way in which we didn't fully understand implications and dependencies seemed unwise

    • one way to think of this is we were applying the unix/robustness principle of being "liberal on what we accept" from SDK and make the least assumptions we can make

@tac0turtle
Copy link
Member

My proposal here is:

  • remove the need to register errors to wrap them with a module name.
  • error codes defaults to a integer that represents a module error if the error code is not already set by the module.
  • modules can still register with custom error codes, but it's not needed.
  • message server and query server wraps the returned error with the module name.

This simplifies and possibly removes most of the custom errors in modules

@minxylynx
Copy link

@tac0turtle Would the error codes be unique across the code? or only unique within the module?

@alexanderbez
Copy link
Contributor Author

error codes can only be guaranteed to be unique within the module

@ethanfrey
Copy link
Contributor

ethanfrey commented Dec 19, 2022

@adizere

I am trying to understand your comment here

You say there is a change of error code for a given action between sdk 0.44.5 and 0.44.6?
Such a change is consensus breaking (code is committed to app hash, unlike the string).
Seems like something funky was going on somewhere...

@robert-zaremba
Copy link
Collaborator

I quickly checked the commits between 0.44.5 and 0.44.6 and there was no new error defined nor updated.
BTW, 0.44 is EOL (that being said, critical bugs should be pointed).

@tac0turtle
Copy link
Member

@tac0turtle Would the error codes be unique across the code? or only unique within the module?

I was thinking the error code is only the module, not specific to an error as this is unneeded information in consensus

@tac0turtle
Copy link
Member

we worked on errorsv2 that removed a lot of functionality in the errors pkg and made it zerodeps. users dont need to register errors anymore and i can see the sdk moving away from using sdk errors going forward

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C:errors T: Dev UX UX for SDK developers (i.e. how to call our code)
Projects
None yet
Development

No branches or pull requests

9 participants