perf: bech32 library, implement a decode algorithm that doesn't check the checksum #10025

ValarDragon · 2021-08-29T06:21:43Z

Summary

Now that we are using bech32 strings in several locations within the state machine (e.g. all the proto structs for staking), we should be mindful of efficiency. With the current implementation, the checksum check in bech32 decoding takes non-negligible time at scale, especially if you're doing a large operation over state (e.g. in epoch-based models).

Any bech32 address that makes it into state should be a correct bech32 address. Therefore, we do not need to check its checksum upon decoding. We should implement a DecodeAssumingValidChecksum method in our bech32 library.

Problem Definition

This feature improves the performance of converting between the 'native' data representation and bech32 strings for objects within the state machine.

The downside of this feature is that it may add some extra thinking to the relevant parts of the logic, for something that will not be high impact in the "standard" path of the SDK (e.g. not a large batch computation).

Proposal

In the bech32 library we currently use, add a DecodeAssumingValidChecksum method that simply omits the bech32verifychecksum call.
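
To make the proposal concrete, here is a minimal, self-contained sketch of a bech32 decoder with an optional checksum step. It is not the code of the library the SDK uses: the names decode, verifyChecksum, and polymod and the verify flag are assumptions made for illustration, and the input is assumed to be lower case already. The test strings are the BIP-173 vector "a12uel5l" and a copy with its last character altered.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// charset is the bech32 alphabet; a character's index is its 5-bit value.
const charset = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

// gen holds the generator constants of the bech32 checksum (BIP-173).
var gen = [5]uint32{0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3}

// polymod computes the bech32 checksum polynomial over 5-bit values.
func polymod(values []byte) uint32 {
	chk := uint32(1)
	for _, v := range values {
		top := chk >> 25
		chk = ((chk & 0x1ffffff) << 5) ^ uint32(v)
		for i := 0; i < 5; i++ {
			if (top>>uint(i))&1 == 1 {
				chk ^= gen[i]
			}
		}
	}
	return chk
}

// verifyChecksum expands the human-readable part, appends the data
// (including the 6 checksum symbols) and checks that polymod yields 1.
func verifyChecksum(hrp string, data []byte) bool {
	values := make([]byte, 0, 2*len(hrp)+1+len(data))
	for i := 0; i < len(hrp); i++ {
		values = append(values, hrp[i]>>5)
	}
	values = append(values, 0)
	for i := 0; i < len(hrp); i++ {
		values = append(values, hrp[i]&31)
	}
	values = append(values, data...)
	return polymod(values) == 1
}

// decode splits a lower-case bech32 string into its prefix and 5-bit data
// with the checksum symbols stripped. When verify is false the checksum is
// not checked, which is the behaviour proposed for strings read from state.
func decode(s string, verify bool) (string, []byte, error) {
	sep := strings.LastIndexByte(s, '1')
	if sep < 1 || sep+7 > len(s) {
		return "", nil, errors.New("invalid separator position")
	}
	hrp, rest := s[:sep], s[sep+1:]
	data := make([]byte, len(rest))
	for i := 0; i < len(rest); i++ {
		idx := strings.IndexByte(charset, rest[i])
		if idx < 0 {
			return "", nil, fmt.Errorf("invalid character %q", rest[i])
		}
		data[i] = byte(idx)
	}
	if verify && !verifyChecksum(hrp, data) {
		return "", nil, errors.New("invalid checksum")
	}
	return hrp, data[:len(data)-6], nil
}

func main() {
	// The second string has a corrupted checksum.
	for _, s := range []string{"a12uel5l", "a12uel5m"} {
		_, _, errChecked := decode(s, true)
		_, _, errUnchecked := decode(s, false)
		fmt.Printf("%s: checked err=%v, unchecked err=%v\n", s, errChecked, errUnchecked)
	}
}
```

The unchecked path accepts the corrupted string without complaint, which is why it is only appropriate for strings that were validated before they were written to state.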

For Admin Use

Not duplicate issue
Appropriate labels applied
Appropriate contributors tagged
Contributor assigned/self-assigned


conr2d · 2021-09-10T10:10:24Z

goos: darwin
goarch: amd64
pkg: github.com/cosmos/cosmos-sdk/types/bech32
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz

name	iteration	time
BenchmarkDecode-16	1506727	807.4 ns/op
BenchmarkDecodeAssumingValidChecksum-16	2391042	494.4 ns/op
BenchmarkDecodeUnsafe-16	2923123	410.4 ns/op

Tested three cases:

Decode: Normalization + ChecksumValidation
DecodeAssumingValidChecksum: Normalization
DecodeUnsafe: neither

Normalization here means checking the address and converting it to lower-case letters only. If we normalize an address when it is stored in the state machine, normalization can be omitted too.

robert-zaremba · 2021-09-28T15:28:12Z

DecodeAssumingValidChecksum. So now we need someone to update SDK to use that function when we deserialize things from state. Anything else?

ValarDragon · 2021-09-28T16:11:02Z

Yup! Also @conr2d can you post benchmarks with the memallocs as well? Add -benchmem to the benchmark command. (That was the bottleneck in our osmosis benchmarks.)

conr2d · 2021-09-30T05:45:27Z

@ValarDragon

In the original version (enigmampc/btcutil), memory allocation always happens multiple times due to string copies.
https://github.com/enigmampc/btcutil/blob/e2fb6adb2a25f868283c4f243c873ad6b4894974/bech32/bech32.go#L35-L36

After merging the latest version (btcsuite/btcutil), memory allocation only happens when a given bech32 string contains upper-case letters.
https://github.com/cosmos/btcutil/blob/a68c44d216624107f23b6c8e66704ff4ecee879a/bech32/bech32.go#L176-L179

Now all tests have 1 allocs/op.
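
The allocation behaviour described in this comment can be sketched as follows. This is a minimal illustration, not the code at the linked btcutil revision: the function name normalize is hypothetical. It returns the input string unchanged when it contains no upper-case letters, so a copy is made only for upper-case input, and it rejects mixed case as the bech32 specification (BIP-173) requires.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// normalize returns s in lower case. It scans the string once and only
// builds a new string when upper-case letters are present, so the common
// all-lower-case input is returned as-is without a copy.
func normalize(s string) (string, error) {
	hasLower, hasUpper := false, false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c >= 'a' && c <= 'z':
			hasLower = true
		case c >= 'A' && c <= 'Z':
			hasUpper = true
		}
	}
	if hasLower && hasUpper {
		return "", errors.New("string is mixed case")
	}
	if !hasUpper {
		return s, nil // already normalized, nothing to copy
	}
	return strings.ToLower(s), nil
}

func main() {
	for _, s := range []string{"a12uel5l", "A12UEL5L", "A12uel5l"} {
		out, err := normalize(s)
		fmt.Printf("%q -> %q, err=%v\n", s, out, err)
	}
}
```

If addresses are always stored in lower case, as suggested earlier in the thread, a decoder reading from state could skip even this scan.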

ValarDragon · 2021-09-30T13:38:27Z

That's amazing! Thanks

tac0turtle · 2021-11-16T10:28:54Z

@conr2d would you like to update the sdk?

conr2d · 2021-11-17T00:03:23Z

@marbar3778 Yes, if possible. Do you mean changing decoding address from state to unsafe decode?

tac0turtle · 2022-01-12T09:57:03Z

> @marbar3778 Yes, if possible. Do you mean changing decoding address from state to unsafe decode?

yes! would you make the required changes?

robert-zaremba · 2022-01-14T13:15:26Z

@conr2d fyi: https://github.com/cosmos/cosmos-sdk/blob/master/internal/conv/string.go#L10

conr2d · 2022-01-14T14:33:22Z

@marbar3778 @robert-zaremba OK, I will check it. Thank you.

conr2d · 2022-02-08T16:13:31Z

Hmm, I can add UnsafeDecodeAndConvert() to types/bech32/bech32.go, but its main usage is in types/address.go.

AccAddressFromBech32(), ValAddressFromBech32() and ConsAddressFromBech32() call GetFromBech32(), which in turn calls DecodeAndConvert().

We cannot know in advance whether a module will call AccAddressFromBech32() to convert a bech32 string from a message (untrusted source) or from the state (which must already have been verified before being stored). I guess we need two methods for each conversion, like AccAddressFromBech32() and UnsafeAccAddressFromBech32(), so that the module can decide which method to call. This refactoring also seems to be related to #10838.
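
A rough sketch of the two-method idea, assuming a shared helper that takes the decode function as a parameter. Everything here other than the names AccAddressFromBech32 and UnsafeAccAddressFromBech32 is hypothetical: the Decoder type, the AddressCodec struct, and the placeholder decoders in main stand in for whatever checked and unchecked decode functions the bech32 library provides, and the SDK's real functions are package-level rather than methods.

```go
package main

import (
	"errors"
	"fmt"
)

// Decoder is a stand-in for a bech32 decode function (hypothetical signature).
type Decoder func(bech string) (hrp string, payload []byte, err error)

// AddressCodec pairs a checksum-verifying decoder with a checksum-skipping one.
type AddressCodec struct {
	Prefix    string
	Checked   Decoder // verifies the checksum; for untrusted input such as messages
	Unchecked Decoder // skips the checksum; for strings validated before storage
}

// from holds the logic shared by both variants: decode, then check the prefix.
func (c AddressCodec) from(bech string, dec Decoder) ([]byte, error) {
	if bech == "" {
		return nil, errors.New("empty address string")
	}
	hrp, payload, err := dec(bech)
	if err != nil {
		return nil, err
	}
	if hrp != c.Prefix {
		return nil, fmt.Errorf("invalid prefix: got %q, want %q", hrp, c.Prefix)
	}
	return payload, nil
}

// AccAddressFromBech32 is the default, checksum-verifying conversion.
func (c AddressCodec) AccAddressFromBech32(bech string) ([]byte, error) {
	return c.from(bech, c.Checked)
}

// UnsafeAccAddressFromBech32 skips checksum verification; the caller asserts
// that the string came from state and was validated when it was stored.
func (c AddressCodec) UnsafeAccAddressFromBech32(bech string) ([]byte, error) {
	return c.from(bech, c.Unchecked)
}

func main() {
	// Placeholder decoders that only report which path ran.
	codec := AddressCodec{
		Prefix: "cosmos",
		Checked: func(s string) (string, []byte, error) {
			fmt.Println("checked decode of", s)
			return "cosmos", nil, nil
		},
		Unchecked: func(s string) (string, []byte, error) {
			fmt.Println("unchecked decode of", s)
			return "cosmos", nil, nil
		},
	}
	codec.AccAddressFromBech32("address-from-message")     // untrusted source
	codec.UnsafeAccAddressFromBech32("address-from-state") // read from the store
}
```

Keeping the checked variant under the existing name means callers that do nothing stay safe, and only call sites that knowingly read from state opt in to the unchecked path.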

robert-zaremba · 2022-03-29T10:41:14Z

In many places we know whether we deserialize an address from the store or from a request. So we can have functions like:

UnsafeAccAddressFromBech32
UnsafeValAddressFromBech32
...

BTW: there is always a dilemma, half of my brain would prefer to put Unsafe at the end to keep the prefix by object name (AccAddress...)

robert-zaremba · 2022-03-29T10:43:19Z

Personally I prefer to keep addresses as bytes in the store and not do any verification. That would require 2 types of structures: one for the store and one for the response. I think this is the right way to go, instead of constraining ourselves (in order to use the same struct for store and response) and making potentially bad decisions. But that's a separate topic ;)

elias-orijtech · 2023-05-12T19:18:20Z

What's the status of this issue?

tac0turtle · 2023-05-15T08:50:05Z

This was partially implemented. There is a fork, but the function to skip the checksum is not being used.

elias-orijtech · 2023-05-17T22:34:33Z

I shall close this as implemented, under the assumption that the optimized function is used on a per-case basis.

tac0turtle · 2023-05-19T10:08:52Z

This isn't completed; I don't believe we are using the function that doesn't check checksums. Want to work on this?

elias-orijtech · 2023-05-19T17:02:21Z

I would, but surely the checksum-skipping variant shall only be used where it matters for performance? My reasoning for closing this was that this is determined on a case-by-case basis.

tac0turtle · 2023-07-18T14:09:31Z

I have removed a few of these calls in staking, and we plan on removing more when we migrate state. For now we can close this.

ValarDragon added the T: Performance Performance improvements label Aug 30, 2021

This was referenced Sep 10, 2021

perf: Fork bech32 from btcsuite/btcutil (#10082) #10111

Closed

bech32: Add DecodeUnsafe for bypassing checksum validation cosmos/btcutil#1

Merged

conr2d mentioned this issue Sep 22, 2021

build(deps): Use self-maintained btcutil (#10082) #10201

Merged

19 tasks

daniel-farina mentioned this issue Nov 19, 2021

bech32 encoding performance. (Too many mallocs) osmosis-labs/feature-requests#14

Closed

tac0turtle added this to Cosmos SDK Maintenance Jan 12, 2022

tac0turtle moved this to Todo in Cosmos SDK Maintenance Jan 12, 2022

elias-orijtech closed this as completed May 17, 2023

github-project-automation bot moved this from Icebox to Done in Cosmos SDK Maintenance May 17, 2023

tac0turtle reopened this May 19, 2023

tac0turtle closed this as completed Jul 18, 2023
