ADR 028: Public Key Addresses #7086

Merged
merged 18 commits into from
Oct 21, 2020
Changes from 14 commits
2 changes: 2 additions & 0 deletions docs/architecture/README.md
@@ -54,3 +54,5 @@ Please add a entry below in your Pull Request for an ADR.
- [ADR 025: IBC Passive Channels](./adr-025-ibc-passive-channels.md)
- [ADR 026: IBC Client Recovery Mechanisms](./adr-026-ibc-client-recovery-mechanisms.md)
- [ADR 027: Deterministic Protobuf Serialization](./adr-027-deterministic-protobuf-serialization.md)

- [ADR 028: Public Key Addresses](./adr-028-public-key-addresses.md)
171 changes: 171 additions & 0 deletions docs/architecture/adr-028-public-key-addresses.md
@@ -0,0 +1,171 @@
# ADR 028: Public Key Addresses

## Changelog

- 2020/08/18: Initial version

## Status

Proposed

## Abstract

This ADR defines a canonical 20-byte address format for new public key algorithms, multisig public keys, and module
accounts using string prefixes and blake2b hashing.

## Context

Issue [\#3685](https://github.com/cosmos/cosmos-sdk/issues/3685) identified that public key
address spaces are currently overlapping. One initial proposal was extending the address length and
adding prefixes for different types of addresses.

@ethanfrey explained an alternate approach originally used in https://github.com/iov-one/weave:

> I spent quite a bit of time thinking about this issue while building weave... The other cosmos Sdk.

> Basically I define a condition to be a type and format as human readable string with some binary data appended. This condition is hashed into an Address (again at 20 bytes). The use of this prefix makes it impossible to find a preimage for a given address with a different condition (eg ed25519 vs secp256k1).

> This is explained in depth here https://weave.readthedocs.io/en/latest/design/permissions.html

> And the code is here, look mainly at the top where we process conditions. https://github.com/iov-one/weave/blob/master/conditions.go

And explained how this approach should be sufficiently collision resistant:
> Yeah, AFAIK, 20 bytes should be collision resistance when the preimages are unique and not malleable. A space of 2^160 would expect some collision to be likely around 2^80 elements (birthday paradox). And if you want to find a collision for some existing element in the database, it is still 2^160. 2^80 only is if all these elements are written to state.

> The good example you brought up was eg. a public key bytes being a valid public key on two algorithms supported by the codec. Meaning if either was broken, you would break accounts even if they were secured with the safer variant. This is only as the issue when no differentiating type info is present in the preimage (before hashing into an address).

> I would like to hear an argument if the 20 bytes space is an actual issue for security, as I would be happy to increase my address sizes in weave. I just figured cosmos and ethereum and bitcoin all use 20 bytes, it should be good enough. And the arguments above which made me feel it was secure. But I have not done a deeper analysis.

@robert-zaremba (Collaborator), Oct 2, 2020:

I don't think the prefix will solve the security problem here. I was reading #3685 and I'm not sure the solution solves the security problem. Here is my reasoning (maybe it's wrong):

Let's say we have two PK algorithms: A and B. For a user with a (pk1, sk1) key pair, we have two possible attacks:

  1. A becomes vulnerable. The attacker is able to create a valid signature without knowing sk1. In this case we don't solve anything with this proposal.
  2. The attacker is able to find a different key pair (pk2, sk2), possibly belonging to a different PK scheme, such that address(typ(pk2), pk2) == address(typ(pk1), pk1). Then he has basically broken the cryptographic hash function. The key type (and the prefix) is not important here, because the attacker has an algorithm for finding a pre-image.

Collaborator:

To be clear: I'm not saying that adding a prefix is bad. I'm not sure it solves anything. @ethanfrey?

Collaborator:

There is one more attack, which is in fact the important one here, and this update addresses it:

  1. Similar to (1.): A becomes vulnerable. The attacker is able to forge a signature for pk \in A. In all places where we don't store any relationship between addresses and the (PK, PK scheme) pair, the attacker will be able to spend the address's assets.

This proposal (including the scheme URL / name in the address algorithm) protects against the attack described in #3685 (isolating address spaces to protect against attacks when one of the PK schemes is broken).


In discussions in [\#5694](https://github.com/cosmos/cosmos-sdk/issues/5694), we agreed to go with an
approach similar to this where essentially we take the first 20 bytes of the `sha256` hash of
the key type concatenated with the key bytes, summarized as `Sha256(KeyTypePrefix || Keybytes)[:20]`.
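
As a rough illustration of the `Sha256(KeyTypePrefix || Keybytes)[:20]` idea, the sketch below hashes the same key bytes under two different type prefixes. The helper name `prefixedAddress` and the prefix strings are assumptions made for this example only; they are not taken from the SDK.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// prefixedAddress is a hypothetical helper that computes
// Sha256(KeyTypePrefix || KeyBytes)[:20].
func prefixedAddress(keyTypePrefix string, keyBytes []byte) []byte {
	// []byte(string) allocates a fresh slice, so appending is safe here.
	preImage := append([]byte(keyTypePrefix), keyBytes...)
	sum := sha256.Sum256(preImage) // [32]byte
	return sum[:20]
}

func main() {
	// The same 32 bytes interpreted as a key of two different types.
	keyBytes := make([]byte, 32)
	for i := range keyBytes {
		keyBytes[i] = byte(i)
	}

	a := prefixedAddress("ed25519", keyBytes)
	b := prefixedAddress("sr25519", keyBytes)

	fmt.Printf("ed25519: %x\n", a)
	fmt.Printf("sr25519: %x\n", b)
	fmt.Println("same address:", string(a) == string(b))
}
```

Because the type prefix is part of the pre-image, identical key bytes under two key types map to unrelated addresses, which is the address-space isolation discussed in the comments above.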

## Decision

### Legacy Public Key Addresses Don't Change

`secp256k1` and multisig public keys are currently in use in existing Cosmos SDK zones. They use the following
address formats:

- secp256k1: `ripemd160(sha256(pk_bytes))[:20]`
- legacy amino multisig: `sha256(aminoCdc.Marshal(pk))[:20]`

We don't want to change existing addresses, so the addresses for these two key types will remain the same.

The current multisig public keys use amino serialization to generate the address. We will retain
those public keys and their address formatting, and call them "legacy amino" multisig public keys
in protobuf. We will also create multisig public keys without amino addresses to be described below.


### Canonical Address Format

We have three types of accounts we would like to create addresses for in the future:
- regular public key addresses for new signature algorithms (ex. `sr25519`).
- public key addresses for multisig public keys that don't use amino encoding
- module accounts: basically any accounts which cannot sign transactions and
which are managed internally by modules

To address all of these use cases we propose the following basic `AddressHash` function,
based on the discussions in [\#5694](https://github.com/cosmos/cosmos-sdk/issues/5694):

```go
// blake2b is golang.org/x/crypto/blake2b.
func AddressHash(prefix string, contents []byte) []byte {
	preImage := []byte(prefix)
	if len(contents) != 0 {
		// the zero byte separates prefix from contents
		preImage = append(preImage, 0)
		preImage = append(preImage, contents...)
	}
	// Sum256 returns a [32]byte array, which must be assigned before slicing
	sum := blake2b.Sum256(preImage)
	return sum[:20]
}
```

Collaborator:

When implementing this we should use `make([]byte, correctSize)` and copy the content using `for` or `copy` (into the slice starting from index 2).

`AddressHash` always takes a string `prefix` as a starting point, which should represent the
type of public key (ex. `sr25519`) or module account being used (ex. `staking` or `group`).
For public keys, the `contents` parameter is used to specify the binary contents of the public
key. For module accounts, `contents` can be left empty (for modules which don't manage "sub-accounts"),
or can be some module-specific content to specify different pools (ex. `bonded` or `not-bonded` for `staking`)
or managed accounts (ex. different accounts managed by the `group` module).

In the `preImage`, the byte value `0` is used as the separator between `prefix` and `contents`. This is a logical
choice given that a `0` byte does not appear in the textual prefixes used here (protobuf message names and module names) and is commonly used as a null terminator.

We use a 256-bit `blake2b` hash instead of `sha256` because it is generally considered more secure. Blake hashes
are considered "random oracle indifferentiable", a stronger property which `sha256` does not have.

Member:

Random oracle indifferentiability is cryptographically irrelevant for constructing addresses.

https://blog.cryptographyengineering.com/2012/07/17/indifferentiability/

This is because the prefix of an address always specifies the length of the suffix, i.e. an sr25519 key is always 32 bytes. In the context of this protocol, an MD hash like sha256 is "random oracle indifferentiable".

I'm just saying this because I'm always going to strongly advocate for a really high bar for adding a new hash function to the Cosmos trusted computing base.

@robert-zaremba (Collaborator), Oct 9, 2020:

@zmanian I agree with your general approach. But we don't propose any exotic algorithm. Blake2 is a part of at least Go and Python3 stdlib and it's an improvement. Some cryptographers claim that sha256 is easier to break than blake2 because we already know how to break other hash functions from the MD family.

Contributor:

> Some cryptographers claim that sha256 is easier to break than blake2.

Some references would be appreciated 👍

Member Author:

> I'm just saying this because I'm always going to strongly advocate for a really high bar for adding a new hash function to the Cosmos trusted computing base.

So I want to summarize what I heard from @zmanian on our last call about this. Basically:

- there is no consensus on the replacement for SHA256 yet in the crypto community and there's a widespread sense that it won't be broken anytime soon
- we want to minimize what is needed for someone to implement "cosmos" on a broad array of devices including hardware wallets and embedded devices

Thus for now it's probably preferable to stick with SHA256. Is that accurate @zmanian?

@alessio any progress on getting the cryptographer at AiB to take a look at this?

Collaborator:

W3F Research / Polkadot is using blake2b and avoiding sha2.


### Canonical Public Key Address Prefixes

All public key types will have a unique protobuf message type such as:

```proto
package cosmos.crypto.sr25519;

message PubKey {
  bytes key = 1;
}
```

All protobuf messages have unique fully qualified names, in this example `cosmos.crypto.sr25519.PubKey`.
These names are derived directly from .proto files in a standardized way and used
in other places such as the type URL in `Any`s. Since there is an easy and obvious
way to get this name for every protobuf type, we can use this message name as the
key type `prefix` when creating addresses. For all basic public keys, `contents`
should just be the raw unencoded public key bytes.

Thus the canonical address for new public key types would be `AddressHash(proto.MessageName(pk), pk.Bytes)`.

### Multisig Addresses

For new multisig public keys, we define a custom address format not based on any encoding scheme
(amino or protobuf). This avoids issues with non-determinism in the encoding scheme. It also
ensures that multisig public keys which differ only in the ordering of their child keys have the same
address, because the child keys' addresses are sorted before hashing.

First we define a proto message for multisig public keys:
```proto
package cosmos.crypto.multisig;

message PubKey {
  uint32 threshold = 1;
  repeated google.protobuf.Any public_keys = 2;
}
```

We define the following `Address()` function for this public key:

```go
func (multisig PubKey) Address() []byte {
	// first gather all the addresses of each nested public key
	// (assumes each entry of PublicKeys has already been unpacked from its
	// Any into a value with an Address() []byte method)
	var addresses [][]byte
	for _, key := range multisig.PublicKeys {
		addresses = append(addresses, key.Address())
	}

	// then sort them in ascending byte order
	sort.Slice(addresses, func(i, j int) bool {
		return bytes.Compare(addresses[i], addresses[j]) < 0
	})

	// then concatenate them together
	var joinedAddresses []byte
	for _, addr := range addresses {
		joinedAddresses = append(joinedAddresses, addr...)
	}

	// form the string prefix from the message name (cosmos.crypto.multisig.PubKey) and the threshold joined together
	prefix := fmt.Sprintf("%s/%d", proto.MessageName(multisig), multisig.Threshold)

	// use the standard AddressHash function
	return AddressHash(prefix, joinedAddresses)
}
```

## Consequences

### Positive
- a simple algorithm for generating addresses for new public keys and module accounts

### Negative
- addresses do not communicate key type; a prefixed approach would have done this

Member:

Would it be helpful to try and put this info in bech32 encodings? Is access to this info really that important?

Member Author:

I don't know. You're saying that maybe we can have the bech32 have an extra prefix that isn't present in the actual address? Would this be something like `cosmossecp256k1sdgh3sghlsdsdg`? Or would the prefix get added before bech32 encoding?


### Neutral
- protobuf message names are used as key type prefixes

## References