Add topics for mempool, async behavior, ABCI, KVS, WAL to document #456

Merged
merged 6 commits into from
Sep 2, 2022
Changes from 3 commits
13 changes: 8 additions & 5 deletions docs/en/01-overview.md
@@ -28,7 +28,9 @@ Ostracon includes the Consensus and Networking layers of the three layers that c

![Layered Structure](../static/layered_structure.png)

Transactions that have not yet been incorporated into a block are shared among nodes by an anti-entropy mechanism (gossipping) in the Network layer called mempool. Here, the Network and Consensus layers consider transactions as simple binaries and don't care about the contents of the data.
Transactions that have not yet been incorporated into a block are shared among nodes by an anti-entropy mechanism (gossiping) in the Network layer called [mempool](03-tx-sharing.md). Here, the Network and Consensus layers treat transactions as opaque binary data and don't interpret their contents.

Ostracon's consensus state and generated blocks are stored in the State DB and Block DB, respectively. Because these stores emphasize fast random access keyed by block height, and the Block DB in particular is appended to frequently, Ostracon uses an embedded Key-Value store (KVS) based on LSMT (Log-Structured Merge Tree). The actual KVS implementation can be chosen from several options at build time.
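
As a rough illustration of what "keyed by block height" can mean in an ordered key-value store, the sketch below encodes heights as fixed-width big-endian integers so that byte-wise key order matches numeric order. The key prefix `B:`, the `ToyBlockDB` class, and the in-memory dict standing in for an LSM tree are all assumptions made for this example and are not Ostracon's actual storage layout.

```python
import struct


def height_key(prefix: bytes, height: int) -> bytes:
    """Encode a block height as prefix + 8-byte big-endian unsigned integer."""
    return prefix + struct.pack(">Q", height)


class ToyBlockDB:
    """In-memory stand-in for an embedded KVS (a dict replaces the LSM tree)."""

    def __init__(self):
        self._kv = {}

    def append(self, height, block):
        # Appending a block is a single put under its height key.
        self._kv[height_key(b"B:", height)] = block

    def get(self, height):
        # Random access by height is a single point lookup.
        return self._kv.get(height_key(b"B:", height))

    def scan(self, start, end):
        # Range scan over [start, end); sorting emulates the ordered iteration
        # that an LSM-tree store would provide natively.
        lo, hi = height_key(b"B:", start), height_key(b"B:", end)
        return [v for k, v in sorted(self._kv.items()) if lo <= k < hi]
```

Big-endian encoding is chosen here because a decimal string key would sort height 10 before height 2, which breaks range scans by height.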

## Specifications and Technology Stack

@@ -39,7 +41,7 @@ Transactions that have not yet been incorporated into a block are shared among n
| Agreement | Strong Consistency w/Finality | Tendermint-BFT |
| Signature | Elliptic Curve Cryptography | Ed25519, *BLS12-381*<sup>*1</sup> |
| Hash | SHA2 | SHA-256, SHA-512 |
| HSM | *N/A* | *No support for VRF or signature aggregation* |
| Key Management | Local KeyStore, Remote KMS | *HSM is not supported due to VRF or BLS* |
| Key Auth Protocol | Station-to-Station | |
| Tx Sharing Protocol | Gossiping | mempool |
| Application Protocol | ABCI | |
@@ -53,16 +55,17 @@ Transactions that have not yet been incorporated into a block are shared among n
## Ostracon Features

* [Extending Tendermint-BFT with VRF-based Election](02-consensus.md)
* [Transaction Sharing](03-tx-sharing.md)
* [BLS Signature Aggregation](03-signature-aggregation.md)

## Consideration with Other Consensus Schemes

What consensus schemes are used by other blockchain implementations? We went through a lot of comparison and consideration to determine the direction of Ostracon.

The **PoW** used by Bitcoin and Ethereum is the most well-known consensus mechanism for blockchain. It has a proven track record of working as a public chain but has a structural problem of not being able to guarantee consistency until a sufficient amount of time has passed. This would cause significant problems with lost updates in the short term, and the inability to scale performance in the long term. So we eliminated PoW in the early stages of our consideration.
The *PoW* used by Bitcoin and Ethereum is the most well-known consensus mechanism for blockchain. It has a proven track record of working as a public chain but has a structural problem of not being able to guarantee consistency until a sufficient amount of time has passed. This would cause significant problems with lost updates in the short term, and the inability to scale performance in the long term. So we eliminated PoW in the early stages of our consideration.

The consensus algorithm of Tendermint, **Tendermint-BFT**, is a well-considered design for blockchains. The ability to guarantee finality in a short period of time was also a good fit for our direction. On the other hand, the weighted round-robin algorithm used as the election algorithm works deterministically, so participants can know the future Proposer, which makes it easy to find the target and prepare an attack. For this reason, Ostracon uses VRF to make the election unpredictable in order to reduce the likelihood of an attack.
The consensus algorithm of Tendermint, *Tendermint-BFT*, is a well-considered design for blockchains. The ability to guarantee finality in a short period of time was also a good fit for our direction. On the other hand, the weighted round-robin algorithm used as the election algorithm works deterministically, so participants can know the future Proposer, which makes it easy to find the target and prepare an attack. For this reason, Ostracon uses VRF to make the election unpredictable in order to reduce the likelihood of an attack.

**Algorand** also uses VRF, but in a very different way than we do: at the start of an election, each node generates a VRF random number individually and identifies whether it's a winner of the next Validator or not (it's similar to all nodes tossing a coin at the same time). This is a better way to guarantee cryptographic security while saving a large amount of computation time and power consumption compared to the PoW method of identifying the winner by hash calculation. On the other hand, it's difficult to apply this scheme to our blockchain for several reasons: the number of Validators to be selected is non-deterministic and includes random behavior following a binomial distribution, the protocol complexity increases due to mutual recognition among the winning nodes, and it's impossible to find nodes that have been elected but have sabotaged their roles.
*Algorand* also uses VRF, but in a very different way than we do: at the start of an election, each node generates a VRF random number individually and identifies whether it's a winner of the next Validator or not (it's similar to all nodes tossing a coin at the same time). This is a better way to guarantee cryptographic security while saving a large amount of computation time and power consumption compared to the PoW method of identifying the winner by hash calculation. On the other hand, it's difficult to apply this scheme to our blockchain for several reasons: the number of Validators to be selected is non-deterministic and includes random behavior following a binomial distribution, the protocol complexity increases due to mutual recognition among the winning nodes, and it's impossible to find nodes that have been elected but have sabotaged their roles.

We have considered a number of other consensus mechanisms, but we believe that the current choice is the closest realistic choice for role election and agreement algorithms for P2P distributed systems. However, since Ostracon doesn't have a goal of experimental proofs or demonstrations for any particular research theory, we are ready to adopt better algorithms if they are proposed in the future.
10 changes: 9 additions & 1 deletion docs/en/02-consensus.md
@@ -85,6 +85,14 @@ In the Ostracon network, Validators mean candidate nodes that hold Stakes and ca

Voter selection uses a pseudo-random function $r$ to generate a sequence of random numbers in order to randomly select multiple nodes from a single VRF hash $t$. Since $t$ already has the properties of a cryptographic pseudo-random number, it's more important that $r$ is fast, simple to implement, leaves no room for differing interpretations, and saves memory. Ostracon uses a fast shift-register type pseudo-random number generation algorithm, called SplitMix64, for Voter selection.
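
To make the role of $r$ concrete, here is a minimal sketch of SplitMix64 expanding one seed into a sequence of draws. The SplitMix64 step uses the widely published constants. The `select_voters` function, its stake-weighted sampling with replacement, and the use of the first 8 bytes of the hash as the seed are assumptions made for illustration and are not Ostracon's actual selection procedure.

```python
from typing import Dict, List

MASK64 = (1 << 64) - 1  # Python ints are unbounded, so wrap to 64 bits by hand


class SplitMix64:
    """SplitMix64 generator; all arithmetic is reduced modulo 2**64."""

    def __init__(self, seed: int) -> None:
        self.state = seed & MASK64

    def next_u64(self) -> int:
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)


def select_voters(vrf_hash: bytes, stakes: Dict[str, int], count: int) -> List[str]:
    """Hypothetical helper: draw `count` stake-weighted samples seeded by the hash."""
    rng = SplitMix64(int.from_bytes(vrf_hash[:8], "big"))
    names = sorted(stakes)  # fixed order so every node walks the same list
    total = sum(stakes[name] for name in names)
    winners = []
    for _ in range(count):
        point = rng.next_u64() % total  # modulo bias is ignored in this sketch
        for name in names:
            if point < stakes[name]:
                winners.append(name)
                break
            point -= stakes[name]
    return winners
```

Because every step is plain integer arithmetic on a shared seed, any two nodes that hold the same $t$ and the same stake table derive the same list, which is the property the paragraph above asks of $r$.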

## Disciplinary Scheme for Failures
## Failures

### Disciplinary Scheme

Although Ostracon's consensus scheme works correctly even if a few nodes fail, it's ideal that failed nodes aren't selected for the consensus group in order to avoid wasting network and CPU resources. In particular, for cases that aren't caused by general asynchronous messaging problems, such as intentional malpractice, evidence of the behavior (whether malicious or not) will be shared and action will be taken to eliminate the candidate from the selection process by forfeiting the Stake.

### Write Ahead Log

In a system with such a disciplinary rule, it's important to have a mechanism to prevent nodes from causing unintended behavior; Ostracon saves all received messages in its WAL (Write Ahead Log), and when it recovers from a node failure, it can correctly apply processing after the last message it applied.

For more information on WAL, see [Tendermint | WAL](https://github.com/tendermint/tendermint/blob/v0.34.x/spec/consensus/wal.md).
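
The recovery idea can be sketched in a few lines: every message is made durable before it is applied, and after a crash only the messages beyond the last applied one are replayed. The `ToyWAL` class, the JSON-lines file format, and the `applied_count` parameter are assumptions made for this example; the real WAL format is described in the specification linked above.

```python
import json
import os
import tempfile


class ToyWAL:
    """Append-only log of messages, one JSON document per line."""

    def __init__(self, path):
        self.path = path

    def append(self, message):
        # Flush and fsync so the entry is on disk before the message is applied.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(message) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self, applied_count):
        """Return the logged messages that come after the first `applied_count`."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            entries = [json.loads(line) for line in f if line.strip()]
        return entries[applied_count:]


if __name__ == "__main__":
    wal = ToyWAL(os.path.join(tempfile.mkdtemp(), "wal.log"))
    for seq in range(3):
        wal.append({"seq": seq})
    # Pretend the node crashed after applying only the first message.
    print(wal.replay(applied_count=1))
```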
37 changes: 37 additions & 0 deletions docs/en/03-tx-sharing.md
@@ -0,0 +1,37 @@
---
title: Transaction Sharing
---

A client can send a transaction to any of the Ostracon nodes that have joined the blockchain network. The transaction propagates to other Ostracon nodes and is ultimately shared by all Ostracon nodes.

## Mempool

Once a block is accepted by the Ostracon consensus mechanism, the transactions contained in that block are considered *confirmed*. Unconfirmed transactions are stored in an area called the **mempool**, which is separate from the block storage, after passing validation such as signature checks.

Unconfirmed transactions stored in the mempool by an Ostracon node are broadcast to other Ostracon nodes.
However, if the transaction has already been received or is invalid, it's neither saved nor broadcast, but discarded.
This method is called *gossiping* (or flooding), and a transaction reaches all nodes in $O(\log N)$ hops,
where $N$ is the number of nodes in the Ostracon network.
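
The store-or-discard rule described above can be sketched as follows. `ToyNode`, the direct method calls standing in for network messages, and the `is_valid` callback are assumptions made for this example; a real node relays over P2P connections and tracks seen transactions with a bounded cache.

```python
import hashlib


class ToyNode:
    def __init__(self, name, is_valid):
        self.name = name
        self.is_valid = is_valid  # stand-in for signature and application checks
        self.peers = []
        self.mempool = {}         # tx hash -> raw tx bytes

    def receive(self, tx):
        key = hashlib.sha256(tx).digest()
        if key in self.mempool or not self.is_valid(tx):
            return False          # duplicate or invalid: discard, don't relay
        self.mempool[key] = tx
        for peer in self.peers:   # relay to every neighbour
            peer.receive(tx)
        return True


if __name__ == "__main__":
    nodes = [ToyNode(str(i), lambda tx: tx.startswith(b"ok")) for i in range(5)]
    for i, node in enumerate(nodes):
        node.peers = [nodes[(i + 1) % 5], nodes[(i - 1) % 5]]  # ring topology
    nodes[0].receive(b"ok:transfer")
    print([len(node.mempool) for node in nodes])
```

The duplicate check is what stops the flood: once every node has stored the transaction, each further relay is discarded on arrival.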

The Ostracon node selected as a Proposer by [leader election](02-consensus.md) generates new proposal blocks from transactions stored in the mempool.
The following figure shows the flow of an Ostracon node from receiving an unconfirmed transaction and storing it in the mempool until it's used to generate a block.

![Mempool in Ostracon structure](../static/tx-sharing/mempool.png)

## Performance and Asynchronization

Discussions of blockchain performance tend to focus on the speed of block generation, but in a practical system, the efficiency of sharing transactions among nodes is also an important factor that significantly affects overall performance.
In particular, Ostracon's mempool must process a large number of transactions in a short period, which is the price of gossiping's fast network propagation.

Ostracon adds several queues to Tendermint's mempool implementation to make its processing asynchronous.
This change allows large numbers of transactions to be stored in the mempool in a short period of time, improving the throughput of the blockchain network on nodes equipped with modern multi-core CPUs.

With this asynchronization of the mempool, multiple transactions can be in the *validation-processing* state at the same time. Ostracon refuses to receive transactions when the mempool capacity is exceeded, and transactions that are still being validated asynchronously are also correctly counted toward this capacity limit.
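
One way to count in-flight validations toward the limit is to reserve a slot before validation starts and release it when the result arrives. The sketch below assumes this reservation approach; `ToyMempool` and its method names are hypothetical and do not reflect Ostracon's actual implementation.

```python
import threading


class ToyMempool:
    def __init__(self, capacity):
        self.capacity = capacity
        self._lock = threading.Lock()
        self._stored = []        # transactions that passed validation
        self._validating = 0     # transactions whose validation is in flight

    def try_reserve(self):
        """Reserve a slot before validation starts; refuse when full."""
        with self._lock:
            if len(self._stored) + self._validating >= self.capacity:
                return False
            self._validating += 1
            return True

    def complete(self, tx, valid):
        """Release the reservation and keep the transaction only if valid."""
        with self._lock:
            self._validating -= 1
            if valid:
                self._stored.append(tx)

    def size(self):
        with self._lock:
            return len(self._stored) + self._validating
```

Without the reservation, a burst of concurrent arrivals could all pass the capacity check before any of them had been stored, and the mempool would overshoot its limit.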

## Tx Validation via ABCI

ABCI (Application Blockchain Interface) is a specification for applications to communicate with Ostracon and other tools either remotely (via gRPC or ABCI-Socket) or in-process.
For more information, see [Tendermint repository](https://github.com/tendermint/tendermint/tree/main/abci).

The process of validating unconfirmed transactions also queries the application layer via ABCI. This behavior allows the application to keep transactions that are essentially unnecessary (although correct from a data point of view) out of the block. Here, Ostracon replaces the Tendermint implementation with an asynchronous API that can start the validation process for the next transaction without waiting for ABCI-side validation results. This change improves performance in environments where applications are allocated separate CPU cores.
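
The difference between waiting for each result and pipelining the requests can be sketched with a thread pool. The `check_tx` function below is a hypothetical stand-in for an application-side check reached through ABCI, and `validate_pipelined` is an illustrative name; neither is the real ABCI client API.

```python
from concurrent.futures import ThreadPoolExecutor


def check_tx(tx):
    """Hypothetical application-side check standing in for a call over ABCI."""
    return tx.startswith(b"ok")


def validate_pipelined(txs, workers=4):
    """Submit every check first, then collect the results in arrival order."""
    accepted = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submitting does not block, so the next transaction's validation can
        # start before earlier results have come back.
        pending = [(tx, pool.submit(check_tx, tx)) for tx in txs]
        for tx, future in pending:
            if future.result():
                accepted.append(tx)
    return accepted
```

Collecting results in submission order keeps the accepted list deterministic even though the checks themselves may finish in any order.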

21 changes: 13 additions & 8 deletions docs/ja/01-overview.md
@@ -38,9 +38,13 @@ the Networking layer is included.

![Layered Structure](../static/layered_structure.png)

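Transactions that have not yet been incorporated into a block are shared among nodes by an anti-entropy mechanism (gossiping)
in the Network layer called mempool. Here, the Network and Consensus layers treat transactions as simple binary data and are
not concerned with the contents of the data.
Transactions that have not yet been incorporated into a block are shared among nodes by an anti-entropy mechanism
(gossiping) in the Network layer called [mempool](03-tx-sharing.md). Here, the Network and Consensus layers treat transactions as simple binary data
and are not concerned with the contents of the data.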

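Ostracon's consensus state is stored in the State DB, and generated blocks are stored in the Block DB. Because these stores emphasize
fast random access keyed by block height, and the Block DB in particular is used heavily for appends, Ostracon uses an embedded Key-Value store
based on LSMT (Log-Structured Merge Tree) (the actual KVS implementation can be chosen from several options at build time).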

## Specifications and Technology Stack

@@ -51,7 +55,7 @@ the Networking layer is included.
| Agreement | Strong Consistency w/Finality | Tendermint-BFT |
| Signature | Elliptic Curve Cryptography | Ed25519, *BLS12-381*<sup>*1</sup> |
| Hash | SHA2 | SHA-256, SHA-512 |
| HSM | *N/A* | *No support for VRF or signature aggregation* |
| Key Management | Local KeyStore, Remote KMS | *HSM is not supported due to VRF or BLS* |
| Key Auth Protocol | Station-to-Station | |
| Tx Sharing Protocol | Gossiping | mempool |
| Application Protocol | ABCI | |
@@ -65,22 +69,23 @@ the Networking layer is included.
## Ostracon Features

* [Extending Tendermint-BFT with VRF-based Election](02-consensus.md)
* [BLS Signature Aggregation](03-signature-aggregation.md)
* [Transaction Sharing](03-tx-sharing.md)
* [BLS Signature Aggregation](04-signature-aggregation.md)

## Consideration with Other Consensus Schemes

What consensus mechanisms do other blockchains adopt? We made many comparisons and considerations to determine the direction of Ostracon.

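The PoW adopted by **Bitcoin** and **Ethereum** is the best-known consensus mechanism for blockchains. They have a track record of operating as public chains,
The PoW adopted by *Bitcoin* and *Ethereum* is the best-known consensus mechanism for blockchains. They have a track record of operating as public chains,
but they have the functional limitation that results may be overturned until a sufficient amount of time has passed. Because this causes lost update problems in the short term
and, in the long term, a pronounced inability to secure the required performance, PoW was dropped from our options at an early stage of consideration.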

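Tendermint-BFT, which **Tendermint** adopts as its consensus algorithm, is a design well considered for blockchains. Its ability to guarantee finality in a short time
Tendermint-BFT, which *Tendermint* adopts as its consensus algorithm, is a design well considered for blockchains. Its ability to guarantee finality in a short time
also suited our direction. On the other hand, because the weighted round-robin used as the election algorithm works deterministically, anyone can know the future
Proposer, which makes it easy to find a target and prepare an attack. For this reason, Ostracon replaces it with an unpredictable algorithm based on VRF
in order to reduce the possibility of an attack.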

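**Algorand** uses VRF in a very different way than we do. In Algorand, when an election starts, each node generates a VRF random number and
*Algorand* uses VRF in a very different way than we do. In Algorand, when an election starts, each node generates a VRF random number and
determines by itself whether it has been elected as the next Validator (this is similar to all nodes tossing a coin at the same time). Compared to the PoW method
of winning the election by hash calculation, this is an excellent way to guarantee cryptographic security while saving a large amount of computation time and power consumption. On the other hand, the number
of Validators selected is not deterministic and includes random behavior following a binomial distribution, the protocol complexity increases due to mutual recognition among the winning nodes, and nodes that have been elected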
11 changes: 10 additions & 1 deletion docs/ja/02-consensus.md
@@ -117,9 +117,18 @@ In Voter selection, from a single VRF hash $t$, multiple nodes
is more important. For this Voter selection, Ostracon uses SplitMix64, a very fast pseudo-random number generation algorithm
of the so-called shift-register type.

## Disciplinary Scheme for Failures
## Failures

### Disciplinary Scheme

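Ostracon's consensus scheme functions correctly even if a small number of nodes have failed, but to avoid wasting network and CPU resources, it's ideal that failed nodes
are not selected for the consensus group. In particular, for cases that are not caused by general asynchronous messaging problems, that is, misbehavior that appears
to be intentional, evidence of the behavior is shared (regardless of whether it was malicious) and action is taken to exclude the node from the selection candidates
by forfeiting its Stake.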

### Write Ahead Log

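In a system with such a disciplinary scheme, it's important to have a mechanism that prevents nodes from causing unintended behavior. Ostracon records all received messages
in its WAL (Write Ahead Log), and when it recovers from a node failure, it can correctly apply the processing that follows the last message it applied.
For more information on WAL, see [Tendermint | WAL](https://github.com/tendermint/tendermint/blob/v0.34.x/spec/consensus/wal.md).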