Use indexation cache to satisfy "coins to spend" queries #2463

rafal-ch · 2024-11-28T13:43:02Z

Closes #2391

This PR includes all changes from the Part 1 PR, making it deprecated.

Description

Changes in this PR:

The new `CoinsToSpend` index

This is the database that stores all spendable coins, sorted by amount (i.e. largest-by-value coins first).
The key consists of several parts:
- Retryable flag - to distinguish between retryable messages and other coins
- Address (owner)
- AssetID
- Amount - as "big-endian" bytes to leverage the RocksDB key sorting capabilities
- Foreign Key - these are the bytes of the key from either the Messages or the Coins on-chain database
  - for messages this is a 32-byte Nonce
  - for coins this is a 34-byte UtxoId
The value is an instance of the IndexedCoinType enum, so we know which on-chain database to query when returning the actual coins.
This index is updated when executor events are processed
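
As an illustration of the key layout described above, here is a minimal sketch that concatenates the parts in the listed order. The helper name `build_key`, the 32-byte widths for the owner and asset id, and the `u64` amount are assumptions made for this example, not the actual types used by the PR.

```rust
// Assumed widths for this sketch only; the PR text states just the
// foreign key lengths (32 bytes for a Nonce, 34 bytes for a UtxoId).
const ADDRESS_LEN: usize = 32;
const ASSET_ID_LEN: usize = 32;

/// Hypothetical helper that builds an index key as
/// retryable flag | owner | asset id | amount (big-endian) | foreign key.
fn build_key(
    retryable: bool,
    owner: &[u8; ADDRESS_LEN],
    asset_id: &[u8; ASSET_ID_LEN],
    amount: u64,
    foreign_key: &[u8],
) -> Vec<u8> {
    let mut key =
        Vec::with_capacity(1 + ADDRESS_LEN + ASSET_ID_LEN + 8 + foreign_key.len());
    key.push(u8::from(retryable));
    key.extend_from_slice(owner);
    key.extend_from_slice(asset_id);
    // Big-endian bytes compare lexicographically in the same order as the
    // integers they encode, so a byte-wise sorted store orders keys by amount.
    key.extend_from_slice(&amount.to_be_bytes());
    key.extend_from_slice(foreign_key);
    key
}
```

With this layout, two keys that share the same flag, owner and asset id compare in the same order as their amounts, which would not hold for a little-endian encoding.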
When querying for "coins to spend" the following algorithm is applied:
- First, we get as many "big" coins as required to satisfy double the amount from the query (respecting max and excluded params)
- If we have enough coins, but there are still some "slots" left in the query (because we selected fewer coins than max), we fill the remaining slots with a random number of "dust" coins
- If it happens that the value of selected "dust coins" is able to cover the value of some of the already selected "big coins", we remove the latter from the response
- If at any step we encounter a problem (reading from database, integer conversions, etc.) we bail with an appropriate error

Changes to `CoinsQueryError` type

The MaxCoinsReached variant has been removed. In the new algorithm we never query for more coins than the specified max, so without additional effort we cannot tell whether the query could have been satisfied if the user had provided a bigger max.
The InsufficientCoins variant has been renamed to InsufficientCoinsForTheMax and now contains the additional max field.

Off-chain database metadata

The metadata for the off-chain database now contains the additional IndexationKind - CoinsToSpend

Refactoring

The indexation.rs module was split into separate files: one per indexation type, plus errors and some utils.

Other

Integration tests had to be updated to not expect the exact number of coins to spend in the response (currently, due to randomness, we're not deterministic in this regard).
The number of excluded ids in the coinsToSpend GraphQL query is now limited to the maximum number of inputs allowed in a transaction.

Before requesting review

I have reviewed the code myself
I have created follow-up issues caused by this PR and linked them here

Follow-up issues


xgreenx

Looks nice to me=D Left some comments to improve the errors and the code maintainability

xgreenx · 2024-12-12T02:53:39Z

tests/tests/coins.rs

@@ -233,7 +241,7 @@ mod coin {
        asset_id_a: AssetId,
        asset_id_b: AssetId,
    ) {
-        let context = setup(owner, asset_id_a, asset_id_b).await;
+        let (context, max_inputs) = setup(owner, asset_id_a, asset_id_b).await;


I would expect you to pass max_inputs, or ConsensusParameters with the specified max_inputs, into the setup.

It looks strange to me to see max_inputs as a return type, because it means "the test's setup defines the error". If you pass max_inputs instead, it will be "the test defines the setup and the error".

If you agree to pass max_inputs here, then we need to do the same for other methods as well=)

Also, maybe it will make more sense to pass ConsensusParameters

Yes, that makes sense. I changed the setup() to accept the ConsensusParameters in this commit: 1645c14

xgreenx · 2024-12-12T02:58:29Z

crates/fuel-core/src/schema/coins.rs

-impl From<CoinModel> for Coin {
-    fn from(value: CoinModel) -> Self {
-        Coin(value)
+async fn coins_to_spend_with_cache(


I think if coins_to_spend_without_cache is placed above coins_to_spend_with_cache, the git diff will be smaller=) Could you try to swap them please?=)

Swapped in 9d52683

xgreenx · 2024-12-12T03:05:03Z

crates/fuel-core/src/schema/coins.rs

+            )
+            .await?,
+        )
+        .yield_each(db.batch_size);


The slowest operation is iteration over the storage. Because you already have Vec<CoinsToSpendIndexEntry> in memory, you can just process everything together in one run. It should be very fast.

So we can remove yield_each and the usage of the stream here=)

Yes, right. Removed in 385aea7

xgreenx · 2024-12-12T03:07:05Z

crates/fuel-core/src/schema/coins.rs

+        if coins_per_asset.is_empty() {
+            return Err(CoinsQueryError::InsufficientCoinsForTheMax {
+                asset_id,
+                collected_amount: total_amount,


It looks like collected_amount should be zero in this case=)

This is now reworked and fixed in commit: 6740e7d

xgreenx · 2024-12-12T03:09:24Z

crates/fuel-core/src/schema/coins.rs

+            coins_per_asset.push(coin_type);
+        }
+
+        if coins_per_asset.is_empty() {


Hmm, I see that we rely several times on an empty vector in the case of an error inside select_coins_to_spend. But maybe it would be clearer if select_coins_to_spend returned an error, so that we never have the empty coins_per_asset case.

That's a good point. See the explanation here: #2463 (comment)

This is now reworked and fixed in commit: 6740e7d

xgreenx · 2024-12-12T03:41:25Z

crates/fuel-core/src/coins_query.rs

+    while let Some(coin) = coins_stream.next().await {
+        let coin = coin?;
+        if !is_excluded(&coin, excluded_ids)? {
+            if count >= max || predicate(&coin, coins_total_value) {


Suggested change

-            if count >= max || predicate(&coin, coins_total_value) {
+            if coins.len() >= max || predicate(&coin, coins_total_value) {

I think we can remove count=)

Yes, well spotted. Removed in 0602da1

xgreenx · 2024-12-12T03:43:53Z

crates/fuel-core/src/coins_query.rs

+    if selected_big_coins_total < total {
+        return Ok(vec![]);
+    }


It would be nice to return an error here=)

The empty vector will be converted to CoinsQueryError::InsufficientCoinsForTheMax on the callsite, as you noticed here: https://github.com/FuelLabs/fuel-core/pull/2463/files#r1881319248

It's not the best solution, but returning this error from within select_coins_to_spend() would require the function to know the asset_id, and currently it's asset agnostic (I'd like to keep it that way).

I was considering adding a new error variant (for example CoinSelectionError) and converting it to the proper CoinsQueryError at the callsite, but this complicates things.

I think I'll probably replace vec![] with None and see how it fits the design.

I went with the approach where I return None and construct the proper error with all necessary data upstream - commit: 2c40069

You can get AssetId from the last_selected_big_coin which you have below=)

That won't work in case we don't select any coins in big_coins(). That's also why I cannot get it from one of the selected_big_coins, at least not consistently.

I decided to just pass asset id to the function, so it can now return proper error (with correct asset_id and collected_amount). Commit: 6740e7d
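
A minimal sketch of that final shape, where the selection code receives the asset id and can therefore build the complete error itself. The variant name and the `asset_id`, `collected_amount` and `max` field names come from this PR; the field types, the `AssetId` alias and the helper `ensure_sufficient` are assumptions for illustration only.

```rust
// Stand-in for the real asset id type; the 32-byte width is an assumption.
type AssetId = [u8; 32];

#[derive(Debug, PartialEq)]
enum CoinsQueryError {
    // Field types are assumed for this sketch.
    InsufficientCoinsForTheMax {
        asset_id: AssetId,
        collected_amount: u64,
        max: u16,
    },
}

/// Hypothetical helper: sums the selected amounts and fails with full
/// context when they do not reach `total`.
fn ensure_sufficient(
    selected: &[u64],
    asset_id: &AssetId,
    total: u64,
    max: u16,
) -> Result<u64, CoinsQueryError> {
    let collected_amount = selected
        .iter()
        .fold(0u64, |acc, amount| acc.saturating_add(*amount));
    if collected_amount < total {
        return Err(CoinsQueryError::InsufficientCoinsForTheMax {
            asset_id: *asset_id,
            collected_amount,
            max,
        });
    }
    Ok(collected_amount)
}
```

When no coins were selected at all, the error built this way reports a `collected_amount` of zero, and the caller no longer has to interpret an empty vector or `None`.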

xgreenx · 2024-12-12T03:44:21Z

crates/fuel-core/src/coins_query.rs

+    let Some(last_selected_big_coin) = selected_big_coins.last() else {
+        // Should never happen.
+        return Ok(vec![]);
+    };


It would be nice to return an error here=)

Added a proper error and some more explanation in 87245d4

xgreenx · 2024-12-12T03:48:26Z

crates/fuel-core/src/coins_query.rs

+                dust_coins_total = new_value;
+                true
+            })
+            .unwrap_or_default()


Suggested change

-            .unwrap_or_default()
+            .unwrap_or(false)

Updated in 0175e41
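
For a `bool`, both calls return the same value, since `bool::default()` is `false`; the suggestion only makes the fallback visible at the call site. A small sketch of the pattern, using a hypothetical `try_spend` helper that is not part of the PR:

```rust
/// Hypothetical helper: subtracts `amount` from `budget` when it fits.
/// Returns `true` if the subtraction happened, `false` otherwise.
fn try_spend(budget: &mut u64, amount: u64) -> bool {
    budget
        .checked_sub(amount)
        .map(|rest| {
            *budget = rest;
            true
        })
        // Same result as `.unwrap_or_default()`, but the fallback is explicit.
        .unwrap_or(false)
}
```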


rafal-ch added 30 commits October 14, 2024 10:27

Add basic balances functionality

4a70f52

Add support for querying all balances for user

7d472fb

Adding balances_indexation_progress to DB metadata

4296fd4

Attempt at migrating the metadata to store the indexation progress

77d0f16

Merge remote-tracking branch 'upstream/master' into 1965_balances

525e6f9

Hack the replace_forced() and commit_changes_forced in

8eba5d7

Introduce ForcedCommitDatabase

c083f45

Update dependencies

6b76c37

DB metadata can track multiple indexation progresses

f246cd0

into_genesis() attemt

7472db7

Add some TODOs with ideas for the future

b4d2e0f

Use double_key! macro to define the balances key

d38fdb3

Add basic_storage_tests! for Balances

292acab

Merge remote-tracking branch 'upstream/master' into 1965_balances

651bcc8

Balances DB stores separate information for coins and messages

8342c8f

Fix the recursive call

9a9f120

Init indexation progresses with 0 upon metadata migration

6aa9325

Remove debug prints

b153db4

Store incoming balance in the new Balances DB

512a8a3

Read balance from the new Balances database

adf9e2a

Update coin balance, don't overwrite

344ca90

Use more detailed IndexationStatus, not just block height

f73dbed

Add processing of MessageImported

80da09b

Simplify processing of coins and message amounts

ad5216d

Extract increase_balance()

c784e44

Store coin and message balances separately

5762665

Clean up column naming

5595985

Add test for coin balances

38dd8d6

Support both coins and messages in the new balance system

530bcaf

Merge remote-tracking branch 'upstream/master' into 1965_balances

5604f30

rafal-ch and others added 6 commits December 9, 2024 11:50

Use ExcludedKeysAsBytes instead of Vec in ExcludedKeysAsBytes

3cf743a

Remove CoinOrMessageIdBytes type and avoid some allocations

2e0b7fd

Simplify asserts in some coin tests

ce637a8

Remove superfluous space

7b54abc

Make implementation of select_coins_until_respects_excluded_ids() more clean

ad83e78

Merge branch 'master' into rafal_2391_coins_to_spend_cache_part_2

4c6cf85

xgreenx reviewed Dec 12, 2024


rafal-ch added 18 commits December 12, 2024 12:37

Make the skip_big_coins_up_to_amount() implementation more explicit wrt to mutation

c001fd3

Use more clear names in skip_big_coins_up_to_amount()

a30c1da

Prefer unwrap_or(false) instead of .unwrap_or_default() for `bool` for clarity

0175e41

Add CoinsQueryError::UnexpectedInternalState error

87245d4

Clean up error handling in coins to spend

2c40069

Improve error handling in coins to spend query

bef1548

Add test cases for errors in indexed_coins_to_spend

ba883d2

Remove unnecessary variable

0602da1

Swap functions to reduce diff size

9d52683

into_coin_id() does not have to be async

385aea7

setup() function in tests now accepts consensus parameters

1645c14

Simplify vec initialization in into_coin_id()

f1420e3

Update comment

db1e97f

Move the byte conversion from into_coin_id() to the key itself

bdc8ce2

Move the byte conversion from is_excluded() to the key itself

8e42b87

Merge remote-tracking branch 'upstream/master' into rafal_2391_coins_to_spend_cache_part_2

e841d66

Make fields of CoinsToSpendIndexIter pub

2533060

Return proper error from select_coins_to_spend()

6740e7d

rafal-ch mentioned this pull request Dec 13, 2024

Convert the CoinsToSpendIndexKey from Vec<u8> to typed struct, similarly to OwnedTransactionIndexKey #2498

Open

Mention follow-up issue

03a0662

rafal-ch mentioned this pull request Dec 13, 2024

The OffChainDatabase::coins_to_spend_index() function should return error if indexation is not available #2499

Open

Mention follow-up issue

1049fea

rafal-ch requested a review from xgreenx December 13, 2024 12:54
