Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use indexation cache to satisfy "coins to spend" queries #2463

Open
wants to merge 244 commits into
base: master
Choose a base branch
from

Conversation

rafal-ch
Copy link
Contributor

@rafal-ch rafal-ch commented Nov 28, 2024

Closes #2391

This PR includes all changes from the Part 1 PR, making it deprecated.

Description

Changes in this PR:

The new CoinsToSpend index

  • This is the database that stores all coins to spend sorted by the amounts (i.e. largest-by-value coins first)
  • The key consists of several parts
    • Retryable flag - to distinguish between retryable messages and other coins
    • Address (owner)
    • AssetID
    • Amount - as "big-endian" bytes to leverage the RocksDB key sorting capabilities
    • Foreign Key - this are bytes of the key from either the Messages or Coins on-chain databases
      • for messages this is a 32-bytes Nonce
      • for coins this is a 34-bytes UtxoId
  • The value is an instance of IndexedCoinType enum, so we know which on-chain database to query when returning the actual coins
  • This index is updated when executor events are processed
  • When querying for "coins to spend" the following algorithm is applied:
    • First, we get as many "big" coins as required to satisfy double the amount from the query (respecting max and excluded params)
    • If we have enough coins, but there are still some "slots" in the query left (because we selected less coins than max) we fill the remaining slots with a random number of "dust" coins
    • If it happens that the value of selected "dust coins" is able to cover the value of some of the already selected "big coins", we remove the latter from the response
    • If at any step we encounter a problem (reading from database, integer conversions, etc.) we bail with an appropriate error

Changes to CoinsQueryError type

  • The MaxCoinsReached variant has been removed because in the new algorithm we never query for more coins than the specified max, hence, without additional effort, we are not able to tell whether the query could be satisfied if user provided bigger max
  • The InsufficientCoins has been renamed to InsufficientCoinsForTheMax and it now contains the additional max field

Off-chain database metadata

  • The metadata for off-chain database now contain the additional IndexationKind - CoinsToSpend

Refactoring

  • The indexation.rs module was split into separate files, each per indexation type + errors + some utils.

Other

  • Integration tests have to be updated to not expect the exact number of coins to spend in the response (currently, due to randomness, we're not deterministic in this regard)
  • The number of excluded ids in the coinsToSpend GraphQL query is now limited to the maximum number of inputs allowed in transaction.

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

Follow-up issues

Copy link
Collaborator

@xgreenx xgreenx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice to me=D Left some comments to improve the errors and the code maintainance

@@ -233,7 +241,7 @@ mod coin {
asset_id_a: AssetId,
asset_id_b: AssetId,
) {
let context = setup(owner, asset_id_a, asset_id_b).await;
let (context, max_inputs) = setup(owner, asset_id_a, asset_id_b).await;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will expect that you will pass max_inputs or ConsensusParameters with specified max_inputs into the setup.

It looks strange for me to see max_inputs as a return type because it means "test's setup defines the error". While if you pass max_inputs it will be "test defines the setup and an error".

If you agree to pass max_inputs here, then we need to do for other methods as well=)

Also maybe it will make more sense to pass ConsensusParameters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. I changed the setup() to accept the ConsensusParameters in this commit: 1645c14

impl From<CoinModel> for Coin {
fn from(value: CoinModel) -> Self {
Coin(value)
async fn coins_to_spend_with_cache(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if coins_to_spend_without_cache is above of coins_to_spend_with_cache, the git diff will be smaller=) Could you try to swap them please?=)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swapped in 9d52683

)
.await?,
)
.yield_each(db.batch_size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The slowest operation is iteration over the storage. Because you already have Vec<CoinsToSpendIndexEntry> in the memory, you can just process everything together for one run. It should be very fast.

So we can remove yield_each and usage fo the stream here=)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, right. Removed in 385aea7

if coins_per_asset.is_empty() {
return Err(CoinsQueryError::InsufficientCoinsForTheMax {
asset_id,
collected_amount: total_amount,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like collected_amount should be zero in this case=)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now reworked and fixed in commit: 6740e7d

coins_per_asset.push(coin_type);
}

if coins_per_asset.is_empty() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see that we rely several time on empty vector in the case of error inside of the select_coins_to_spend. But maybe it will be more clear if select_coins_to_spend returned an error and we shouldn't have empty coins_per_asset case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. See the explanation here: #2463 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now reworked and fixed in commit: 6740e7d

while let Some(coin) = coins_stream.next().await {
let coin = coin?;
if !is_excluded(&coin, excluded_ids)? {
if count >= max || predicate(&coin, coins_total_value) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if count >= max || predicate(&coin, coins_total_value) {
if coins.len() >= max || predicate(&coin, coins_total_value) {

I think we can remove count=)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, well spotted. Removed in 0602da1

Comment on lines 300 to 302
if selected_big_coins_total < total {
return Ok(vec![]);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to return an error here=)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty vector will be converted to CoinsQueryError::InsufficientCoinsForTheMax on the callsite, as you noticed here: https://github.com/FuelLabs/fuel-core/pull/2463/files#r1881319248

It's not the best solution, but returning this error from within the fn select_coins_to_spend() function would require the function to know the asset_id and currently it's asset agnostic (I'd like to keep it that way).

I was considering adding a new error variant (for example CoinSelectionError) and convert it to proper CoinsQueryError on the callsite, but this complicates stuff.

I think I'll probably replace vec![] with None and see how it fits the design.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use the approach when I return None and construct the proper error with all necessary data upstream - commit: 2c40069

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get AssetId from the last_selected_big_coin which you have below=)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't work in case we don't select any coins in big_coins(). That's also why I cannot get it from one of the selected_big_coins, at least not consistently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to just pass asset id to the function, so it can now return proper error (with correct asset_id and collected_amount). Commit: 6740e7d

Comment on lines 303 to 306
let Some(last_selected_big_coin) = selected_big_coins.last() else {
// Should never happen.
return Ok(vec![]);
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t would be nice to return an error here=)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a proper error and some more explanation in 87245d4

dust_coins_total = new_value;
true
})
.unwrap_or_default()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.unwrap_or_default()
.unwrap_or(false)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 0175e41

crates/fuel-core/src/coins_query.rs Outdated Show resolved Hide resolved
@rafal-ch rafal-ch requested a review from xgreenx December 13, 2024 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize GraphQL "coins to spend" query
4 participants