[Need Migration] Improve query performance of get_cells in rich-indexer #4509

Merged: 18 commits, Sep 25, 2024

EthanYuan
Collaborator

What problem does this PR solve?

Problem Summary:

When querying Rich Indexer get_cells or get_cells_capacity, query performance degrades if there are many dead (spent) cells under the specified address.

What is changed and how it works?

What's Changed:

  • Added an is_spent field to the output table, so a query can tell whether a cell is live or dead without joining the input table.
  • Added automatic migrations for this schema change.
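
The idea behind the is_spent field can be sketched as follows. This is a minimal illustration, not the PR's actual migration: it uses Python's sqlite3 module with an in-memory database, the schema is cut down to the columns needed here, and the input.output_id column and the anti-join form of the old query are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical, heavily simplified schema (not the real rich-indexer tables).
cur.executescript("""
CREATE TABLE output (
    id INTEGER PRIMARY KEY,
    lock_script_id INTEGER NOT NULL
);
-- Assumed link from a consuming input to the output it spends.
CREATE TABLE input (
    output_id INTEGER PRIMARY KEY
);
""")

# Five cells under lock script 1; cells 1, 2 and 4 have been spent.
cur.executemany(
    "INSERT INTO output (id, lock_script_id) VALUES (?, 1)",
    [(i,) for i in range(1, 6)],
)
cur.executemany("INSERT INTO input (output_id) VALUES (?)", [(1,), (2,), (4,)])

# Assumed old approach: find live cells with an anti-join against input.
live_via_join = [row[0] for row in cur.execute("""
    SELECT output.id
    FROM output
    LEFT JOIN input ON input.output_id = output.id
    WHERE output.lock_script_id = 1 AND input.output_id IS NULL
    ORDER BY output.id
""")]

# Migration sketch: add the flag, then back-fill it from the input table.
cur.executescript("""
ALTER TABLE output ADD COLUMN is_spent INTEGER NOT NULL DEFAULT 0;
UPDATE output
SET is_spent = 1
WHERE EXISTS (SELECT 1 FROM input WHERE input.output_id = output.id);
""")

# New approach: filter on the flag, no join with input needed.
live_via_flag = [row[0] for row in cur.execute("""
    SELECT id
    FROM output
    WHERE lock_script_id = 1 AND is_spent = 0
    ORDER BY id
""")]

print(live_via_join, live_via_flag)
```

Both queries return the same live cells. After such a migration, the flag also has to be kept up to date whenever a cell is consumed, which this sketch does not cover.
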
Query and PostgreSQL query plan after the change:

EXPLAIN ANALYZE
SELECT
    output.id,
    output.output_index,
    output.capacity,
    query_script.code_hash AS lock_code_hash,
    query_script.hash_type AS lock_hash_type,
    query_script.args AS lock_args,
    type_script.code_hash AS type_code_hash,
    type_script.hash_type AS type_hash_type,
    type_script.args AS type_args,
    ckb_transaction.tx_index,
    ckb_transaction.tx_hash,
    block.block_number,
    output.data AS output_data
FROM
    output
    JOIN (
        SELECT
            script.id,
            script.code_hash,
            script.hash_type,
            script.args
        FROM
            script
        WHERE
            (code_hash = '\xd00c84f0ec8fd441c38bc3f87a371f547190f2fcff88e642bc5bf54b9e318323')
            AND (hash_type = 1)
            AND (args LIKE '\x25ee39ecfae122adbdcc557207119fae07c6ae788925')
    ) AS query_script ON output.lock_script_id = query_script.id
    JOIN ckb_transaction ON output.tx_id = ckb_transaction.id
    JOIN block ON ckb_transaction.block_id = block.id
    LEFT JOIN script AS type_script ON output.type_script_id = type_script.id
WHERE
    output.is_spent = 0
ORDER BY
    output.id DESC
LIMIT
    1;
 QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=362735.97..362735.98 rows=1 width=293) (actual time=36.386..43.978 rows=1 loops=1)
   ->  Sort  (cost=362735.97..362735.98 rows=2 width=293) (actual time=36.384..43.975 rows=1 loops=1)
         Sort Key: output.id DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Gather  (cost=16570.86..362735.96 rows=2 width=293) (actual time=36.181..43.936 rows=120 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Nested Loop Left Join  (cost=15570.86..361735.76 rows=1 width=293) (actual time=27.775..31.739 rows=40 loops=3)
                     ->  Nested Loop  (cost=15570.43..361728.23 rows=1 width=194) (actual time=27.766..31.670 rows=40 loops=3)
                           ->  Nested Loop  (cost=15569.99..361720.70 rows=1 width=194) (actual time=27.755..31.558 rows=40 loops=3)
                                 ->  Nested Loop  (cost=15569.56..361713.05 rows=1 width=157) (actual time=27.744..31.443 rows=40 loops=3)
                                       ->  Parallel Bitmap Heap Scan on script  (cost=15349.14..90708.40 rows=6 width=115) (actual time=27.420..30.277 rows=0 loops=3)
                                             Recheck Cond: ((code_hash = '\xd00c84f0ec8fd441c38bc3f87a371f547190f2fcff88e642bc5bf54b9e318323'::bytea) AND (hash_type = 1))
                                             Filter: (args ~~ '\x25ee39ecfae122adbdcc557207119fae07c6ae788925'::bytea)
                                             Rows Removed by Filter: 43845
                                             Heap Blocks: exact=6426
                                             ->  Bitmap Index Scan on script_code_hash_hash_type_args_key  (cost=0.00..15349.14 rows=149246 width=0) (actual time=16.896..16.897 rows=131537 loops=1)
                                                   Index Cond: ((code_hash = '\xd00c84f0ec8fd441c38bc3f87a371f547190f2fcff88e642bc5bf54b9e318323'::bytea) AND (hash_type = 1))
                                       ->  Bitmap Heap Scan on output  (cost=220.41..45166.83 rows=61 width=58) (actual time=0.953..3.456 rows=120 loops=1)
                                             Recheck Cond: (lock_script_id = script.id)
                                             Filter: (is_spent = 0)
                                             Rows Removed by Filter: 572
                                             ->  Bitmap Index Scan on idx_output_table_lock_script_id  (cost=0.00..220.40 rows=12244 width=0) (actual time=0.246..0.246 rows=1953 loops=1)
                                                   Index Cond: (lock_script_id = script.id)
                                 ->  Index Scan using ckb_transaction_pkey on ckb_transaction  (cost=0.44..7.64 rows=1 width=53) (actual time=0.002..0.002 rows=1 loops=120)
                                       Index Cond: (id = output.tx_id)
                           ->  Index Scan using block_pkey on block  (cost=0.43..7.54 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=120)
                                 Index Cond: (id = ckb_transaction.block_id)
                     ->  Index Scan using script_pkey on script type_script  (cost=0.43..7.53 rows=1 width=115) (actual time=0.001..0.001 rows=1 loops=120)
                           Index Cond: (id = output.type_script_id)
 Planning Time: 1.040 ms
 Execution Time: 44.067 ms
(32 rows)

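In the plan above, the scan on output now returns fewer rows and finishes sooner:

Bitmap Heap Scan on output (cost=220.41..45166.83 rows=61 width=58) (actual time=0.953..3.456 rows=120 loops=1)

The is_spent = 0 filter removes 572 rows, leaving 120 of 692.

Before this change, the same scan was: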

Bitmap Heap Scan on output (cost=223.46..45139.27 rows=12244 width=58) (actual time=2.271..245.854 rows=692 loops=1)
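
To make the comparison concrete, the two plan nodes quoted above can be parsed and compared. The snippet below is only a sketch: the regular expression and the parse_node helper are hypothetical and cover just this one node format, not EXPLAIN output in general. Both nodes report loops=1, so the per-loop times can be compared directly.

```python
import re

# The two plan nodes quoted above, copied verbatim.
after = ("Bitmap Heap Scan on output (cost=220.41..45166.83 rows=61 width=58) "
         "(actual time=0.953..3.456 rows=120 loops=1)")
before = ("Bitmap Heap Scan on output (cost=223.46..45139.27 rows=12244 width=58) "
          "(actual time=2.271..245.854 rows=692 loops=1)")

# Matches "(cost=a..b rows=N width=W) (actual time=c..d rows=M loops=L)".
NODE = re.compile(
    r"\(cost=\d+\.\d+\.\.\d+\.\d+ rows=(?P<est_rows>\d+) width=\d+\) "
    r"\(actual time=\d+\.\d+\.\.(?P<total_ms>\d+\.\d+) "
    r"rows=(?P<rows>\d+) loops=(?P<loops>\d+)\)"
)


def parse_node(line):
    """Extract estimated rows, total time (ms), actual rows and loops."""
    match = NODE.search(line)
    if match is None:
        raise ValueError(f"unrecognized plan node: {line!r}")
    return {
        "est_rows": int(match["est_rows"]),
        "total_ms": float(match["total_ms"]),
        "rows": int(match["rows"]),
        "loops": int(match["loops"]),
    }


a, b = parse_node(after), parse_node(before)
speedup = b["total_ms"] / a["total_ms"]

print(f"estimated rows: {b['est_rows']} -> {a['est_rows']}")
print(f"rows returned:  {b['rows']} -> {a['rows']}")
print(f"scan time:      {b['total_ms']} ms -> {a['total_ms']} ms ({speedup:.1f}x faster)")
```

For these two nodes, the scan time drops from 245.854 ms to 3.456 ms, roughly a 71x improvement for this single measurement.
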

Check List

Tests

  • Unit test
  • Integration test
  • Manual test
  • No code ci-runs-only: [ quick_checks,linters ]

Release note

Title Only: Include only the PR title in the release note.

@EthanYuan requested a review from a team as a code owner on July 1, 2024
@EthanYuan requested reviews from quake and zhangsoledad and removed the request for a team on July 1, 2024
@eval-exec added the labels t:performance (Type: Performance tuning) and m:indexer (module: ckb-indexer) on Jul 1, 2024
@eval-exec changed the title from "Improve query performance of get_cells in rich-indexer" to "[Need Migration] Improve query performance of get_cells in rich-indexer" on Jul 18, 2024
@EthanYuan requested a review from eval-exec on September 3, 2024
@zhangsoledad added this pull request to the merge queue on Sep 25, 2024
Merged via the queue into nervosnetwork:develop with commit 21f49f2 on Sep 25, 2024
32 checks passed
@doitian mentioned this pull request on Oct 11, 2024