Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(events): order events consistently when querying #12623

Merged
merged 1 commit into from
Nov 4, 2024
Merged

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Oct 22, 2024

(This PR is against #12421 until it's merged)

In ChainIndexer and the existing master version of this same code, we do an ORDER BY when querying for events but then we do a slices.Sort on the output, only sorting by epoch.

  1. The sort is not stable, so it can end up with events jumbled up even if the epoch is properly sorted
  2. We're not making use of the ORDER BY, in fact we do the reverse of what we ask for, so we should just do it there
  3. The existing ORDER BY relies on insertion order (_rowid_) of the event entries, which should mostly be fine (aside from reverts perhaps), but we have event_index that we use for logIndex on Eth output and we're not making use of that or properly ordering by it.

Here's a list of eth events from epoch 4374648, just listing their blockNumber, transactionIndex and logIndex:

  [ '0x42c078', '0x49', '0x80' ],
  [ '0x42c078', '0x35', '0x62' ],
  [ '0x42c078', '0x35', '0x61' ],
  [ '0x42c078', '0x35', '0x60' ],
  [ '0x42c078', '0x49', '0x81' ],
  [ '0x42c078', '0x49', '0x82' ]

They should be nicely ordered. Here is the same but with this fix:

  [ '0x42c078', '0x35', '0x60' ],
  [ '0x42c078', '0x35', '0x61' ],
  [ '0x42c078', '0x35', '0x62' ],
  [ '0x42c078', '0x49', '0x80' ],
  [ '0x42c078', '0x49', '0x81' ],
  [ '0x42c078', '0x49', '0x82' ]

Note that in the process here we're going with ASC for all of the fields - epochs, messages, events and event entries.

This is the query now for a from/to query of eth_getLogs:

SELECT e.id,
  tm.height, tm.tipset_key_cid,
  e.emitter_id, e.emitter_addr,
  e.event_index, tm.message_cid,
  tm.message_index, e.reverted,
  ee.flags, ee.key,
  ee.codec, ee.value
FROM event e  JOIN tipset_message tm ON e.message_id = tm.id
JOIN event_entry ee ON e.id = ee.event_id
HERE tm.height BETWEEN ? AND ? AND e.reverted=?
ORDER BY tm.height ASC, tm.message_index ASC, e.event_index ASC, ee._rowid_ ASC

And here's what the query plan looks like for that:

QUERY PLAN
|--SEARCH tm USING INDEX idx_height (height>? AND height<?)
|--SEARCH e USING INDEX idx_event_message_id (message_id=?)
|--SEARCH ee USING INDEX event_entry_event_id (event_id=?)
`--USE TEMP B-TREE FOR RIGHT PART OF ORDER BY

As long as we're hitting the height index first then we should be good.

Here's a larger range of epochs to see that it's doing ordering across epochs:

  [ '0x42c0db', '0x35', '0x78' ],
  [ '0x42c0de', '0x3c', '0x7c' ],
  [ '0x42c0e1', '0x37', '0x61' ],
  [ '0x42c0e4', '0x37', '0x63' ],
  [ '0x42c0e4', '0x39', '0x65' ],
  [ '0x42c0e7', '0x39', '0x6e' ],
  [ '0x42c0e9', '0x54', '0x9f' ],
  [ '0x42c0ec', '0x3b', '0x64' ],
  [ '0x42c0ee', '0x29', '0x58' ],
  [ '0x42c0ee', '0x29', '0x59' ],
  [ '0x42c0ee', '0x29', '0x5a' ],
  [ '0x42c0ee', '0x29', '0x5b' ],
  [ '0x42c0ee', '0x29', '0x5c' ],
  [ '0x42c0ee', '0x29', '0x5d' ],
  [ '0x42c0ee', '0x29', '0x5e' ],
  [ '0x42c0ee', '0x29', '0x5f' ]

@aarshkshah1992
Copy link
Contributor

@rvagg This is a nice UX improvement and we should merge it.

@aarshkshah1992
Copy link
Contributor

@rvagg Can I go ahead and merge this ?

Base automatically changed from feat/msg-eth-tx-index to master October 31, 2024 09:58
@rvagg rvagg merged commit 3a34832 into master Nov 4, 2024
83 checks passed
@rvagg rvagg deleted the rvagg/order-by branch November 4, 2024 02:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: ☑️ Done (Archive)
Development

Successfully merging this pull request may close these issues.

3 participants