Issue in PFM prevents clearing of IBC channels #5969
Comments
This should be fixed by updating PFM (#5954)
Thank you @nicolaslara. I cherry-picked the merge commit of #5954 onto the current v16.1.1 version and was able to build and sync the chain, but when trying to clear the channel, gRPC unfortunately gave the same response.
@nicolaslara could you maybe help me with a patched version of v16.1.x? It would be very good to be able to time out these pending packets 🙏
Interesting. I wonder if there's still an issue with the PFM unmarshaling. Let me check.
@clemensgg what's the last packet processed before getting the error? (Also, is there an easier way to talk synchronously? Maybe tg?)
sure! @ clemensg on tg :) thank you!
This got partially resolved by a quick fix in hermes that allows relayers to skip packets with certain sequence numbers (informalsystems/hermes@master...luca_joss/skip-sequence-on-clear). This allowed us to clear the other packet, so dealing with the stuck packet is now less urgent. We still need to figure out what is happening here and how to (1) ensure that the packet properly times out, and (2) prevent this from happening again. I think the course of action here should be the following:
We have determined that the Timeout function in the message server terminates correctly, but the tx still doesn't get committed and the gRPC call doesn't return. We need to figure out why. @clemensgg could you provide the steps to set this up locally? Specifically:
If you prefer to provide this via tg, that's also ok.
hi @nicolaslara, to replicate locally:
packets successfully timed out after
System information
Osmosis version: v16.1.0, v16.1.1
OS & Version: Windows/Linux/OSX
Commit hash: 0dcae3392f23e44b8de436ff372c1373dc831b04
Other software versions:
hermes: v1.6.0
rly: v2.3.1
osmosisd: v16.1.0 and v16.1.1
neutrond: v1.0.4
Expected behaviour
Relayer is able to process / timeout pending IBC packets.
gRPC is able to successfully process the send_tx_simulate query for msgs containing some currently affected ibc-hooks payloads.
Actual behaviour
We've detected some packets on the osmosis-1 > neutron-1 transfer channel that we are currently not able to handle on our relayers.
The issue is rooted in a complex ibc-hooks payload: the send_tx_simulate step fails due to a bug in the currently used version of the packet-forward-middleware (PFM) in osmosisd.
There are some subsequent "normal" (non-PFM) transfer packets pending (a total of 7), which can't be cleared due to how the relayer workflow handles these cases (clearing/flushing fails if one gRPC query errors).
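To illustrate why one bad packet blocks the rest, here is a minimal Python sketch of the all-or-nothing clearing behaviour described above, next to the skip-by-sequence workaround mentioned in the comments. This is not hermes or rly code: the function names, the SimulationError type and the sequence numbers are all hypothetical, and the simulate callback only stands in for the send_tx_simulate step.

```python
from typing import Callable, Iterable, List, Set


class SimulationError(Exception):
    """Hypothetical error raised when simulating a packet's tx fails."""


def clear_all_or_nothing(pending: Iterable[int],
                         simulate: Callable[[int], None]) -> List[int]:
    """Model of the failure mode: one failing simulation aborts the batch."""
    cleared = []
    for seq in pending:
        simulate(seq)  # raises for the affected packet, nothing gets cleared
        cleared.append(seq)
    return cleared


def clear_with_skip(pending: Iterable[int],
                    simulate: Callable[[int], None],
                    skip: Set[int]) -> List[int]:
    """Model of the workaround: excluded sequences are never simulated."""
    cleared = []
    for seq in pending:
        if seq in skip:
            continue  # leave the stuck packet pending, handle the others
        simulate(seq)
        cleared.append(seq)
    return cleared


# Hypothetical sequence numbers: one stuck packet, then 7 normal transfers.
STUCK = 100
pending = [STUCK] + list(range(101, 108))


def simulate(seq: int) -> None:
    """Stand-in for the simulate step; fails only for the stuck packet."""
    if seq == STUCK:
        raise SimulationError(f"simulation failed for sequence {seq}")
```

With these assumptions, clear_all_or_nothing(pending, simulate) raises before any packet is cleared, while clear_with_skip(pending, simulate, {STUCK}) clears the seven normal packets and leaves the stuck one pending.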
A fix for the issue has already been merged and is included in the latest version of the PFM in ibc-apps.
We have confirmation from the Osmosis team that the changes will be included in v17, which "should come out relatively soon". Because normal packets are accumulating on the channel (in small quantities due to websocket instabilities, but still), and theoretically the issue could impact relayer operation / IBC UX on other channels as well, we'd like to request an expedited fix in the current v16 version of osmosisd.
Current packet pending status:
Subsequent non-PFM packets pending:
Steps to reproduce the behaviour
deploy IBC relayer (hermes or rly)
have RPC & gRPC access to full nodes for osmosis-1 and neutron-1 with sufficient block history
run channel clearing ("flushing") command for respective IBC channel:
hermes clear packets --chain osmosis-1 --port transfer --channel channel-874
hermes tx packet-recv --src-chain osmosis-1 --src-port transfer --src-channel channel-874 --dst-chain neutron-1
rly transact flush neutron-osmosis --debug
Backtrace
Relayer logs hermes:
Relayer logs rly: