Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue in PFM prevents clearing of IBC channels #5969

Closed
Tracked by #5864
clemensgg opened this issue Aug 7, 2023 · 10 comments
Closed
Tracked by #5864

Issue in PFM prevents clearing of IBC channels #5969

clemensgg opened this issue Aug 7, 2023 · 10 comments

Comments

@clemensgg
Copy link
Contributor

System information

Osmosis version: v16.1.0, v16.1.1
OS & Version: Windows/Linux/OSX
Commit hash: 0dcae3392f23e44b8de436ff372c1373dc831b04

Other software versions:

Expected behaviour

Relayer is able to process / timeout pending IBC packets.

gRPC is able to successfully process the send_tx_simulate query for msgs containing some currently affected ibc-hooks payloads.

Actual behaviour

We've detected some packets on the osmosis-1 > neutron-1 transfer channel that we currently not able to handle on our relayers.

The issue is rooted in a complex ibc-hooks payload, the send_tx_simulate step fails due to a bug in the currently used version of the packet-forward-middleware (PFM) in osmosisd.

There are some subsequent "normal" (non-pfm) transfer packets pending (a total of 7), which can't be cleared due to how the relayer workflow handles these cases (clearing/flushing fails if one grpc query errors).

A fix for the issue has already been merged and is included in the latest version of the PFM in ibc-apps

We have a confirm by the osmosis team that the changes will be included in v17 which "should come out relatively soon". Because normal packets are accumulating on the channel (in small quantities due to websocket instabilites, but still), and theoretically the issue could impact relayer operation / IBC UX on other channels as well, we'd like to request an expedited fix in the current v16 version of osmosisd

Current packet pending status:

osmosis-1 channel-874 > neutron-1: 3597 (PFM packet)
osmosis-1 channel-874 > neutron-1: 3663 (PFM packet)

Subsequent non-PFM packets pending:

osmosis-1 channel-874 > neutron-1: 3958 (normal transfer packet)
osmosis-1 channel-874 > neutron-1: 3975 (normal transfer packet)
osmosis-1 channel-874 > neutron-1: 4025 (normal transfer packet)
osmosis-1 channel-874 > neutron-1: 4028 (normal transfer packet)
osmosis-1 channel-874 > neutron-1: 4102 (normal transfer packet)

Steps to reproduce the behaviour

  • deploy IBC relayer (hermes or rly)

  • have RPC & gRPC access to full nodes for osmosis-1 and neutron-1 with sufficient block history

  • run channel clearing ("flushing") command for respective IBC channel:

  • hermes clear packets --chain osmosis-1 --port transfer --channel channel-874

  • hermes tx packet-recv --src-chain osmosis-1 --src-port transfer --src-channel channel-874 --dst-chain neutron-1

  • rly transact flush neutron-osmosis --debug

Backtrace

Relayer logs hermes:

2023-07-28T19:14:27.625009Z  INFO ThreadId(01) running Hermes v1.6.0+1c1cf029-dirty
2023-07-28T19:14:27.634001Z  INFO ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}: 4 unreceived packets found: [ 3597, 3663, 3668, 3669 ] 
2023-07-28T19:14:28.906039Z  INFO ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}: pulled packet data for 4 events out of 4 sequences: 3597..=3597, 3663..=3663, 3668..=3669; events.total=4 events.left=0
2023-07-28T19:14:28.942516Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:schedule{odata=packet-recv ->Source @1-1815345; len=4}: connection delay need not be taken into account: client update message will be prepended later
2023-07-28T19:14:28.942536Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}: retrying retry.current=1 retry.max=5
2023-07-28T19:14:28.942546Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}: prepending Source client update at height 1-1815346
2023-07-28T19:14:28.958884Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}:foreign_client.wait_and_build_update_client_with_trusted{client=neutron-1->osmosis-1:07-tendermint-2823 target_height=1-1815346}:foreign_client.build_update_client_with_trusted{client=neutron-1->osmosis-1:07-tendermint-2823 target_height=1-1815346}: building a MsgUpdateAnyClient from trusted height 1-1815218 to target height 1-1815346
2023-07-28T19:14:28.959084Z  INFO ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}: assembled batch of 5 message(s)
2023-07-28T19:14:28.960526Z DEBUG ThreadId(99) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}: sending 5 messages as 1 batches to chain osmosis-1 in parallel
2023-07-28T19:14:28.960547Z DEBUG ThreadId(99) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=21105}: max fee, for use in tx simulation: Fee { amount: "39000uosmo", gas_limit: 15000000 }
2023-07-28T19:14:28.972853Z ERROR ThreadId(99) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=21105}:estimate_gas: failed to simulate tx. propagating error to caller: gRPC call `send_tx_simulate` failed with status: status: Unknown, message: "failed to execute message; message index: 1: timeout packet callback failed: error unmarshaling memo json: json: Unmarshal(nil *types.JSONObject)", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "10741712"} }
2023-07-28T19:14:28.972878Z ERROR ThreadId(99) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-874 dst_chain=neutron-1}:relay{odata=packet-recv ->Source @1-1815345; len=4}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=21105}: gas estimation failed or encountered another unrecoverable error error=gRPC call `send_tx_simulate` failed with status: status: Unknown, message: "failed to execute message; message index: 1: timeout packet callback failed: error unmarshaling memo json: json: Unmarshal(nil *types.JSONObject)", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "10741712"} }
2023-07-28T19:14:28.972911Z ERROR ThreadId(01) Failed to relay reverse recv and timeout packets: link failed with underlying error: gRPC call `send_tx_simulate` failed with status: status: Unknown, message: "failed to execute message; message index: 1: timeout packet callback failed: error unmarshaling memo json: json: Unmarshal(nil *types.JSONObject)", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "10741712"} }
SUCCESS []

Relayer logs rly:

rly transact flush neutron-osmosis --debug --max-tx-size 1 --max-msgs 1
2023-08-05T19:55:56.384812Z	debug	Entering main query loop	{"chain_name": "neutron", "chain_id": "neutron-1"}
2023-08-05T19:55:56.385190Z	debug	Queried latest height	{"chain_name": "neutron", "chain_id": "neutron-1", "latest_height": 1998887}
2023-08-05T19:55:56.385202Z	info	Chain is in sync	{"chain_name": "neutron", "chain_id": "neutron-1"}
2023-08-05T19:55:56.456977Z	debug	Entering main query loop	{"chain_name": "osmosis", "chain_id": "osmosis-1"}
2023-08-05T19:55:56.459121Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861149}
2023-08-05T19:55:56.459133Z	info	Chain is in sync	{"chain_name": "osmosis", "chain_id": "osmosis-1"}
2023-08-05T19:55:57.385602Z	debug	Queried latest height	{"chain_name": "neutron", "chain_id": "neutron-1", "latest_height": 1998887}
2023-08-05T19:55:57.459063Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861149}
2023-08-05T19:55:58.386573Z	debug	Queried latest height	{"chain_name": "neutron", "chain_id": "neutron-1", "latest_height": 1998888}
2023-08-05T19:55:58.392274Z	warn	Error querying block data	{"chain_name": "neutron", "chain_id": "neutron-1", "error": "RPC error -32603 - Internal error: could not find results for height #1998888"}
2023-08-05T19:55:58.459785Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861150}
2023-08-05T19:55:58.496925Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "update_client", "client_id": "07-tendermint-2318"}
2023-08-05T19:55:58.496968Z	debug	Retaining packet message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "acknowledge_packet", "sequence": 9583, "channel_id": "channel-492", "port_id": "transfer", "counterparty_channel_id": "channel-1", "counterparty_port_id": "transfer"}
2023-08-05T19:55:58.496979Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "acknowledge_packet", "sequence": 9583, "src_channel": "channel-492", "src_port": "transfer", "dst_channel": "channel-1", "dst_port": "transfer", "timeout_height": 4292408, "timeout_height_revision": 2}
2023-08-05T19:55:59.386819Z	debug	Queried latest height	{"chain_name": "neutron", "chain_id": "neutron-1", "latest_height": 1998888}
2023-08-05T19:55:59.460665Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861150}
2023-08-05T19:56:00.461531Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861150}
2023-08-05T19:56:01.461308Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861150}
2023-08-05T19:56:02.461742Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861150}
2023-08-05T19:56:03.112008Z	debug	Flushing	{"path_name": "neutron-osmosis", "chain_id": "neutron-1", "client_id": "07-tendermint-18", "channel": "channel-8", "port": "transfer"}
2023-08-05T19:56:03.112008Z	debug	Flushing	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "channel": "channel-874", "port": "transfer"}
2023-08-05T19:56:03.113117Z	debug	Queried latest height	{"chain_name": "neutron", "chain_id": "neutron-1", "latest_height": 1998888}
2023-08-05T19:56:03.119199Z	debug	Nothing to flush	{"path_name": "neutron-osmosis", "chain_id": "neutron-1", "client_id": "07-tendermint-18", "channel": "channel-8", "port": "transfer"}
2023-08-05T19:56:03.122128Z	debug	Will flush MsgRecvPacket	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "channel": "channel-874", "port": "transfer", "sequences": [3597, 3663, 3958, 3975, 4025, 4028, 4102]}
2023-08-05T19:56:03.463842Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861150}
2023-08-05T19:56:04.113351Z	debug	Queried latest height	{"chain_name": "neutron", "chain_id": "neutron-1", "latest_height": 1998891}
2023-08-05T19:56:04.152886Z	debug	Observed IBC message	{"chain_name": "neutron", "chain_id": "neutron-1", "event_type": "update_client", "client_id": "07-tendermint-0"}
2023-08-05T19:56:04.152914Z	debug	Retaining packet message	{"chain_name": "neutron", "chain_id": "neutron-1", "event_type": "recv_packet", "sequence": 394869, "channel_id": "channel-0", "port_id": "consumer", "counterparty_channel_id": "channel-568", "counterparty_port_id": "provider"}
2023-08-05T19:56:04.152926Z	debug	Observed IBC message	{"chain_name": "neutron", "chain_id": "neutron-1", "event_type": "recv_packet", "sequence": 394869, "src_channel": "channel-568", "src_port": "provider", "dst_channel": "channel-0", "dst_port": "consumer", "timeout_timestamp": 1693684544005907431}
2023-08-05T19:56:04.152932Z	debug	Retaining packet message	{"chain_name": "neutron", "chain_id": "neutron-1", "event_type": "write_acknowledgement", "sequence": 394869, "channel_id": "channel-0", "port_id": "consumer", "counterparty_channel_id": "channel-568", "counterparty_port_id": "provider"}
2023-08-05T19:56:04.152936Z	debug	Observed IBC message	{"chain_name": "neutron", "chain_id": "neutron-1", "event_type": "write_acknowledgement", "sequence": 394869, "src_channel": "channel-568", "src_port": "provider", "dst_channel": "channel-0", "dst_port": "consumer", "timeout_timestamp": 1693684544005907431}
2023-08-05T19:56:04.464804Z	debug	Queried latest height	{"chain_name": "osmosis", "chain_id": "osmosis-1", "latest_height": 10861151}
2023-08-05T19:56:04.507540Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "update_client", "client_id": "07-tendermint-2669"}
2023-08-05T19:56:04.507582Z	debug	Retaining packet message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "acknowledge_packet", "sequence": 12663, "channel_id": "channel-642", "port_id": "transfer", "counterparty_channel_id": "channel-5", "counterparty_port_id": "transfer"}
2023-08-05T19:56:04.507595Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "acknowledge_packet", "sequence": 12663, "src_channel": "channel-642", "src_port": "transfer", "dst_channel": "channel-5", "dst_port": "transfer", "timeout_height": 2601449, "timeout_height_revision": 1, "timeout_timestamp": 1691265932789028517}
2023-08-05T19:56:04.507662Z	debug	Retaining packet message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "send_packet", "sequence": 2085970, "channel_id": "channel-0", "port_id": "transfer", "counterparty_channel_id": "channel-141", "counterparty_port_id": "transfer"}
2023-08-05T19:56:04.507669Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "send_packet", "sequence": 2085970, "src_channel": "channel-0", "src_port": "transfer", "dst_channel": "channel-141", "dst_port": "transfer", "timeout_height": 16444737, "timeout_height_revision": 4}
2023-08-05T19:56:04.509500Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "update_client", "client_id": "07-tendermint-1666"}
2023-08-05T19:56:04.509518Z	debug	Retaining packet message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "acknowledge_packet", "sequence": 222744, "channel_id": "channel-113", "port_id": "transfer", "counterparty_channel_id": "channel-7", "counterparty_port_id": "transfer"}
2023-08-05T19:56:04.509527Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "acknowledge_packet", "sequence": 222744, "src_channel": "channel-113", "src_port": "transfer", "dst_channel": "channel-7", "dst_port": "transfer", "timeout_height": 8643013, "timeout_height_revision": 1}
2023-08-05T19:56:04.509578Z	debug	Retaining packet message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "send_packet", "sequence": 12664, "channel_id": "channel-642", "port_id": "transfer", "counterparty_channel_id": "channel-5", "counterparty_port_id": "transfer"}
2023-08-05T19:56:04.509585Z	debug	Observed IBC message	{"chain_name": "osmosis", "chain_id": "osmosis-1", "event_type": "send_packet", "sequence": 12664, "src_channel": "channel-642", "src_port": "transfer", "dst_channel": "channel-5", "dst_port": "transfer", "timeout_timestamp": 1691265559657000000}
2023-08-05T19:56:04.766501Z	debug	No MsgAcknowledgement to flush	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "channel": "channel-874", "port": "transfer"}
2023-08-05T19:56:04.766537Z	debug	No cached IBC header for client trusted height	{"chain_id": "neutron-1", "client_id": "07-tendermint-18", "height": 4863336}
2023-08-05T19:56:04.766543Z	debug	No cached IBC header for client trusted height	{"chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "height": 1998663}
2023-08-05T19:56:04.780384Z	debug	Had to query for client trusted IBC header	{"chain_id": "neutron-1", "counterparty_chain_id": "osmosis-1", "counterparty_client_id": "07-tendermint-2823", "height": 1998663, "latest_height": 1998888}
2023-08-05T19:56:04.788801Z	debug	Assembled packet message	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "msg": {"type": "timeout_packet", "src_port": "transfer", "src_channel": "channel-874", "dst_port": "transfer", "dst_channel": "channel-10", "sequence": 4102, "timeout_height": "1-1943824", "timeout_timestamp": 0, "data": "eyJhbW91bnQiOiIxNjY2MjEzNSIsImRlbm9tIjoidHJhbnNmZXIvY2hhbm5lbC04NzQvdW50cm4iLCJyZWNlaXZlciI6Im5ldXRyb24xbG5nOGFja3VkZ3BxdmZsbXVka3IwOGZqZmU4Y3czajNtbXN3eGEiLCJzZW5kZXIiOiJvc21vMWxuZzhhY2t1ZGdwcXZmbG11ZGtyMDhmamZlOGN3M2ozaGwydTJnIn0=", "ack": ""}}
2023-08-05T19:56:04.789526Z	debug	Assembled packet message	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "msg": {"type": "timeout_packet", "src_port": "transfer", "src_channel": "channel-874", "dst_port": "transfer", "dst_channel": "channel-10", "sequence": 4028, "timeout_height": "0-0", "timeout_timestamp": 1690989092968000000, "data": "eyJhbW91bnQiOiI4MDAwMDAwMCIsImRlbm9tIjoidHJhbnNmZXIvY2hhbm5lbC04NzQvdW50cm4iLCJyZWNlaXZlciI6Im5ldXRyb24xeGU4cmQ3MnZ1M3loMHJ4YWpoN3dsaHRqNmVjYTM2eHNlZzgwN24iLCJzZW5kZXIiOiJvc21vMXhlOHJkNzJ2dTN5aDByeGFqaDd3bGh0ajZlY2EzNnhzNHZhYWp4In0=", "ack": ""}}
2023-08-05T19:56:04.789605Z	debug	Assembled packet message	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "msg": {"type": "timeout_packet", "src_port": "transfer", "src_channel": "channel-874", "dst_port": "transfer", "dst_channel": "channel-10", "sequence": 3975, "timeout_height": "1-1909202", "timeout_timestamp": 0, "data": "eyJhbW91bnQiOiI0ODA4MjAxNyIsImRlbm9tIjoidHJhbnNmZXIvY2hhbm5lbC04NzQvdW50cm4iLCJyZWNlaXZlciI6Im5ldXRyb24xZ3d0NXpqZnJjbmtod2hlcHlzOG1sZGw3N2h0MDhzNWVtNHN0eHIiLCJzZW5kZXIiOiJvc21vMWd3dDV6amZyY25raHdoZXB5czhtbGRsNzdodDA4czVlaDMyZTJrIn0=", "ack": ""}}
2023-08-05T19:56:04.789616Z	debug	Assembled packet message	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "msg": {"type": "timeout_packet", "src_port": "transfer", "src_channel": "channel-874", "dst_port": "transfer", "dst_channel": "channel-10", "sequence": 3958, "timeout_height": "0-0", "timeout_timestamp": 1690910104064000000, "data": "eyJhbW91bnQiOiI1MTAwMDAwMCIsImRlbm9tIjoidHJhbnNmZXIvY2hhbm5lbC04NzQvdW50cm4iLCJyZWNlaXZlciI6Im5ldXRyb24xeGU4cmQ3MnZ1M3loMHJ4YWpoN3dsaHRqNmVjYTM2eHNlZzgwN24iLCJzZW5kZXIiOiJvc21vMXhlOHJkNzJ2dTN5aDByeGFqaDd3bGh0ajZlY2EzNnhzNHZhYWp4In0=", "ack": ""}}
2023-08-05T19:56:04.790671Z	debug	Assembled packet message	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "msg": {"type": "timeout_packet", "src_port": "transfer", "src_channel": "channel-874", "dst_port": "transfer", "dst_channel": "channel-10", "sequence": 3663, "timeout_height": "0-0", "timeout_timestamp": 1690567486746855183, "data": "eyJhbW91bnQiOiIxMDAwMCIsImRlbm9tIjoidW9zbW8iLCJtZW1vIjoie1wid2FzbVwiOntcImNvbnRyYWN0XCI6XCJuZXV0cm9uMW1ybTgweHhkdjh5aHJ0NmdxdngybjYzOHZqaDIzajAyM3hqNXl1ZmhhOXkwMmd2c2ttYXE2cHJyOHpcIixcIm1zZ1wiOntcInN3YXBfYW5kX2FjdGlvblwiOntcInVzZXJfc3dhcFwiOntcInN3YXBfdmVudWVfbmFtZVwiOlwibmV1dHJvbi1hc3Ryb3BvcnRcIixcIm9wZXJhdGlvbnNcIjpbe1wicG9vbFwiOlwibmV1dHJvbjF1NHY3eGN2a2h6OHN4czN1OW1qaHByd2M4dndjMnAwOHgwdGplNHVndHJya2poa2FnZHlzenR0NWRxXCIsXCJkZW5vbV9pblwiOlwiaWJjLzM3NjIyMkQ2RDlEQUUyMzA5MkUyOTc0MEU1NkI3NTg1ODA5MzVBNkQ3N0MyNEMyQUJENTdBNkE3OEExRjM5NTVcIixcImRlbm9tX291dFwiOlwidW50cm5cIn0se1wicG9vbFwiOlwibmV1dHJvbjFlMjJ6aDVwOG1lZGR4amNsZXZ1aGptZmo2OWp4ZnNhOHV1M2p2aHQ3MnJ2OWQ4bGtodmVzNnQ4dmVxXCIsXCJkZW5vbV9pblwiOlwidW50cm5cIixcImRlbm9tX291dFwiOlwiaWJjL0M0Q0ZGNDZGRDZERTM1Q0E0Q0Y0Q0UwMzFFNjQzQzhGREM5QkE0Qjk5QUU1OThFOUIwRUQ5OEZFM0EyMzE5RjlcIn1dfSxcIm1pbl9jb2luXCI6e1wiZGVub21cIjpcImliYy9DNENGRjQ2RkQ2REUzNUNBNENGNENFMDMxRTY0M0M4RkRDOUJBNEI5OUFFNTk4RTlCMEVEOThGRTNBMjMxOUY5XCIsXCJhbW91bnRcIjpcIjU5MFwifSxcInRpbWVvdXRfdGltZXN0YW1wXCI6MTY5MDU2NzE0NzkzMDQ3ODAwMCxcInBvc3Rfc3dhcF9hY3Rpb25cIjp7XCJiYW5rX3NlbmRcIjp7XCJ0b19hZGRyZXNzXCI6XCJuZXV0cm9uMTl2ZjVtZnI0MGF3dmtlZnc2OW5sNnAzbW1sc25hY21td2MweDZzXCJ9fSxcImFmZmlsaWF0ZXNcIjpbXX19fX0iLCJyZWNlaXZlciI6Im5ldXRyb24xbXJtODB4eGR2OHlocnQ2Z3F2eDJuNjM4dmpoMjNqMDIzeGo1eXVmaGE5eTAyZ3Zza21hcTZwcnI4eiIsInNlbmRlciI6Im9zbW8xOXZmNW1mcjQwYXd2a2VmdzY5bmw2cDNtbWxzbmFjbW16dTQ1azkifQ==", "ack": ""}}
2023-08-05T19:56:04.791104Z	debug	Assembled packet message	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "msg": {"type": "timeout_packet", "src_port": "transfer", "src_channel": "channel-874", "dst_port": "transfer", "dst_channel": "channel-10", "sequence": 3597, "timeout_height": "0-0", "timeout_timestamp": 1690513629288252803, "data": "eyJhbW91bnQiOiI1MzI0NjYwIiwiZGVub20iOiJ1b3NtbyIsIm1lbW8iOiJ7XCJ3YXNtXCI6e1wiY29udHJhY3RcIjpcIm5ldXRyb24xbXJtODB4eGR2OHlocnQ2Z3F2eDJuNjM4dmpoMjNqMDIzeGo1eXVmaGE5eTAyZ3Zza21hcTZwcnI4elwiLFwibXNnXCI6e1wic3dhcF9hbmRfYWN0aW9uXCI6e1wiZmVlX3N3YXBcIjp7XCJzd2FwX3ZlbnVlX25hbWVcIjpcIm5ldXRyb24tYXN0cm9wb3J0XCIsXCJjb2luX291dFwiOntcImRlbm9tXCI6XCJ1bnRyblwiLFwiYW1vdW50XCI6XCIyMDAwMDBcIn0sXCJvcGVyYXRpb25zXCI6W3tcInBvb2xcIjpcIm5ldXRyb24xdTR2N3hjdmtoejhzeHMzdTltamhwcndjOHZ3YzJwMDh4MHRqZTR1Z3RycmtqaGthZ2R5c3p0dDVkcVwiLFwiZGVub21faW5cIjpcImliYy8zNzYyMjJENkQ5REFFMjMwOTJFMjk3NDBFNTZCNzU4NTgwOTM1QTZENzdDMjRDMkFCRDU3QTZBNzhBMUYzOTU1XCIsXCJkZW5vbV9vdXRcIjpcInVudHJuXCJ9XX0sXCJ1c2VyX3N3YXBcIjp7XCJzd2FwX3ZlbnVlX25hbWVcIjpcIm5ldXRyb24tYXN0cm9wb3J0XCIsXCJvcGVyYXRpb25zXCI6W3tcInBvb2xcIjpcIm5ldXRyb24xdTR2N3hjdmtoejhzeHMzdTltamhwcndjOHZ3YzJwMDh4MHRqZTR1Z3RycmtqaGthZ2R5c3p0dDVkcVwiLFwiZGVub21faW5cIjpcImliYy8zNzYyMjJENkQ5REFFMjMwOTJFMjk3NDBFNTZCNzU4NTgwOTM1QTZENzdDMjRDMkFCRDU3QTZBNzhBMUYzOTU1XCIsXCJkZW5vbV9vdXRcIjpcInVudHJuXCJ9LHtcInBvb2xcIjpcIm5ldXRyb24xZTIyemg1cDhtZWRkeGpjbGV2dWhqbWZqNjlqeGZzYTh1dTNqdmh0NzJydjlkOGxraHZlczZ0OHZlcVwiLFwiZGVub21faW5cIjpcInVudHJuXCIsXCJkZW5vbV9vdXRcIjpcImliYy9DNENGRjQ2RkQ2REUzNUNBNENGNENFMDMxRTY0M0M4RkRDOUJBNEI5OUFFNTk4RTlCMEVEOThGRTNBMjMxOUY5XCJ9XX0sXCJtaW5fY29pblwiOntcImRlbm9tXCI6XCJpYmMvQzRDRkY0NkZENkRFMzVDQTRDRjRDRTAzMUU2NDNDOEZEQzlCQTRCOTlBRTU5OEU5QjBFRDk4RkUzQTIzMTlGOVwiLFwiYW1vdW50XCI6XCIzNjEyMzlcIn0sXCJ0aW1lb3V0X3RpbWVzdGFtcFwiOjE2OTA1MTMzMTA4OTQwNzg1MDAsXCJwb3N0X3N3YXBfYWN0aW9uXCI6e1wiaWJjX3RyYW5zZmVyXCI6e1wiaWJjX2luZm9cIjp7XCJzb3VyY2VfY2hhbm5lbFwiOlwiY2hhbm5lbC0xXCIsXCJyZWNlaXZlclwiOlwiY29zbW9zMWF5Z2R0ODc0MmdhbXh2OGNhOTl3emg1NnJ5NHh3NXMzOXNtbWhtXCIsXCJmZWVcIjp7XCJyZWN2X2ZlZVwiOltdLFwiYWNrX2ZlZVwiOlt7XCJkZW5vbVwiOlwidW50cm5cIixcImFtb3VudFwiOlwiMTAwMDAwXCJ9XSxcInRpbWVvdXRfZmVlXCI6W3tcImRlbm9tXCI6XCJ1bnRyblwiLFwiYW1vdW50XCI6XCIxMDAwMDBcIn1dfSxcIm1lbW9cIjpcIlwiLFwicmVjb3Zlcl9hZGRyZXNzXCI6XCJuZXV0cm9uMWF5Z2R0ODc0MmdhbXh2OGNhOTl3emg1NnJ5NHh3NXMzcDBqZWR1XCJ9fX0sXCJhZmZpbGlhdGVzXCI6W119fX19IiwicmVjZWl2ZXIiOiJuZXV0cm9uMW1ybTgweHhkdjh5aHJ0NmdxdngybjYzOHZqaDIzajAyM3hqNXl1ZmhhOXkwMmd2c2ttYXE2cHJyOHoiLCJzZW5kZXIiOiJvc21vMWF5Z2R0ODc0MmdhbXh2OGNhOTl3emg1NnJ5NHh3NXMzZHRndHBmIn0=", "ack": ""}}
2023-08-05T19:56:04.792228Z	debug	Assembled packet message	{"path_name": "neutron-osmosis", "chain_id": "osmosis-1", "client_id": "07-tendermint-2823", "msg": {"type": "timeout_packet", "src_port": "transfer", "src_channel": "channel-874", "dst_port": "transfer", "dst_channel": "channel-10", "sequence": 4025, "timeout_height": "0-0", "timeout_timestamp": 1690987783192000000, "data": "eyJhbW91bnQiOiI3NjAwMDAwMCIsImRlbm9tIjoidHJhbnNmZXIvY2hhbm5lbC04NzQvdW50cm4iLCJyZWNlaXZlciI6Im5ldXRyb24xeGU4cmQ3MnZ1M3loMHJ4YWpoN3dsaHRqNmVjYTM2eHNlZzgwN24iLCJzZW5kZXIiOiJvc21vMXhlOHJkNzJ2dTN5aDByeGFqaDd3bGh0ajZlY2EzNnhzNHZhYWp4In0=", "ack": ""}}
2023-08-05T19:56:04.792289Z	info	Found termination condition for flush, all caches cleared
@nicolaslara
Copy link
Contributor

This should be fixed by updating PFM (#5954)

@clemensgg
Copy link
Contributor Author

Thank you @nicolaslara, I've cherry-picked the merge commit of #5954 into the current v16.1.1 version, i was successfully able to build and sync the chain, but when trying to clear the channel unfortunately the gRPC gave the same response..

@clemensgg
Copy link
Contributor Author

@nicolaslara could you maybe help me with a patched version of v16.1.x ? It would be very good to be able to time out these pending packets 🙏

@nicolaslara
Copy link
Contributor

interesting. I wonder if there's still an issue on the PFM unmarshaling. Let me check

@nicolaslara
Copy link
Contributor

@clemensgg what's the last packet processed before getting the error (and, is there an easier way to talk synchronously? maybe tg?)

@clemensgg
Copy link
Contributor Author

sure! @ clemensg on tg :) thank you!

@nicolaslara
Copy link
Contributor

This got partially resolved by quick fix on hermes that allows relayers to skip packets with certain sequence numbers (informalsystems/hermes@master...luca_joss/skip-sequence-on-clear).

This allowed us to clear the other packet, so dealing with the stuck packet is now less urgent.

We still need to figure out what is happening here and how to (1) ensure that packet properly times out, and (2) prevent this from happening again.

I think the course of action here should be the following:

  • Replicate the issue locally
  • Step through the execution with gdb/delve

We have determined that the Timeout function in the message server terminates correctly, but the tx still doesn't get committed and the grpc call doesn't return. We need to figure out why.

@clemensgg Could you provide the steps to set this up locally? Specifically:

  • What's your hermes config?
  • Do you do anything to the osmosis node other than sync mainnet? Do you re-sync to a height each time or keep it in sync?
  • Do you have a neutron node running or just osmosis?

If you prefer to provide this via tg that's also ok

@clemensgg
Copy link
Contributor Author

hi @nicolaslara, to replicate locally:

  • sync osmosis mainnet on osmosisd v16.1.1 (can be pruned, must have txs available so statesync wouldn't work; use snapshot sync)
  • sync neutron mainnet on neutrond v1.0.4
  • RPC and gRPC access on both nodes
  • setup hermes v1.5.0 for auto-config (or v1.6.0 for manual config)
  • hermes config auto --chains osmosis neutron --output ~/.hermes/config.toml, adjust RPC & gRPC endpoints
  • hermes keys add --chain osmosis-1 --mnemonic-file ./mnemonic.txt
  • hermes keys add --chain neutron-1 --mnemonic-file ./mnemonic.txt
  • hermes clear packets --chain osmosis-1 --channel channel-874 --port transfer

@clemensgg
Copy link
Contributor Author

It seems the osmosis v17 mainnet upgrade has resolved the issue, unfortunately we didn't catch the event in our logs because we were running a cronjob clearing routine for that channel.
image

@clemensgg
Copy link
Contributor Author

packets successfully timed out after v17 upgrade

@github-project-automation github-project-automation bot moved this from Needs Triage 🔍 to Done ✅ in Osmosis Chain Development Aug 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

2 participants