[rosetta]: remove the option to run rosetta in process #10776

Closed
Tracked by #13327
fdymylja opened this issue Dec 14, 2021 · 6 comments

@fdymylja
Contributor

fdymylja commented Dec 14, 2021

The SDK allows running rosetta in process, but we frequently get reports from people who cannot start their node because rosetta is failing.

This happens because rosetta attempts to connect to a node that is not ready yet; after a fixed number of retries rosetta fails, causing the node's Start process to fail too (NOTE: rosetta will stop a node only at startup, not while it is running). This could be avoided by setting a higher number of retry attempts or a longer wait time between attempts.

Despite this, we should disallow running rosetta in-process and have users run it as a standalone process instead, at least until we get proper node readiness checks in the SDK.
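
For illustration, the retry behaviour described above (a bounded number of attempts with a wait between them) can be sketched as follows. This is a minimal sketch, not the SDK's actual code: `retry`, `errNodeNotReady` and the simulated `connect` function are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNodeNotReady is a hypothetical error standing in for a failed connection attempt.
var errNodeNotReady = errors.New("node not ready")

// retry calls fn up to attempts times, sleeping delay between failed calls.
// It returns nil on the first success, or the last error wrapped with the attempt count.
func retry(attempts int, delay time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1 // always try at least once
	}
	var err error
	for i := 1; i <= attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Simulated node that only becomes reachable on the third attempt.
	connect := func() error {
		calls++
		if calls < 3 {
			return errNodeNotReady
		}
		return nil
	}

	// A budget of two attempts is too small, so startup would fail here.
	fmt.Println(retry(2, 10*time.Millisecond, connect))

	// A larger budget lets the same node come up in time.
	calls = 0
	fmt.Println(retry(5, 10*time.Millisecond, connect))
}
```

Raising `attempts` or `delay` only widens the window; a node that needs longer than `attempts * delay` to come up still makes the caller fail, which is why the in-process setup is fragile.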

@tac0turtle added the C:Rosetta (Issues and PR related to Rosetta) label Aug 10, 2022
@raynaudoe
Contributor

I think having a standalone main.go for rosetta would be a nice thing for this.
The connection retries to the node can then be done safely inside rosetta's binary without interfering with the node's main loop.
What do you think, @marbar3778?

@alexanderbez
Contributor

Why not just have Rosetta wait until the node is ready, maybe via some signaling mechanism, before it tries to connect?

@raynaudoe
Contributor

raynaudoe commented Sep 27, 2022

> Why not just have Rosetta wait until the node is ready, maybe via some signaling mechanism, before it tries to connect?

In that case, which endpoint/RPC could I use to know that the node is actually ready? I'm currently using the Health and Status RPCs to check whether the node is 'alive' and 'well', respectively.

Also, I've seen several issues related to this one. What if we remove rosetta from here, so that anyone who wants to run rosetta would just need to run the proper CLI command?
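
As a rough illustration of the Status-based check mentioned above, the sketch below decodes a status response body and reports whether the node has finished syncing. It assumes the JSON shape of the Tendermint RPC `/status` response (a `result.sync_info.catching_up` boolean); `statusResponse` and `isSynced` are hypothetical names, and the HTTP call itself is left out so the example stays self-contained.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// statusResponse models only the field of interest. The shape is an assumption
// based on the Tendermint RPC /status response.
type statusResponse struct {
	Result struct {
		SyncInfo struct {
			// A pointer distinguishes "field missing" from "false".
			CatchingUp *bool `json:"catching_up"`
		} `json:"sync_info"`
	} `json:"result"`
}

// isSynced reports whether a status body describes a node that is no longer catching up.
func isSynced(body []byte) (bool, error) {
	var s statusResponse
	if err := json.Unmarshal(body, &s); err != nil {
		return false, fmt.Errorf("decoding status response: %w", err)
	}
	catchingUp := s.Result.SyncInfo.CatchingUp
	if catchingUp == nil {
		return false, errors.New("status response has no sync_info.catching_up field")
	}
	return !*catchingUp, nil
}

func main() {
	samples := []string{
		`{"result":{"sync_info":{"catching_up":true}}}`,
		`{"result":{"sync_info":{"catching_up":false}}}`,
	}
	for _, body := range samples {
		synced, err := isSynced([]byte(body))
		fmt.Println(synced, err)
	}
}
```

A response that answers at all covers the 'alive' part; the `catching_up` flag is one candidate for the 'well' part, but whether it is a sufficient definition of "ready" is exactly the open question in this thread.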

@alexanderbez
Contributor

First, we must define exactly what the term "ready" means; then we can define the signaling/checking mechanism.

So what does "ready" mean? A block being produced?

@raynaudoe
Contributor

> So what does "ready" mean? A block being produced?

I guess, from a rosetta server POV, we could divide the 'readiness' status into two stages:

  • rosetta launch-time: at this stage rosetta expects the node to be 'reachable', meaning that rosetta can run a minimal query such as asking for the node's name, version and sync status. This should fail 'fast' IMO, but there are cases where a node's RPC server won't start before it loads its entire state, which could take a while. In the particular case of cosmos that shouldn't be a problem, because in the func startInProcess the services are started sequentially and rosetta is the last one to start(*). It is a different case when the RosettaCmd is added to any rootCmd: then we don't know when this command will be launched, and rosetta could try to connect to an RPC address that is not there yet.

  • rosetta online-time: at this stage we mostly care that the node is synced, a status that should be queryable through rosetta's endpoint /network/status. So any query related to 'getting the latest block' or 'asking for an account balance' should probably first check that the node is synced.

(*) There are cases, for example this issue, where rosetta will just fail after exhausting all retries, because the func client.Ready tries to get the genesis block from the blockstore on a node that is syncing using StateSync, and that will just fail. But we are fixing that one too on the Zondax fork (and also here, because we are trying to define the conditions under which the node is considered ready :) )
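
The two stages above could be sketched roughly like this: a fail-fast reachability check at launch, and a sync guard in front of online queries. Everything here is hypothetical and only for illustration (`nodeClient`, `checkLaunch`, `requireSynced`, `fakeNode`); none of it is taken from the rosetta or SDK code.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeClient is a hypothetical view of the node as seen from rosetta.
type nodeClient interface {
	Reachable() error       // launch-time: can a minimal query be answered?
	Syncing() (bool, error) // online-time: is the node still catching up?
}

var errNodeSyncing = errors.New("node is still syncing")

// checkLaunch is the fail-fast launch-time check: one minimal query, no sync requirement.
func checkLaunch(c nodeClient) error {
	if err := c.Reachable(); err != nil {
		return fmt.Errorf("node unreachable at launch: %w", err)
	}
	return nil
}

// requireSynced guards an online-time query: it runs query only if the node is synced.
func requireSynced(c nodeClient, query func() (string, error)) (string, error) {
	syncing, err := c.Syncing()
	if err != nil {
		return "", err
	}
	if syncing {
		return "", errNodeSyncing
	}
	return query()
}

// fakeNode is a stand-in for a real node, used only to exercise the sketch.
type fakeNode struct{ reachable, syncing bool }

func (f fakeNode) Reachable() error {
	if !f.reachable {
		return errors.New("connection refused")
	}
	return nil
}

func (f fakeNode) Syncing() (bool, error) { return f.syncing, nil }

func main() {
	latestBlock := func() (string, error) { return "latest block", nil }

	node := fakeNode{reachable: true, syncing: true}
	fmt.Println(checkLaunch(node)) // reachable, so launch succeeds even while syncing
	fmt.Println(requireSynced(node, latestBlock))

	node.syncing = false
	fmt.Println(requireSynced(node, latestBlock))
}
```

Keeping the two checks separate means a syncing node does not block launch, while balance and block queries are still refused until the node has caught up.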

@alexanderbez
Contributor

So what I would do is have Rosetta wait for a signal that states the node's RPC server has started and that syncing is false. Essentially a push mechanism. You could create a channel and pass that channel to the Rosetta object. The main server will push to this channel once the RPC server is (A) started, (B) is reachable, and (C) has syncing as false.
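
A minimal sketch of that push mechanism, under stated assumptions: the channel is a `chan struct{}` that the main server closes once conditions (A), (B) and (C) hold, and rosetta blocks on it with a timeout. The names `nodeState`, `signalWhenReady` and `waitReady` are hypothetical, and the polling inside `signalWhenReady` merely stands in for whatever the server actually knows about its own startup.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errReadyTimeout = errors.New("timed out waiting for node readiness")

// nodeState is a hypothetical snapshot of the three conditions named above.
type nodeState struct {
	RPCStarted bool // (A) the RPC server has started
	Reachable  bool // (B) it is reachable
	Syncing    bool // (C) must be false
}

func (s nodeState) ready() bool { return s.RPCStarted && s.Reachable && !s.Syncing }

// signalWhenReady runs on the server side: it closes ready once all conditions hold.
// Closing (rather than sending) lets any number of waiters observe the signal.
func signalWhenReady(state func() nodeState, interval time.Duration, ready chan<- struct{}) {
	for !state().ready() {
		time.Sleep(interval)
	}
	close(ready)
}

// waitReady runs on the rosetta side: it blocks until ready is closed or timeout elapses.
func waitReady(ready <-chan struct{}, timeout time.Duration) error {
	select {
	case <-ready:
		return nil
	case <-time.After(timeout):
		return errReadyTimeout
	}
}

func main() {
	ready := make(chan struct{})
	start := time.Now()

	// Simulated node that finishes syncing after about 20ms.
	state := func() nodeState {
		synced := time.Since(start) > 20*time.Millisecond
		return nodeState{RPCStarted: true, Reachable: true, Syncing: !synced}
	}
	go signalWhenReady(state, 5*time.Millisecond, ready)

	if err := waitReady(ready, time.Second); err != nil {
		fmt.Println("rosetta not started:", err)
		return
	}
	fmt.Println("node ready, starting rosetta")
}
```

With this shape rosetta never dials the node before the signal arrives, so no retry budget is needed on the rosetta side; the timeout only bounds how long startup may wait.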

mergify bot pushed a commit that referenced this issue Nov 4, 2022
### Description

Closes:
#13083
#11402
#10678
#12358
#10776
#12934
