
Mainstay doc #64

Draft · wants to merge 2 commits into base: main
Conversation

tomt1664

Initial mainstay service integration proposal

@nicosey nicosey requested a review from ariard September 27, 2023 16:47

@ariard ariard left a comment


I think the main question is adapting the mainstay flow to a non-HTTP interface (i.e. mainly the Nostr WebSocket) and how the flow execution is distributed between independent components.

From my understanding the flow is the following:

  • the mainstay service registers as a civkit service with civkitd “PURPLE”
  • a civkit client wishes to notarize a published market order XYZ
  • the civkit client sends XYZ wrapped as a nostr event to civkitd “PURPLE”
  • civkitd forwards XYZ to the mainstay server hosted as a civkit service
  • the mainstay server attests to XYZ and, when suitable, commits XYZ to the bitcoin chain
  • once the XYZ commitment tx is included in the bitcoin chain, the mainstay server forwards the XYZ proof back to civkitd “PURPLE”
  • civkitd “PURPLE” stores the XYZ proof according to the service-level agreement and makes it available to all clients
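The steps above can be sketched as a toy, in-process simulation. Every name here (MainstayService, CivkitRelay, the bare-hash "proof") is an illustrative assumption, not an actual civkit-node API:

```python
# Toy sketch of the notarization flow: client event -> relay -> mainstay
# service -> attestation -> proof stored back on the relay.
import hashlib

class MainstayService:
    """Attests to payloads and later returns a proof (here: a bare hash)."""
    def __init__(self):
        self.pending = []

    def attest(self, payload: bytes) -> str:
        commitment = hashlib.sha256(payload).hexdigest()
        self.pending.append(commitment)
        return commitment

    def finalize(self) -> dict:
        # Stand-in for "commit to the bitcoin chain when suitable":
        # batch all pending commitments under one root.
        root = hashlib.sha256("".join(sorted(self.pending)).encode()).hexdigest()
        proofs = {c: {"merkle_root": root, "commitment": c} for c in self.pending}
        self.pending = []
        return proofs

class CivkitRelay:
    """civkitd "PURPLE": forwards events to the service, stores proofs."""
    def __init__(self, service: MainstayService):
        self.service = service
        self.proof_store = {}

    def submit_event(self, event_content: str) -> str:
        return self.service.attest(event_content.encode())

    def receive_proofs(self, proofs: dict):
        self.proof_store.update(proofs)

relay = CivkitRelay(MainstayService())
c1 = relay.submit_event("market order XYZ")
relay.receive_proofs(relay.service.finalize())
assert c1 in relay.proof_store
```

The point of the sketch is the separation of roles: the relay only forwards and stores, while attestation and finalization stay inside the service.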

A few of the main advantages of this architecture:

  • proof storage can be replicated over many civkitd relays
  • downtime of the mainstay server does not prevent clients from accessing proofs
  • multiple civkitd relays self-hosted by the mainstay server operator can be deployed to absorb large-scale client demand
  • good fault tolerance and low migration cost in case some civkitd relays are disruptive / disrupted


Assumptions:

Mainstay service is available over an HTTP interface (or via a SOCKS5 Tor proxy).
ariard (Contributor):

I think the question to ask is what types of clients the mainstay integration aims to serve during the first few rollouts. Off the top of my head, I believe the main targets are Nostr clients (including civkit-sample), the civkit marketd service (notarizing all the trade orders received) and, on a longer-term scale, LSPs / Lightning delegated infrastructure (e.g. watchtowers).

If we’re considering those clients as the priority, realistically the interfaces to prioritize are the following:

  • (unauthenticated / unencrypted) websocket over tcp
  • bolt8’s noise connection over tcp

Those are already work-in-progress in civkit-node.

With regard to communications between civkit-notaryd (i.e. either a mainstay service proxy or one of its main running processes) and civkitd, there is a tonic interface (civkitservices) using gRPC over HTTP/2.

Assumptions:

Mainstay service is available over an HTTP interface (or via a SOCKS5 Tor proxy).
Mainstay service is available and funded with a valid `token_id` for verification.
ariard (Contributor):

My understanding: the client buys a “publication slot” with a bitcoin payment, gets a credential, and can then redeem the service at any time in the future (within the max service-policy time window) with cleartext credentials and an identifier. The identifier allows binding between the credential redemption payload and the protocol-specific request.

This identifier can be the valid token_id mentioned here.

Note this is matching the issuance / redemption flow of the staking credential framework:
https://github.com/civkit/staking-credentials-spec/blob/main/60-staking-credentials-archi.md#credentials-issuance

The token_id can be the service_id implemented for ServiceDeliveranceRequest / ServiceDeliveranceResult here:
https://github.com/civkit/staking-credentials/blob/main/staking-credentials/src/common/msgs.rs#L119

tomt1664 (Author):

Yes, this makes sense.


Mainstay service is available over an HTTP interface (or via a SOCKS5 Tor proxy).
Mainstay service is available and funded with a valid `token_id` for verification.
Funding (via LN payment) is performed in advance and out of band for the subscription (i.e. the `token_id` has already been issued).
ariard (Contributor):

The funding can happen through the “issuance” protocol flow of staking credentials mentioned above. Pay-per-usage or subscription can be defined as a service policy, though for privacy-preserving reasons, if subscription is opted into, new credentials / tokens should be refreshed for every unit of service delivered.

A user A should not be distinguishable from a user B based on their service consumption pattern (ideally).

tomt1664 (Author):

This is a fundamental issue with mainstay (or any proof-of-publication mechanism or service). The commitments must be provably unique in a given publication space, and so user A must have exclusive access to their own publication space (i.e. 'slot' in mainstay), necessitating user credentials. The credentials can be updated, but the identification of the publication space they are linked to cannot be - the service will always know it's the same user posting commitments to the same slot.

But I don't think there is an issue with this privacy-wise. The user can blind the commitments themselves if required, and store the blinding nonces with the proofs for verification.
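The client-side blinding mentioned here can be as simple as hashing the data together with a random nonce and storing the nonce alongside the proof. A minimal sketch (function names are illustrative):

```python
import hashlib
import os

def blind_commit(data: bytes, nonce=None):
    """Commit to data under a random 32-byte nonce; the nonce must be
    stored with the proof to allow later verification."""
    if nonce is None:
        nonce = os.urandom(32)
    commitment = hashlib.sha256(data + nonce).hexdigest()
    return commitment, nonce

def verify_blind_commit(data: bytes, nonce: bytes, commitment: str) -> bool:
    # Recompute the hash; without the nonce, the service learns nothing
    # about the committed data from the commitment alone.
    return hashlib.sha256(data + nonce).hexdigest() == commitment

c, n = blind_commit(b"market order XYZ")
assert verify_blind_commit(b"market order XYZ", n, c)
assert not verify_blind_commit(b"tampered order", n, c)
```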

ariard (Contributor):

Okay, I see the provably-unique requirement in a proof-of-publication, though on the exclusive access I wonder if a user signature (and therefore possession of a secret key) could be included in the commitment scope. If there is duplication or equivocation of a publication space, it can be disregarded at both the client and server level. If my understanding of the proof-of-publication space is correct.

Otherwise yes, credentials can be re-used indefinitely by the user: the service provider binds a slot at the first credential redemption and allows re-use of it.

Mainstay service is available over an HTTP interface (or via a SOCKS5 Tor proxy).
Mainstay service is available and funded with a valid `token_id` for verification.
Funding (via LN payment) is performed in advance and out of band for the subscription (i.e. the `token_id` has already been issued).
Mainstay proofs are stored and made available, but verification against `bitcoind` and the staychain occurs separately.
ariard (Contributor):

One of the nice advantages of the current civkitd architecture is that there is a separate logic in charge of the disk operations, NoteProcessor, which aims to handle storage for “hosted” civkit services (such as a civkit-notaryd or civkit-marketd instance). In the future, if it becomes its own process, it could run independently on the civkit service side rather than on civkitd.

One advantage of dissociating the mainstay service from the backend storage is enabling the replication of storage over multiple civkitd node instances for redundancy.

Storage service requirements will need to be agreed on, as they can become a source of denial-of-service.

I think it’s good that verification against bitcoind and the staychain occurs on the client side, and proofs are just fetched by clients when they need them.

Lastly, I believe it would be very valuable to have standardization of the mainstay proofs; that way they can be consumed by e.g. the civkit-sample scoring / reputation engine to rank market board services.

tomt1664 (Author):

OK, yes.

The mainstay model as it currently works is:

  • the user creates commitments to data
  • the user sends the commitments to the mainstay service API
  • the user queries mainstay periodically for commitment status; once the root is committed to a confirmed bitcoin transaction, the user queries the mainstay API for both the TxID of the root commitment and the proof (path)
  • the user stores the TxIDs and proofs locally

The user can then choose either to just trust the mainstay service provider that the tx is confirmed and the proof is valid and done correctly, and simply keep the data in case it is needed for a future dispute, OR to verify the commitment against bitcoind once it is received.

In the current service, verification is handled by the pymainstay client.

The proof format (i.e. a single slot proof) returned by the API is currently like:

```
{
    "attestation":
    {
        "merkle_root": "f46a58a0cc796fade0c7854f169eb86a06797ac493ea35f28dbe35efee62399b",
        "txid": "38fa2c6e103673925aaec50e5aadcbb6fd0bf1677c5c88e27a9e4b0229197b13",
        "confirmed": true,
        "inserted_at": "16:06:41 23/01/19"
    },
    "merkleproof":
    {
        "position": 1,
        "merkle_root": "f46a58a0cc796fade0c7854f169eb86a06797ac493ea35f28dbe35efee62399b",
        "commitment": "5555c29bc4ac63ad3aa4377d82d40460440a67f6249b463453ca6b451c94e053",
        "ops": [
        {
            "append": false,
            "commitment": "21b0a66806bdc99ac4f2e697d05cb17c757ae10deb851ee869830d617e4f519c"
        },
        {
            "append": true,
            "commitment": "622d1b5efe11e9031f1b25aac11587e0ff81a37e9565ded16ee8e82bbc0c2fc1"
        },
        {
            "append": true,
            "commitment": "406ab5d975ae922753fad4db83c3716ed4d2d1c6a0191f8336c76000962f63ba"
        }]
    }
}
```

A chain of these (along with the data sequence) gives the full history / publication proof.
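A slot proof of this shape can be checked by folding the `ops` path from the commitment up to the `merkle_root`. The sketch below is a minimal illustration, assuming single SHA256 over raw bytes and that `"append": true` means the sibling hash is concatenated on the right; the actual mainstay hashing rules (e.g. double-SHA256 or a different padding convention) may differ, so treat this as the shape of the verification, not a drop-in verifier.

```python
import hashlib

def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def verify_slot_proof(proof: dict) -> bool:
    """Fold the ops path from the slot commitment up to the merkle root.

    Assumption: '"append": true' means the sibling commitment is
    concatenated on the right (sha256(acc || sibling)), false means
    on the left. Real mainstay rules may differ.
    """
    acc = bytes.fromhex(proof["commitment"])
    for op in proof["ops"]:
        sibling = bytes.fromhex(op["commitment"])
        acc = sha256(acc + sibling) if op["append"] else sha256(sibling + acc)
    return acc.hex() == proof["merkle_root"]
```

Under these assumptions, checking one such proof per attestation, plus the committed data sequence, reproduces the full publication history.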

ariard (Contributor):

Okay, the mainstay model is quite simple and I think it fits well into the civkit service framework.

There is just a relay (i.e. civkitd) added as an intermediary between the user and the mainstay service API. Multiple relays can be used to front-load or duplicate proof storage.

It’s good to have proofs that can be queried from the service or by the client (in case of service unavailability).

The mainstay proof format is simple, which is good.

```rust
// … (truncated snippet quoted from the PR diff)
    position: u64,
    token: String,
}
```
ariard (Contributor):

I think one key element which sounds missing from the mainstay service is a long-term pubkey, ideally a public key on Bitcoin’s secp256k1 curve; see the introduction of https://github.com/lightning/bolts/blob/master/08-transport.md#bolt-8-encrypted-and-authenticated-transport

I think the url can stay, and in the future it could be announced in the civkit service gossip periodically issued by civkitd to announce itself to the rest of the network, see https://github.com/lightning/bolts/blob/master/07-routing-gossip.md#the-node_announcement-message

It is unclear what a slot index will be, i.e. where in a batched mainstay proof this client’s proof is inserted. The authentication token or credential is assumed to be dynamic thanks to the issuance flow. Another field that could be added is the list of “mainstay” features supported, though this can become more sophisticated later, I think.

tomt1664 (Author):

OK - so the long term pubkey is to receive messages via tcp (as opposed to an onion address).

The slot index is unique to a user/client. It is assigned by the mainstay service when a user first pays. The slot index cannot change for a single proof-of-publication.

ariard (Contributor):

OK - so the long term pubkey is to receive messages via tcp (as opposed to an onion address)

In fact both, see BOLT4 on how pubkey is used for onion routing: https://github.com/lightning/bolts/blob/master/04-onion-routing.md

Understood, the slot index is unique to a user/client.

```rust
// … (truncated snippet quoted from the PR diff)
};
```

`commitment` is a 32-byte value encoded as a 64-character hex string.
ariard (Contributor):

Ideally the payload can be defined as a new tlv_stream (see https://github.com/lightning/bolts/blob/master/01-messaging.md#type-length-value-format) to allow future backward-compatible addition of new fields to existing message types. Then this tlv_stream can be added as the content of a nostr EVENT, signed by the client, and forwarded to civkitd.

I think it’s a bit of protocol hacking, though in the future this allows nice things, like leveraging the nostr tag field to have “mempool”-like semantics for relay messages, or extracting the tlv_record to be wrapped as an onion and routed accordingly.
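As a shape for that idea, here is a minimal sketch of a BOLT1-style tlv_stream encoder (BigSize varints, records serialized in increasing type order). The type numbers used below (0 for slot position, 2 for the commitment) are hypothetical, not an agreed mainstay TLV assignment:

```python
def bigsize(n: int) -> bytes:
    """BOLT1 BigSize variable-length integer encoding."""
    if n < 0xfd:
        return n.to_bytes(1, "big")
    if n <= 0xffff:
        return b"\xfd" + n.to_bytes(2, "big")
    if n <= 0xffffffff:
        return b"\xfe" + n.to_bytes(4, "big")
    return b"\xff" + n.to_bytes(8, "big")

def tlv_stream(records: dict) -> bytes:
    """Serialize records as type / length / value, in strictly
    increasing type order as BOLT1 requires."""
    out = b""
    for t in sorted(records):
        value = records[t]
        out += bigsize(t) + bigsize(len(value)) + value
    return out

# Hypothetical type numbers: 0 = slot position (u64), 2 = 32-byte commitment.
payload = tlv_stream({
    2: bytes.fromhex("55" * 32),
    0: (5).to_bytes(8, "big"),
})
```

New optional fields can later be appended under fresh type numbers without breaking old readers, which is the backward-compatibility property argued for above.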

doc/mainstay.md Outdated

Initially assume every event will be committed to the mainstay service endpoint.

It may be more efficient to compress several events into a single commitment and then only commit every `commitment_interval`.
ariard (Contributor):

I think as soon as compression is desired, the trade-offs of the compression format have to be weighed, as they bear on storage / retrieval efficiency / robustness / cost.
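The `commitment_interval` batching quoted above can be sketched as follows, using a plain hash-of-concatenation for the batch (a Merkle tree would additionally give per-event inclusion proofs); all names here are illustrative:

```python
import hashlib

class IntervalCommitter:
    """Buffer event hashes and emit a single batched commitment every
    `commitment_interval` events, instead of one commitment per event."""
    def __init__(self, commitment_interval: int):
        self.interval = commitment_interval
        self.buffer = []
        self.commitments = []

    def add_event(self, event: bytes):
        self.buffer.append(hashlib.sha256(event).digest())
        if len(self.buffer) >= self.interval:
            self.flush()

    def flush(self):
        # One commitment covers the whole buffered batch.
        if self.buffer:
            batch = hashlib.sha256(b"".join(self.buffer)).hexdigest()
            self.commitments.append(batch)
            self.buffer = []
```

The trade-off mentioned above shows up directly: a larger interval means fewer on-chain commitments but a larger data set that must be retained to verify each one.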

doc/mainstay.md Outdated

The node will construct commitments from specified *events* () in `src/events.rs`.

The commitment can be simply constructed from the sha256 hash of each event (encoded as a string) similar to:
ariard (Contributor):

I think effectively it is best to have an API receiving a String (which can be the sha256 of a nostr event or a bolt11 invoice) and then building an attestation.

I think it would be valuable to specify the data format of the attestation, i.e. what is included beyond the sha256, e.g. block hash / timestamp and a service counter-signature. I understand a mainstay proof is the attestation + the “included-in-the-chain” proof-of-publication.
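The sha256-of-an-encoded-event construction discussed here can be sketched as follows; canonical JSON is just one illustrative choice of string encoding (the doc only says the event is encoded as a string):

```python
import hashlib
import json

def event_commitment(event: dict) -> str:
    """Commit to an event via the sha256 of a canonical string encoding.

    Canonical JSON (sorted keys, no whitespace) keeps the hash
    reproducible when the event is later re-created from DB values.
    """
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

c1 = event_commitment({"kind": 1, "content": "order XYZ"})
c2 = event_commitment({"content": "order XYZ", "kind": 1})
assert c1 == c2  # key order does not change the commitment
```

Whatever encoding is chosen, it must be deterministic, otherwise a commitment made at publication time cannot be re-derived at verification time.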

doc/mainstay.md Outdated

## Commitment construction

The node will construct commitments from specified *events* () in `src/events.rs`.
ariard (Contributor):

I think it should be noted that nostr events are just a data communication transport, and ideally the attestation could scope a more generic data payload. Note the events in src/events.rs are civkitd’s internal events, even if they partially overlap with Nostr ones.

tomt1664 (Author):

OK - so we need to define precisely what it is that needs to be committed and proven in a dispute. There's no reason this can't be everything that is stored permanently by the node?

ariard (Contributor):

There's no reason this can't be everything that is stored permanently by the node?

I don’t get your question exactly: are you assuming everything is stored forever by the node? I think you get denial-of-service if proofs can be freely stored or freely queried by clients. Even with a subscription, there is a need for a data limit.

tomt1664 (Author):

The question is the general one of what data needs to be committed. So all events are saved in the db when write_new_event_db is called - is anything else saved in the db apart from events? Strictly, with a proof-of-publication, you should be able to verify everything that has a commitment made from it - so anything you include in a commitment needs to be stored for as long as you want to be able to prove history. So I was meaning that if all events are saved by default indefinitely, then we commit to each one. Or a subset of these events?

To verify, all data that formed the commitment needs to be available. Should we just hash Event or DbEvent objects? Seems they can be recreated exactly from the values inserted into the DB?

ariard (Contributor):

The question is the general one of what data needs to be committed. So all events are saved in the db when write_new_event_db is called - is anything else saved in the db apart from events?

Other elements to be saved in the DB:

  • client
  • peers
  • event subscriptions

So I was meaning that if all events are saved by default indefinitely, then we commit to each one

Yes, I think so too, though note the trade-off in terms of disk denial-of-service. E.g. I ask you to store a proof-of-publication with no time limit and no regular payment. It’s less an issue with what data is saved than with client-server interactions.

To verify, all data that formed the commitment needs to be available. Should we just hash Event or DbEvent objects?

I think we can just hash Event; DbEvent is a superset, or at least you should be able to recreate the event from the db data elements (not right now, though in the future yes).

```rust
// … (truncated snippet quoted from the PR diff)
}
```

This will need to be stored in a new DB table corresponding to events.
ariard (Contributor):

I think this assumes that the mainstay server periodically looks at bitcoind to get the list of confirmed txids, and when a target “notarization_tx” (the commitment / anchor naming is already widely used in lightning parlance) has been included, the proof is finalized by the mainstay server and shared back to civkitd for storage and retrieval by clients.
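The polling flow described here can be sketched as follows; `get_confirmations` stands in for a bitcoind RPC lookup (e.g. via gettransaction) and is stubbed below, and the 6-confirmation threshold is an illustrative policy, not a mainstay requirement:

```python
MIN_CONFIRMATIONS = 6  # illustrative finality policy

def finalize_confirmed(pending: dict, get_confirmations) -> dict:
    """One polling pass: move proofs whose notarization tx has enough
    confirmations out of `pending` into the finalized set, ready to be
    shared back to civkitd for storage."""
    finalized = {}
    for txid in list(pending):
        if get_confirmations(txid) >= MIN_CONFIRMATIONS:
            finalized[txid] = pending.pop(txid)
    return finalized

# Stubbed "bitcoind": a dict of txid -> confirmation count.
confs = {"txid_a": 7, "txid_b": 2}
pending = {"txid_a": "proof_a", "txid_b": "proof_b"}
done = finalize_confirmed(pending, lambda txid: confs.get(txid, 0))
assert done == {"txid_a": "proof_a"}
assert "txid_b" in pending  # still waiting on confirmations
```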

tomt1664 (Author):

Yes - the mainstay server can just be queried for txids and proofs as they become available. Checking against bitcoind is only required if you want to check it has all been included correctly.

ariard (Contributor):

Good.

@tomt1664

I think the main question is adapting the mainstay flow to a non-HTTP interface (i.e. mainly the Nostr WebSocket) and how the flow execution is distributed between independent components.

From my understanding the flow is the following:

  • the mainstay service registers as a civkit service with civkitd “PURPLE”
  • a civkit client wishes to notarize a published market order XYZ
  • the civkit client sends XYZ wrapped as a nostr event to civkitd “PURPLE”
  • civkitd forwards XYZ to the mainstay server hosted as a civkit service
  • the mainstay server attests to XYZ and, when suitable, commits XYZ to the bitcoin chain
  • once the XYZ commitment tx is included in the bitcoin chain, the mainstay server forwards the XYZ proof back to civkitd “PURPLE”
  • civkitd “PURPLE” stores the XYZ proof according to the service-level agreement and makes it available to all clients

Yes, makes sense. By "mainstay server forwards back XYZ proof", you mean the mainstay server should publish (and send via civkitd) all proofs? It may be much more efficient for the mainstay server to simply publish the whole Merkle tree for each tx, and all users can extract their own proofs from it.
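The publish-the-whole-tree option can be sketched like this: the server publishes all tree levels, and each client extracts only its own path. Pairing the last node of an odd level with itself is one common convention; mainstay's actual padding rule may differ:

```python
import hashlib

def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def build_tree(leaves):
    """Full Merkle tree as a list of levels, leaves first.
    An odd trailing node is paired with itself (one common convention)."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = [sha256(cur[i] + cur[i + 1 if i + 1 < len(cur) else i])
               for i in range(0, len(cur), 2)]
        levels.append(nxt)
    return levels

def extract_proof(levels, index):
    """Client-side: pull only your own path from the published tree.
    'append' True means the sibling concatenates on the right."""
    ops = []
    for level in levels[:-1]:
        sib = index ^ 1
        sib = sib if sib < len(level) else index
        ops.append({"append": index % 2 == 0, "commitment": level[sib].hex()})
        index //= 2
    return ops
```

Publishing the tree once and letting clients run `extract_proof` locally replaces N per-user proof deliveries with a single broadcast.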

A few of the main advantages of this architecture:

  • proof storage can be replicated over many civkitd relays
  • downtime of the mainstay server does not prevent clients from accessing proofs
  • multiple civkitd relays self-hosted by the mainstay server operator can be deployed to absorb large-scale client demand
  • good fault tolerance and low migration cost in case some civkitd relays are disruptive / disrupted

OK, I see how this can be better for retrieval and distribution of proofs.


ariard commented Sep 29, 2023

By "mainstay server forwards back XYZ proof", you mean the mainstay server should publish (and send via civkitd) all proofs?

I think there is one deployment option where the XYZ proof is shared back from the mainstay server to all reachable civkitd relays if you want high availability of the proofs. Flooding the proofs by default to all clients really depends on what the data content of the proof is. If you’re notarizing your market board order flow, you might flood them by default, though it’s really application-dependent. It’s good to have both mechanisms for flexibility, I think.


ariard commented Sep 29, 2023

I forked the BIP repo under the civkit org in case we want to start sketching out bips for the mainstay proof format: #64 (comment) It’s kind of a background priority, though it can be nice for designing future civkit scoring engines.


ariard commented Dec 13, 2023

I think this can be re-worked after #110 and #109, if more mainstay documentation should be added.
