Skip unactionable providers in check #70

Closed
2color opened this issue Sep 27, 2024 · 17 comments · Fixed by #72
Labels
dif/medium Prior experience is likely helpful effort/days Estimated to take multiple days, but less than a week P1 High: Likely tackled by core team if no one steps up

Comments

@2color
Member

2color commented Sep 27, 2024

Problems

  • With feat: add ipni checks #66, we now get providers from the IPNI. However, the Check backend only supports probing peers that support Bitswap, since transport-graphsync-filecoinv1 and transport-ipfs-gateway-http are unsupported by Boxo, Kubo, and Rainbow.
  • The IPNI sometimes returns duplicate providers (see screenshot).
  • Since we limit the number of providers we test to 10, we often end up in a situation where we get multiple unactionable providers from the IPNI and don't test any providers from the DHT (because it's usually slower to respond): the loop exits once the max has been reached, irrespective of the source of providers.

Screenshot 2024-09-27 at 1 00 16 PM

Example

Potential solutions

  • Limit the number of providers from each source (DHT/IPNI) so that we always try to get providers from both
  • Skip providers that only support transport-graphsync-filecoinv1 and transport-ipfs-gateway-http
  • Deduplicate providers from the IPNI so that we only test distinct PeerIDs once
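
A minimal sketch of how the three ideas above could combine in a single selection pass. The `Provider` type, the `Source` labels, and the `selectProviders` helper are hypothetical and only for illustration; they are not the ipfs-check types. The sketch also assumes that a record without protocol hints (typical for DHT results) implies Bitswap.

```go
package main

import "fmt"

// Provider is a hypothetical, simplified provider record (not the real ipfs-check type).
type Provider struct {
	ID        string   // PeerID
	Protocols []string // e.g. "transport-bitswap"; empty means no protocol hint
	Source    string   // "DHT" or "IPNI"
}

// actionable reports whether a Bitswap-only backend could probe p.
// Assumption: records without protocol hints imply Bitswap.
func actionable(p Provider) bool {
	if len(p.Protocols) == 0 {
		return true
	}
	for _, proto := range p.Protocols {
		if proto == "transport-bitswap" {
			return true
		}
	}
	return false
}

// selectProviders keeps at most perSource distinct, actionable providers
// from each source, preserving arrival order.
func selectProviders(found []Provider, perSource int) []Provider {
	seen := make(map[string]bool)  // PeerIDs already selected
	count := make(map[string]int)  // providers selected per source
	var out []Provider
	for _, p := range found {
		if !actionable(p) || seen[p.ID] || count[p.Source] >= perSource {
			continue
		}
		seen[p.ID] = true
		count[p.Source]++
		out = append(out, p)
	}
	return out
}

func main() {
	found := []Provider{
		{ID: "peerA", Protocols: []string{"transport-graphsync-filecoinv1"}, Source: "IPNI"},
		{ID: "peerB", Protocols: []string{"transport-bitswap"}, Source: "IPNI"},
		{ID: "peerB", Protocols: []string{"transport-bitswap"}, Source: "IPNI"},
		{ID: "peerC", Source: "DHT"},
	}
	for _, p := range selectProviders(found, 5) {
		fmt.Println(p.Source, p.ID)
	}
}
```

Counting per source rather than globally means a burst of IPNI results cannot exhaust the budget before slower DHT results arrive.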
@aschmahmann
Contributor

Skip providers that only support transport-graphsync-filecoinv1 and transport-ipfs-gateway-http

That this isn't happening is IIUC a bug, or at least inconsistent, behavior in boxo related to the ContentRouting interface wrapper for routing-v1 https://github.com/ipfs/boxo/blob/4af06fdc16292155a6ecb0dafa2a8c7d082a2b64/routing/http/contentrouter/contentrouter.go#L116.

  • It used to only support Bitswap, but now supports the PeerSchema. However, supporting the PeerSchema is likely not a good idea as most users cannot benefit from it and the PeerSchema should also be checked for Bitswap if the Protocols field is present.
    • Note: this brings up an interesting case around what happens if proxying through delegated routing both provider records where the data transfer protocol is unknown and where it is known. Perhaps we should introduce a convention for a protocol called "unknown" so that if a user only wants peers that speak a given protocol (e.g. Bitswap) they can use either that protocol name or "unknown".

@aschmahmann
Contributor

Deduplicate providers from the IPNI so that we only test distinct PeerIDs once

@gammazero I thought IPNI's (or at least cid.contact's) /routing/v1 endpoints were supposed to deduplicate identical responses instead of sending one-per-contextID as happens on the IPNI-specific endpoints (e.g. /cid) and that this used to happen. Am I mistaken?

Even if they're supposed to it might still make sense to handle deduplication in boxo to protect from buggy servers.

@gammazero
Contributor

@aschmahmann I thought that they were removing duplicates. Maybe it is not happening due to how streaming results are handled. Investigating.

@2color
Member Author

2color commented Sep 27, 2024

It used to only support Bitswap, but now supports the PeerSchema. However, supporting the PeerSchema is likely not a good idea as most users cannot benefit from it and the PeerSchema should also be checked for Bitswap if the Protocols field is present.

I thought the point of PeerSchema is to allow for flexibility without requiring explicit code changes.

If we only allow PeerRecords with Protocols containing transport-bitswap here, this would cause someguy to only return bitswap providers, which would defeat the point of ipfs/specs#484.

Note: this brings up an interesting case around what happens if proxying through delegated routing both provider records where the data transfer protocol is unknown and where it is known. Perhaps we should introduce a convention for a protocol called "unknown" so that if a user only wants peers that speak a given protocol (e.g. Bitswap) they can use either that protocol name or "unknown".

We did exactly that in ipfs/specs#484, where we use unknown as a special "protocol name" to return provider peers from the DHT where bitswap is implied.
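
To make the "unknown" convention concrete, here is a rough sketch of the matching rule as I understand it from the description above: a record passes if it advertises any protocol in the filter, and a record with no protocol list passes only when the filter includes "unknown". The `matchesFilter` helper is a hypothetical name for illustration and is not taken from boxo or someguy.

```go
package main

import "fmt"

// matchesFilter reports whether a provider record with the given protocols
// passes the filter. An empty filter lets every record through; the special
// name "unknown" matches records that carry no protocol list at all.
func matchesFilter(protocols, filter []string) bool {
	if len(filter) == 0 {
		return true
	}
	for _, want := range filter {
		if want == "unknown" && len(protocols) == 0 {
			return true
		}
		for _, have := range protocols {
			if have == want {
				return true
			}
		}
	}
	return false
}

func main() {
	filter := []string{"unknown", "transport-bitswap"}

	// DHT-style record with no protocol hints.
	fmt.Println(matchesFilter(nil, filter))
	// IPNI record that advertises Bitswap.
	fmt.Println(matchesFilter([]string{"transport-bitswap"}, filter))
	// IPNI record that only advertises Filecoin graphsync.
	fmt.Println(matchesFilter([]string{"transport-graphsync-filecoinv1"}, filter))
}
```

With this filter the first two records pass and the graphsync-only record is dropped.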

@gammazero
Contributor

Just to confirm, in the results above, the CID bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4 returns results for a number of peers where the schema type is "peer". This is what IPNI now returns when the schema type is unknown.

See also: ipni/indexstar#147

@gammazero
Contributor

gammazero commented Sep 27, 2024

Deduplication should take place here: https://github.com/ipni/indexstar/blob/main/delegated_translator.go#L110-L137

However, if the advertised metadata is different for the different records but everything else is the same, then the records will still look like duplicates in ipfs-check because it is not showing that the metadata is different.

It is common for the same provider to store the same CID in multiple deals, and therefore with only different metadata. The metadata values may have the same protocols, but still be different.
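
A small illustration of why such records survive deduplication, assuming the comparison covers the whole record including metadata (I have not checked the exact fields indexstar compares). The `Record` type, both key functions, and the metadata placeholders are made up for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// Record is a hypothetical, simplified routing record.
type Record struct {
	ID        string
	Addrs     []string
	Protocols []string
	Metadata  string // placeholder values here; base64 in real responses
}

// fullKey treats two records as equal only if every field, metadata included, matches.
func fullKey(r Record) string {
	return strings.Join([]string{
		r.ID,
		strings.Join(r.Addrs, ","),
		strings.Join(r.Protocols, ","),
		r.Metadata,
	}, "|")
}

// peerKey treats two records as equal whenever they name the same peer.
func peerKey(r Record) string { return r.ID }

// dedupe keeps the first record seen for each key, preserving order.
func dedupe(records []Record, key func(Record) string) []Record {
	seen := make(map[string]struct{})
	var out []Record
	for _, r := range records {
		k := key(r)
		if _, dup := seen[k]; dup {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, r)
	}
	return out
}

func main() {
	addrs := []string{"/ip4/192.0.2.1/tcp/4001"} // documentation address, not a real provider
	protos := []string{"transport-graphsync-filecoinv1"}
	records := []Record{
		{ID: "peerA", Addrs: addrs, Protocols: protos, Metadata: "deal-1"},
		{ID: "peerA", Addrs: addrs, Protocols: protos, Metadata: "deal-2"},
		{ID: "peerA", Addrs: addrs, Protocols: protos, Metadata: "deal-1"},
	}
	fmt.Println("full-record key:", len(dedupe(records, fullKey)))
	fmt.Println("PeerID key:", len(dedupe(records, peerKey)))
}
```

Keyed on the full record, the two deals remain separate entries; keyed on PeerID alone, they collapse to one, which is what a tool that only probes peers needs.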

@gammazero
Contributor

@2color Does routerClient.FindProvidersAsync need a parameter to specify which schema types to include in results?

@aschmahmann
Contributor

aschmahmann commented Sep 27, 2024

I thought the point of PeerSchema is to allow for flexibility without requiring explicit code changes.

If we only allow PeerRecords with Protocols containing transport-bitswap here, this would cause someguy to only return bitswap providers, which would defeat the point of ipfs/specs#484.

Inside of boxo there are two different APIs used for content routing:

In practice there is AFAIK no code that uses the ContentRouting interface and handles Bitswap + another protocol, so it seems fine to use that wrapper to handle just Bitswap / unknown rather than also shuffling say graphsync through it only for it to result in wasted connections for all real consumers.

So when we want to start checking other protocols besides bitswap we can use the libp2p content routing interface wrapper, since we'd want to surface the protocols anyway.

  • Note: The ContentRouting interface from go-libp2p will over time likely be insufficient anyway since it doesn't convey protocol hints (even though those could get stashed in the peerstore), and doesn't handle non-libp2p data sources (e.g. if we support webseeds-like HTTP retrieval without requiring HTTP PeerID Auth)

@aschmahmann
Contributor

aschmahmann commented Sep 27, 2024

However, if the advertised metadata is different for the different records but everything else is the same, then the records will still look like duplicates in ipfs-check because it is not showing that the metadata is different.

Updated: @gammazero it looks like you're right. The metadata is different for graphsync-filecoinv1 (IIRC it contains the piece CID info). This will go away once we filter out the graphsync data though.

https://cid.contact/cid/bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4

{
  "MultihashResults": [
    {
      "Multihash": "EiBKWqAmrrNEhR0fKqBTZtkobWSilby7owIOhZ7tD9Wdpw==",
      "ProviderResults": [
        {
          "ContextID": "AXESIEK12TUpsBSIBZA78Tad7HPd4xku8EhJyhbhPDVQEvkD",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAglY1YQlUd4B4zVErJKM/0y9HRcPQKcBekn/nyt3QfTQ9sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWHKChM2uYi4EXREaCGtaxersCsp7hbFiMqMUK8o7CgV6Q",
            "Addrs": [
              "/ip4/72.52.65.166/tcp/26101"
            ]
          }
        },
        {
          "ContextID": "AXESIKpYVbeP7qniTSPt8QwXFo8LmL3RryTIm9sK7gQk+u2N",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESILmGFsZPVCJsLnWU8wvhhcgbILO1UUUlgzTnTmYfrPR2",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESINzi+Iqhx5cA3v83f4Ktlbq1ji8dQ5YpECfAsxagzSFs",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESIJo2bVKzbxliWAq6nrmv/fPbMX1buntk756RXwa0tWyS",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESIHtQGFnBStB/xCW5Q3U6NGGgt0HjiTKeS5kA/Bqp9Azk",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESICgE5q2z1cYBSSQtQboAdQunuZiDxRYj6f3xF/Jwcv4G",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESIEM4NOyL3ZwcCjuzCTI0SD30wtWV0VB8rL5WsYlg9Zk2",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESIEPQd1HYFrKSc34PKn6oy4aNMy7CnTlKeMwckBwir/Dw",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESIG9yNTy7y0pky6k8TSyZeGy1pTlo7iA2ZbYNODeuSxn3",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESIPcSMkeiE579t24XOVIruzS1zOJ6AAN8tv3VjLYYWAuO",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
            "Addrs": [
              "/ip4/45.141.104.43/tcp/11337"
            ]
          }
        },
        {
          "ContextID": "AXESIJkFRaucIyzXiKtw4n18bb/tM8h6AQpu40Wcgbjj5Gc6",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWEHx9v2SGvFewuatTvzzGxQC1PQZPRer2ys8fYipiM1zi",
            "Addrs": [
              "/ip4/185.7.192.36/tcp/24001"
            ]
          }
        },
        {
          "ContextID": "AXESIORM2nP4DYNOM3zyMCxkC8xkuWEUiCyik8LfFccZwHLD",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgw6NyxJBpxR8WCVYQfv4FYYspUh+7n4vDT5/QPyWr0QVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWEHx9v2SGvFewuatTvzzGxQC1PQZPRer2ys8fYipiM1zi",
            "Addrs": [
              "/ip4/185.7.192.36/tcp/24001"
            ]
          }
        },
        {
          "ContextID": "AXESIObjfsWDWI1EGts/Dpq27po9Y56nuPYcbcZZEmposQzT",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgII35EFH9oiKqROD/HdP1GL1ay/aP0+HvTGmz4dITQgVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWEHx9v2SGvFewuatTvzzGxQC1PQZPRer2ys8fYipiM1zi",
            "Addrs": [
              "/ip4/185.7.192.36/tcp/24001"
            ]
          }
        },
        {
          "ContextID": "AXESIMhVzlP0+2Z4mIx+Emc0mmqFPHkY7J3KRLAyLNDMM38u",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgw6NyxJBpxR8WCVYQfv4FYYspUh+7n4vDT5/QPyWr0QVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
            "Addrs": [
              "/ip4/209.151.233.210/tcp/23789"
            ]
          }
        },
        {
          "ContextID": "AXESIIvo0TazAO6FjYRUtPMA1cv/pDT1C5nRN4ZB2Qpjh3iV",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
            "Addrs": [
              "/ip4/209.151.233.210/tcp/23789"
            ]
          }
        },
        {
          "ContextID": "AXESIB610tLWUa/o4ZOmZy63Umtk/quWj2i45EClo/ncYDo0",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgII35EFH9oiKqROD/HdP1GL1ay/aP0+HvTGmz4dITQgVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
            "Addrs": [
              "/ip4/212.6.53.27/tcp/24002"
            ]
          }
        },
        {
          "ContextID": "AXESIB610tLWUa/o4ZOmZy63Umtk/quWj2i45EClo/ncYDo0",
          "Metadata": "gBI=",
          "Provider": {
            "ID": "12D3KooWFRyJZPuJjAVeSMBjYwBXYhQykKfHPfAVHnfTmac4hPo1",
            "Addrs": [
              "/ip4/212.6.53.27/tcp/8888"
            ]
          }
        },
        {
          "ContextID": "AXESIB610tLWUa/o4ZOmZy63Umtk/quWj2i45EClo/ncYDo0",
          "Metadata": "oBIA",
          "Provider": {
            "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
            "Addrs": [
              "/ip4/212.6.53.27/tcp/80/http"
            ]
          }
        },
        {
          "ContextID": "AXESIE3lMsTkv45ef0I5fgIDamI5RXzhzie4yP18Pby8PUAu",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
            "Addrs": [
              "/ip4/212.6.53.27/tcp/24002"
            ]
          }
        },
        {
          "ContextID": "AXESIE3lMsTkv45ef0I5fgIDamI5RXzhzie4yP18Pby8PUAu",
          "Metadata": "gBI=",
          "Provider": {
            "ID": "12D3KooWFRyJZPuJjAVeSMBjYwBXYhQykKfHPfAVHnfTmac4hPo1",
            "Addrs": [
              "/ip4/212.6.53.27/tcp/8888"
            ]
          }
        },
        {
          "ContextID": "AXESIE3lMsTkv45ef0I5fgIDamI5RXzhzie4yP18Pby8PUAu",
          "Metadata": "oBIA",
          "Provider": {
            "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
            "Addrs": [
              "/ip4/212.6.53.27/tcp/80/http"
            ]
          }
        },
        {
          "ContextID": "AXESIItZ0k0ThGuxzvQunTGkswh+Qu3QFcqg0LNJLFNsMoSI",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgw6NyxJBpxR8WCVYQfv4FYYspUh+7n4vDT5/QPyWr0QVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWJ8YAF6DiRxrzcxoeUVjSANYxyxU55ruFgNvQB4EHibpG",
            "Addrs": [
              "/ip4/212.6.53.28/tcp/24002",
              "/ip6/2a10:2080::28/tcp/24002"
            ]
          }
        },
        {
          "ContextID": "AXESIItZ0k0ThGuxzvQunTGkswh+Qu3QFcqg0LNJLFNsMoSI",
          "Metadata": "gBI=",
          "Provider": {
            "ID": "12D3KooWRPVDpphaViorF5ZSRzQbNbQUhdPkmnoGhWiXr3HDhg1E",
            "Addrs": [
              "/ip4/212.6.53.28/tcp/8888"
            ]
          }
        },
        {
          "ContextID": "AXESIItZ0k0ThGuxzvQunTGkswh+Qu3QFcqg0LNJLFNsMoSI",
          "Metadata": "oBIA",
          "Provider": {
            "ID": "12D3KooWJ8YAF6DiRxrzcxoeUVjSANYxyxU55ruFgNvQB4EHibpG",
            "Addrs": [
              "/ip4/212.6.53.28/tcp/80/http"
            ]
          }
        },
        {
          "ContextID": "AXESIB1u7/J+EoLNq6HfMElooG0RiqbIQGHRXwjWLbGQceKu",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgII35EFH9oiKqROD/HdP1GL1ay/aP0+HvTGmz4dITQgVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
            "Addrs": [
              "/ip4/209.151.233.210/tcp/23789"
            ]
          }
        },
        {
          "ContextID": "WjJsa09pOHZabWxzWldKaGMyVXZRV1IyWlhKMGFYTmxiV1Z1ZEM4ek5EUTROREU=",
          "Metadata": "gBI=",
          "Provider": {
            "ID": "12D3KooWGtYkBAaqJMJEmywMxaCiNP7LCEFUAFiLEBASe232c2VH",
            "Addrs": [
              "/dns4/bitswap.filebase.io/tcp/443/wss"
            ]
          }
        },
        {
          "ContextID": "AXESICW+MhFHda5WHfBDD7BGAGoF/yr4q4Uxa4mnuW1qsvQo",
          "Metadata": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAglY1YQlUd4B4zVErJKM/0y9HRcPQKcBekn/nyt3QfTQ9sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
          "Provider": {
            "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
            "Addrs": [
              "/ip4/209.151.233.210/tcp/23789"
            ]
          }
        }
      ]
    }
  ]
}

https://cid.contact/routing/v1/providers/bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4

{
  "Providers": [
    {
      "Addrs": [
        "/ip4/72.52.65.166/tcp/26101"
      ],
      "ID": "12D3KooWHKChM2uYi4EXREaCGtaxersCsp7hbFiMqMUK8o7CgV6Q",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAglY1YQlUd4B4zVErJKM/0y9HRcPQKcBekn/nyt3QfTQ9sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/45.141.104.43/tcp/11337"
      ],
      "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAg/Pmxek7JA0WYspyz9HWcqEcyy65olDmsauKmkR2k6w5sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/45.141.104.43/tcp/11337"
      ],
      "ID": "12D3KooWKGCcFVSAUXxe7YP62wiwsBvpCmMomnNauJCA67XbmHYj",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/185.7.192.36/tcp/24001"
      ],
      "ID": "12D3KooWEHx9v2SGvFewuatTvzzGxQC1PQZPRer2ys8fYipiM1zi",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/185.7.192.36/tcp/24001"
      ],
      "ID": "12D3KooWEHx9v2SGvFewuatTvzzGxQC1PQZPRer2ys8fYipiM1zi",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgw6NyxJBpxR8WCVYQfv4FYYspUh+7n4vDT5/QPyWr0QVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/185.7.192.36/tcp/24001"
      ],
      "ID": "12D3KooWEHx9v2SGvFewuatTvzzGxQC1PQZPRer2ys8fYipiM1zi",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgII35EFH9oiKqROD/HdP1GL1ay/aP0+HvTGmz4dITQgVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/209.151.233.210/tcp/23789"
      ],
      "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgw6NyxJBpxR8WCVYQfv4FYYspUh+7n4vDT5/QPyWr0QVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/209.151.233.210/tcp/23789"
      ],
      "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/212.6.53.27/tcp/24002"
      ],
      "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgII35EFH9oiKqROD/HdP1GL1ay/aP0+HvTGmz4dITQgVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/212.6.53.27/tcp/8888"
      ],
      "ID": "12D3KooWFRyJZPuJjAVeSMBjYwBXYhQykKfHPfAVHnfTmac4hPo1",
      "Protocols": [
        "transport-bitswap"
      ],
      "Schema": "peer",
      "transport-bitswap": "gBI="
    },
    {
      "Addrs": [
        "/ip4/212.6.53.27/tcp/80/http"
      ],
      "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
      "Protocols": [
        "transport-ipfs-gateway-http"
      ],
      "Schema": "peer",
      "transport-ipfs-gateway-http": "oBIA"
    },
    {
      "Addrs": [
        "/ip4/212.6.53.27/tcp/24002"
      ],
      "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgbhxbMz8lI1/T4HaSPP81psg2FpKO3YR5T2Cp/XIS9CxsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/212.6.53.28/tcp/24002",
        "/ip6/2a10:2080::28/tcp/24002"
      ],
      "ID": "12D3KooWJ8YAF6DiRxrzcxoeUVjSANYxyxU55ruFgNvQB4EHibpG",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgw6NyxJBpxR8WCVYQfv4FYYspUh+7n4vDT5/QPyWr0QVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/ip4/212.6.53.28/tcp/8888"
      ],
      "ID": "12D3KooWRPVDpphaViorF5ZSRzQbNbQUhdPkmnoGhWiXr3HDhg1E",
      "Protocols": [
        "transport-bitswap"
      ],
      "Schema": "peer",
      "transport-bitswap": "gBI="
    },
    {
      "Addrs": [
        "/ip4/212.6.53.28/tcp/80/http"
      ],
      "ID": "12D3KooWJ8YAF6DiRxrzcxoeUVjSANYxyxU55ruFgNvQB4EHibpG",
      "Protocols": [
        "transport-ipfs-gateway-http"
      ],
      "Schema": "peer",
      "transport-ipfs-gateway-http": "oBIA"
    },
    {
      "Addrs": [
        "/ip4/209.151.233.210/tcp/23789"
      ],
      "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgII35EFH9oiKqROD/HdP1GL1ay/aP0+HvTGmz4dITQgVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    },
    {
      "Addrs": [
        "/dns4/bitswap.filebase.io/tcp/443/wss"
      ],
      "ID": "12D3KooWGtYkBAaqJMJEmywMxaCiNP7LCEFUAFiLEBASe232c2VH",
      "Protocols": [
        "transport-bitswap"
      ],
      "Schema": "peer",
      "transport-bitswap": "gBI="
    },
    {
      "Addrs": [
        "/ip4/209.151.233.210/tcp/23789"
      ],
      "ID": "12D3KooWNTSFywHjGbmGN1aEqJNp54pDKaiwqpthRrvFZETo37pW",
      "Protocols": [
        "transport-graphsync-filecoinv1"
      ],
      "Schema": "peer",
      "transport-graphsync-filecoinv1": "kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAglY1YQlUd4B4zVErJKM/0y9HRcPQKcBekn/nyt3QfTQ9sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="
    }
  ]
}

Notice:

  1. ...M1zi is in the IPNI response 3 times
  2. In the IPNI responses the protocol data is identical, but the filecoin graphsync metadata is not (iirc it contains pieceCID info of the deal it's associated with)
  3. It's also in the routing-v1 response 3 times
  4. In routing-v1 response the protocol data is also identical
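
For anyone who wants to inspect the Metadata values above, here is a short sketch that decodes the base64 and reads the leading unsigned varint. It assumes (my understanding of the IPNI metadata layout, not something stated in this thread) that each value starts with the multicodec code of the transport, and the name table reflects the multicodec entries as I know them. `protocolCode` is a hypothetical helper name.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
)

// transportNames maps multicodec codes to the protocol names used by routing-v1.
// Assumption: these are the registered codes for the three transports seen above.
var transportNames = map[uint64]string{
	0x0900: "transport-bitswap",
	0x0910: "transport-graphsync-filecoinv1",
	0x0920: "transport-ipfs-gateway-http",
}

// protocolCode decodes a base64 Metadata value and returns its leading unsigned varint.
func protocolCode(metadata string) (uint64, error) {
	raw, err := base64.StdEncoding.DecodeString(metadata)
	if err != nil {
		return 0, err
	}
	code, n := binary.Uvarint(raw)
	if n <= 0 {
		return 0, fmt.Errorf("metadata does not start with a valid varint")
	}
	return code, nil
}

func main() {
	samples := []string{
		"gBI=",
		"oBIA",
		"kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAglY1YQlUd4B4zVErJKM/0y9HRcPQKcBekn/nyt3QfTQ9sVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q==",
	}
	for _, md := range samples {
		code, err := protocolCode(md)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("0x%04x %s\n", code, transportNames[code])
	}
}
```

The bytes after the varint are the transport-specific payload, which is where the graphsync records differ from one another.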

@2color
Member Author

2color commented Sep 30, 2024

Just to confirm, in the results above, the CID bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4 returns results for a number of peers where the schema type is peer. This is what IPNI now returns when the schema type is unknown.
See also: ipni/indexstar#147

That's useful to know. Does cid.contact return any other schemas from its delegated routing endpoint? Or only PeerSchema now?

Deduplication should take place here: https://github.com/ipni/indexstar/blob/main/delegated_translator.go#L110-L137

However, if the advertised metadata is different for the different records but everything else is the same, then the records will still look like duplicates in ipfs-check because it is not showing that the metadata is different.

It is common for the same provider to store the same CID in multiple deals, and therefore with only different metadata. The metadata values may have the same protocols, but still be different.

So from the perspective of the IPNI, it sounds like you don't consider this a bug. Moreover, based on @aschmahmann's response:

This will go away once we filter out the graphsync data though.

IIUC @aschmahmann, your suggestion is to:

  • filter out graphsync providers in the delegated routing client
  • Forgo changing the contentrouter in boxo to accept filtering parameters.

What I'm still trying to understand from the discussion above is what the following looks like in practice:

So when we want to start checking other protocols besides bitswap we can use the libp2p content routing interface wrapper, since we'd want to surface the protocols anyway.

Currently, the wrapper implementation is built to satisfy the go-libp2p routing interface. Does this mean that as soon as we start supporting other protocols, i.e. gateway retrieval, in Boxo, we will need to evolve the wrapper in contentrouter anyways in a way that breaks compatibility with the go-libp2p routing interfaces?

@aschmahmann
Contributor

Does this mean that as soon as we start supporting other protocols, i.e. gateway retrieval, in Boxo, we will need to evolve the wrapper in contentrouter anyways in a way that breaks compatibility with the go-libp2p routing interfaces?

Maybe. There are a few ways to make the changes, but the end result is that we'll need an abstraction that allows passing more information from the content routing system into the data retrieval system. Maybe that looks like the interface we already have for delegated routing, maybe it looks like something else; we'll have to see how it goes as we start building.

@2color
Member Author

2color commented Oct 1, 2024

Update:

@gammazero gammazero added the P1 High: Likely tackled by core team if no one steps up label Oct 1, 2024
@lidel lidel added dif/medium Prior experience is likely helpful effort/days Estimated to take multiple days, but less than a week labels Oct 1, 2024
@lidel
Member

lidel commented Oct 1, 2024

Triage note:

  • we should(?) default check.ipfs.network tool to filtering based on boxo#678 → update: done in feat: filter bitswap providers in ipni client #72
    • namely, skip unactionable (in IPFS Mainnet) providers by default (only ask for IPIP-484's WithProtocolFilter("unknown","transport-bitswap", "transport-ipfs-gateway-http")) + give user ability to remove / adjust filtering as opt-in

@2color
Member Author

2color commented Oct 1, 2024

  • we should(?) default check.ipfs.network tool to filtering based on boxo#587

Are you referring to boxo#587 or ipfs/boxo#678?

WithProtocolFilter("unknown","transport-bitswap", "transport-ipfs-gateway-http")

Currently we use Vole to conduct the bitswap check. What should we use for providers with transport-ipfs-gateway-http?

  • give user ability to remove / adjust filtering as opt-in

How is this useful presuming that we only support transport-bitswap?

@lidel
Member

lidel commented Oct 1, 2024

I meant ipfs/boxo#678, sorry, we did not notice #72 before.

Gist of triage discussion was that ipfs-check should filter out useless things for IPFS Mainnet by default (done in #72), but allow people to disable filtering via advanced settings (like they can change cid.contact to different endpoint) – this is not a blocker, but a nice to have.

What should we use for providers with transport-ipfs-gateway-http?

If we want to probe HTTP, we can make a trustless gateway HEAD request with Accept: application/vnd.ipld.raw and ?format=raw
https://specs.ipfs.tech/http-gateways/path-gateway/#only-if-cached-head-behavior
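
A minimal sketch of building such a probe request in Go. It only constructs the request and does not send it; the gateway host is a placeholder and `newRawBlockProbe` is a hypothetical helper, not something that exists in ipfs-check.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newRawBlockProbe builds a HEAD request for /ipfs/{cid}?format=raw on the
// given gateway and asks for the raw block via the Accept header.
func newRawBlockProbe(gateway, cid string) (*http.Request, error) {
	target := strings.TrimRight(gateway, "/") + "/ipfs/" + url.PathEscape(cid) + "?format=raw"
	req, err := http.NewRequest(http.MethodHead, target, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/vnd.ipld.raw")
	return req, nil
}

func main() {
	// gateway.example is a placeholder host, not a real provider.
	req, err := newRawBlockProbe(
		"https://gateway.example",
		"bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4",
	)
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("Accept:", req.Header.Get("Accept"))
}
```

A real check would send the request with an http.Client that has a timeout and treat a 200 response as evidence that the provider can serve the block.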

How is this useful presuming that we only support transport-bitswap?

Mostly educational about what is mandatory, and what is optional at protocol level.
I think it would be healthy if we supported transport-ipfs-gateway-http as second protocol, just to avoid people assuming and hardcoding bitswap everywhere.

For example, https://delegated-ipfs.dev/routing/v1/providers/bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi has providers that use transport-ipfs-gateway-http, and these are used by https://inbrowser.link when they are found and are reachable over HTTPS.

Ability to adjust filters from GUI maintains open system where new protocols can be added by people in ecosystem without being blessed by us, PLN, or anyone else. For example, someone may start announcing custom protocol on routing systems like IPNI, and use our tooling for querying / debugging.

@2color
Member Author

2color commented Oct 2, 2024

I think it would be healthy if we supported transport-ipfs-gateway-http as second protocol, just to avoid people assuming and hardcoding bitswap everywhere.

I agree.

Ability to adjust filters from GUI maintains open system where new protocols can be added by people in ecosystem without being blessed by us, PLN, or anyone else. For example, someone may start announcing custom protocol on routing systems like IPNI, and use our tooling for querying / debugging.

I agree that we should keep things open ended and configurable.

Though realistically, getting new protocols to actually work in all existing tooling requires coordination, and I don't think it's within our ability to completely remove our need for coordination. Just as an example, even just adding support for HTTP gateway retrievals requires a lot of coordinated changes, and it still doesn't currently work because support for it has yet to be implemented.

--

Since this issue was mostly concerned with fixing open bugs that arose by having providers from the IPNI, I have opened a separate issue to discuss http gateway checks: #73

@2color 2color closed this as completed in #72 Oct 7, 2024
@2color
Member Author

2color commented Oct 7, 2024

I can confirm that the original example has been fixed in production.
