refactor: async iterables #2547

alanshaw · 2019-10-17T11:36:24Z

~~This PR allows ipfsx to be used by calling IPFS.create(options) with { EXPERIMENTAL: { ipfsx: true } } options.~~

It adds a single API method add that returns an iterator that yields objects of the form { cid, path, size }. ~~The iterator is decorated with a first and last function so users can conveniently await on the first or last item to be yielded as per the proposal here~~.

In order to boot up a new ipfsx node I refactored the boot procedure to enable the following:

Remove the big stateful blob "self" - components are passed just the dependencies they need to operate. Right now it is opaque as to which components require which parts of an IPFS node without inspecting the entirety of the component's code. This change makes it easier to look at a component and know what aspects of the IPFS stack it uses and consequently allows us to understand which APIs should be available at which points of the node's lifecycle. It makes the code easier to understand, more maintainable and easier to mock dependencies for unit tests.
Restrict usage of subsystem APIs to appropriate lifecycle stage(s). This PR introduces an ApiManager that allows us to update the API that is exposed at any given point. It allows us to (for example) disallow ipfs.add before the node is initialized or access libp2p before the node is started. The lifecycle methods init, start and stop each define which API methods are available after they have run avoiding having to put boilerplate in every method to check if it can be called when the node is in a particular state. See What does it mean to start an IPFS node? #1438
Safer and more flexible API usage. The ApiManager allows us to temporarily change APIs to stop init from being called again while it is already running and has the facility to rollback to the previous API state if an operation fails. It also enables piggybacking so we don't attempt 2 or more concurrent start/stop calls at once. See state machine, transitions and events #1061 Error: Not able to stop from state: stopping #2257
Enable config changes at runtime. Having an API that can be updated during a node's lifecycle will enable this feature in the future.

FEEDBACK REQUIRED: The changes I've made here are a little...racy. They have a bunch of benefits, as I've outlined above but the ApiManager is implemented as a Proxy, allowing us to swap out the underlying API at will. How do y'all feel about that? Is there a better way or got a suggestion?

resolves #1438
resolves #1061
resolves #2257
resolves #2509
resolves #1670
refs ipfs-inactive/interface-js-ipfs-core#394

BREAKING CHANGE: IPFS.createNode removed

BREAKING CHANGE: IPFS is not a class that can be instantiated - use IPFS.create. An IPFS node instance is not an event emitter.

BREAKING CHANGE: Instance .ready property removed. Please use IPFS.create instead.

BREAKING CHANGE: Callbacks are no longer supported on any API methods. Please use a utility such as callbackify on API methods that return Promises to emulate previous behaviour.

BREAKING CHANGE: PeerId and PeerInfo classes are no longer statically exported from ipfs-http-client since they are no longer used internally.

BREAKING CHANGE: pin.add results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: pin.ls now returns an async iterable.

BREAKING CHANGE: pin.ls results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: pin.rm results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: add now returns an async iterable.

BREAKING CHANGE: add results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: addReadableStream, addPullStream have been removed.

BREAKING CHANGE: ls now returns an async iterable.

BREAKING CHANGE: ls results now contain a cid property (whose value is a CID instance) instead of a hash property.

BREAKING CHANGE: files.readPullStream and files.readReadableStream have been removed.

BREAKING CHANGE: files.read now returns an async iterable.

BREAKING CHANGE: files.lsPullStream and files.lsReadableStream have been removed.

BREAKING CHANGE: files.ls now returns an async iterable.

BREAKING CHANGE: files.ls results now contain a cid property (whose value is a CID instance) instead of a hash property.

BREAKING CHANGE: files.ls no longer takes a long option (in core) - you will receive all data by default.

BREAKING CHANGE: files.stat result now contains a cid property (whose value is a CID instance) instead of a hash property.

BREAKING CHANGE: get now returns an async iterable. The content property value for objects yielded from the iterator is now an async iterable that yields BufferList objects.

BREAKING CHANGE: stats.bw now returns an async iterable.

BREAKING CHANGE: addFromStream has been removed. Use add instead.

BREAKING CHANGE: addFromFs has been removed. Please use the exported globSource utility and pass the result to add. See the glob source documentation for more details and an example.

BREAKING CHANGE: addFromURL has been removed. Please use the exported urlSource utility and pass the result to add. See the URL source documentation for more details and an example.

BREAKING CHANGE: name.resolve now returns an async iterable. It yields increasingly more accurate resolved values as they are discovered until the best value is selected from the quorum of 16. The "best" resolved value is the last item yielded from the iterator. If you are interested only in this best value you could use it-last to extract it like so:

const last = require('it-last')
await last(ipfs.name.resolve('/ipns/QmHash'))

BREAKING CHANGE: block.rm now returns an async iterable.

BREAKING CHANGE: dht.findProvs, dht.provide, dht.put and dht.query now all return an async iterable.

BREAKING CHANGE: dht.findPeer, dht.findProvs, dht.provide, dht.put and dht.query now yield/return an object { id: CID, addrs: Multiaddr[] } instead of a PeerInfo instance(s).

BREAKING CHANGE: refs and refs.local now return an async iterable.

BREAKING CHANGE: object.data now returns an async iterable that yields Buffer objects.

BREAKING CHANGE: ping now returns an async iterable.

BREAKING CHANGE: repo.gc now returns an async iterable.

BREAKING CHANGE: swarm.peers now returns an array of objects with a peer property that is a CID, instead of a PeerId instance.

BREAKING CHANGE: swarm.addrs now returns an array of objects { id: CID, addrs: Multiaddr[] } instead of PeerInfo instances.

BREAKING CHANGE: block.stat result now contains a cid property (whose value is a CID instance) instead of a key property.

BREAKING CHANGE: bitswap.wantlist now returns an array of CID instances.

BREAKING CHANGE: bitswap.stat result has changed - wantlist and peers values are now an array of CID instances.

BREAKING CHANGE: the init option passed to the IPFS constructor will now not take any initialization steps if it is set to false. Previously, the repo would be initialized if it already existed. This is no longer the case. If you wish to initialize a node but only if the repo exists, pass init: { allowNew: false } to the constructor.

BREAKING CHANGE: removed file ls command from the CLI and HTTP API.

BREAKING CHANGE: Delegated peer and content routing modules are no longer included as part of core (but are still available if starting a js-ipfs daemon from the command line). If you wish to use delegated routing and are creating your node programmatically in Node.js or the browser you must npm install libp2p-delegated-content-routing and/or npm install libp2p-delegated-peer-routing and provide configured instances of them in options.libp2p. See the module repos for further instructions:

achingbrain · 2019-10-17T11:53:55Z

Much excite, will digest and comment. I am 👎 on extending iterators though, convenient that the functions are, I don't think we should mess with core language features.

alanshaw · 2019-10-17T13:15:14Z

N.B. this PR is NOT yet finished, I'm not sure it even runs and it has no tests, just looking for general feedback right now before I invest the last chunk of time to finish it.

I am 👎 on extending iterators though, convenient that the functions are, I don't think we should mess with core language features.

Can you elaborate? Many iterable objects in JavaScript have other functions available on them, is it really "messing" with core language features to return an iterable object that has it's own functions?

It's really beneficial to have a first/last function as it reduces boilerplate on a very common use case. I guess alternatively we could export a utility function from IPFS or mandate that people install and use their own...? I'm open to suggestions. Maybe a different API 😅?

lidel

Regarding reducing boilerplate related to first/last, I believe that key offenders are pretty specific:

Adding a single thing with ipfs.add and getting "array with one result" which needs to be unpacked.
Adding a directory tree with ipfs.add just to get the root CID (last item on the results array)

Personally I really like the uniformity of having first/last, but if its not the best way to do it, addressing those two cases I mentioned could get us mostly there.

src/core/errors.js

lidel · 2019-10-18T13:08:48Z

src/core/components-ipfsx/add/index.js

+    const iterator = pipe(
+      normaliseAddInput(source),
+      source => importer(source, ipld, opts),
+      transformFile(dag, opts),
+      preloadFile(preload, opts),
+      pinFile(pin, opts)
+    )


I really like pipelines like this one.
Easy to understand what needs to happen just by scanning only a few lines without going too deep into details.
Doing more of this would make it easier to onboard new contributors to the codebase.

daviddias · 2019-10-21T06:44:45Z

Remove the big stateful blob "self" - components are passed just the dependencies they need to operate.

👏🏽 👍🏽 👏🏽 👍🏽

Restrict APIs to appropriate lifecycle stage(s).

I believe the terms API and subsystem here is getting mixed. When a subsystem is not yet active, the API should not be exposed.

Safer and more flexible API usage.

Nice

Enable config changes at runtime.

🤔

Overall, this PR proposes a lot of really good refactors to the internals, I like where it is going :) Thank you!

alanshaw · 2019-10-21T09:33:57Z

I believe the terms API and subsystem here is getting mixed.

Yes this should have read something more like "Restrict usage of subsystem APIs to appropriate lifecycle stage(s)"

When a subsystem is not yet active, the API should not be exposed.

I think it's better to throw a meaningful error explaining why the API cannot be accessed rather than the user getting undefined is not a function. The former is what happens in this PR.

Overall, this PR proposes a lot of really good refactors to the internals, I like where it is going :) Thank you!

❤️

achingbrain · 2019-10-21T14:27:16Z

I think it's better to throw a meaningful error explaining why the API cannot be accessed rather than the user getting undefined is not a function

Always!

Can you elaborate?

Sure!

Many iterable objects in JavaScript have other functions available on them

The most obvious example that springs to mind is readable streams in node, but when this sort of thing happens I don't think the intention is for them to be consumed as streams and async iterators, rather streams or async iterators. So the interface a result presents should only do one thing and not leave the user in the position of:

Well, I got this Async Iterator from this module and I got this Async Iterator from that module, why can I only call this function on this one but not that one?

..when in fact they haven't iterated asyncly over the result at all.

Better to just let them pass the result to a function that extracts the first result or the last or the nth or whatever they need. Chances are they have a growing collection of modules for dealing with async iterators already anyway - piping, making things duplex, etc.

add section on error codes Ref. ipfs/js-ipfs#2547 (comment)

* Update CONTRIBUTING_JS.md add section on error codes Ref. ipfs/js-ipfs#2547 (comment) * Update CONTRIBUTING_JS.md Co-Authored-By: Teri Chadbourne <[email protected]> * docs: add Error Codes example License: MIT Signed-off-by: Marcin Rataj <[email protected]> * Update CONTRIBUTING_JS.md Co-Authored-By: Teri Chadbourne <[email protected]>

alanshaw · 2019-11-12T09:11:07Z

@achingbrain @hugomrdias any chance you can review this PR?

achingbrain

I am still 👎 on extending iterators as it encourages people to not iterate and if they're not iterating, maybe iterators are the wrong thing to return.

That said I'd like to start playing with this API so gogogogogogo!

This PR allows ipfsx to be used by calling `IPFS.create(options)` with `{ EXPERIMENTAL: { ipfsx: true } }` options. It adds a single API method `add` that returns an iterator that yields objects of the form `{ cid, path, size }`. The iterator is decorated with a `first` and `last` function so users can conveniently `await` on the first or last item to be yielded as per the [proposal here](https://github.com/ipfs-shipyard/ipfsx/blob/master/API.md#add). In order to boot up a new ipfsx node I refactored the boot procedure to enable the following: 1. **Remove the big stateful blob "`self`" - components are passed just the dependencies they need to operate.** Right now it is opaque as to which components require which parts of an IPFS node without inspecting the entirety of the component's code. This change makes it easier to look at a component and know what aspects of the IPFS stack it uses and consequently allows us to understand which APIs should be available at which points of the node's lifecycle. It makes the code easier to understand, more maintainable and easier to mock dependencies for unit tests. 1. **Restrict APIs to appropriate lifecycle stage(s).** This PR introduces an `ApiManager` that allows us to update the API that is exposed at any given point. It allows us to (for example) disallow `ipfs.add` before the node is initialized or access `libp2p` before the node is started. The lifecycle methods `init`, `start` and `stop` each define which API methods are available after they have run avoiding having to put boilerplate in every method to check if it can be called when the node is in a particular state. See #1438 1. **Safer and more flexible API usage.** The `ApiManager` allows us to temporarily change APIs to stop `init` from being called again while it is already running and has the facility to rollback to the previous API state if an operation fails. It also enables piggybacking so we don't attempt 2 or more concurrent start/stop calls at once. See #1061 #2257 1. **Enable config changes at runtime.** Having an API that can be updated during a node's lifecycle will enable this feature in the future. **FEEDBACK REQUIRED**: The changes I've made here are a little...racy. They have a bunch of benefits, as I've outlined above but the `ApiManager` is implemented as a `Proxy`, allowing us to swap out the underlying API at will. How do y'all feel about that? Is there a better way or got a suggestion? resolves #1438 resolves #1061 resolves #2257 refs #2509 refs #1670 License: MIT Signed-off-by: Alan Shaw <[email protected]>

alanshaw · 2019-12-10T11:40:28Z

Ok so I'm changing tact with this PR - people seem generally happy with the boot process approach and seeing as we now have a fully refactored ipfs-http-client with a new API and fully refactored interface-ipfs-core test suite that tests against the new API I'm going to use this PR as a base to PR against and bubble these changes in js-ipfs.

This means we won't have a EXPERIMENTAL flag for ipfsx - as I think this can get sorted pretty quickly.

PS. I've also removed the .first() and .last() iterator utils.

BREAKING CHANGE: `ipfs.add` results now contain a `cid` property (a [CID instance](https://github.com/multiformats/js-cid)) instead of a string `hash` property. refs ipfs-inactive/interface-js-ipfs-core#394

daviddias · 2019-12-13T08:31:48Z

All 👍🏽 for all the breaking changes, just unsure about:

Most users will fall flat on their faces by not having a convenient method. We've been there in the past, that is why we added these auxiliary methods in the first place.

alanshaw · 2019-12-13T13:52:30Z

Note taken. Just one question - was it problematic because there wasn't a way to do it without rolling your own?

Both js-ipfs and js-ipfs-http-client export urlSource and globSource to give users the power to perform what addFromURL and addFromFs were doing without expanding the API to deal with specific input types/locations.

This will be on the README here, but it should definitely also be in the interface-ipfs-core docs for ipfs.add. I'll make a note.

This keeps our API surface area small and will maybe encourage the community to create other *Source sources! - torrentSource, ftpSource?

Have I convinced you this is ok?

P.S. thank you for taking the time to review ❤️

* refactor: convert pubsub API to async/await * refactor: switch wording to not enabled

* refactor: convert object API to async/await * fix: object export

* refactor: convert swarm API to async/await * refactor: use libp2p.connections

daviddias · 2019-12-17T09:06:05Z

Oh #2547 (comment) nice! Looks super simple to me :) Thank you!

alanshaw · 2020-01-24T11:37:55Z

superseded by #2726

alanshaw requested review from lidel, hugomrdias and achingbrain October 17, 2019 11:36

lidel reviewed Oct 18, 2019

View reviewed changes

alanshaw mentioned this pull request Oct 21, 2019

Awesome Endeavour: Async Iterators #1670

Closed

lidel added a commit to ipfs/community that referenced this pull request Nov 4, 2019

Update CONTRIBUTING_JS.md

a56d710

add section on error codes Ref. ipfs/js-ipfs#2547 (comment)

lidel mentioned this pull request Nov 4, 2019

Add section on "Error Codes" to CONTRIBUTING_JS.md ipfs/community#445

Merged

achingbrain approved these changes Nov 15, 2019

View reviewed changes

alanshaw mentioned this pull request Dec 9, 2019

⚡️ v0.41.0 RELEASE 🚀 #2656

Closed

57 tasks

alanshaw force-pushed the feat/ipfsx-boot-and-add branch from 0e27751 to 5f44bca Compare December 9, 2019 21:04

alanshaw changed the title ~~feat: EXPERIMENTAL ipfsx API - boot procedure and add API method~~ refactor: async iterators Dec 10, 2019

Alan Shaw added 3 commits December 10, 2019 11:08

refactor: standard errors and remove first/last utils

8deed6a

fix: use standard errors

4f9ba62

alanshaw marked this pull request as ready for review December 10, 2019 11:41

Alan Shaw added 2 commits December 10, 2019 11:45

refactor: no separate boot procedure

d8e82b7

fix: convert config API as it used by ipfsd-ctl

cbdf9e6

alanshaw force-pushed the feat/ipfsx-boot-and-add branch from b2dfbc6 to cbdf9e6 Compare December 10, 2019 11:45

This was referenced Dec 11, 2019

refactor: convert id API to async/await #2665

Merged

refactor: convert object API to async/await #2666

Merged

refactor: return CID instance from ipfs.add

21e62cb

BREAKING CHANGE: `ipfs.add` results now contain a `cid` property (a [CID instance](https://github.com/multiformats/js-cid)) instead of a string `hash` property. refs ipfs-inactive/interface-js-ipfs-core#394

alanshaw changed the title ~~refactor: async iterators~~ refactor: async iterables Dec 12, 2019

alanshaw requested a review from dirkmc December 12, 2019 15:33

Alan Shaw added 2 commits December 13, 2019 17:38

refactor: convert ping API to async/await (#2671)

0921a82

refactor: convert pubsub API to async/await (#2672)

78f361e

* refactor: convert pubsub API to async/await * refactor: switch wording to not enabled

alanshaw mentioned this pull request Dec 16, 2019

Update ipfs-utils to the latest version 🚀 #2655

Closed

alanshaw added awesome endeavour exp/expert Having worked on the specific codebase is important P0 Critical: Tackled by core team ASAP labels Dec 16, 2019

Alan Shaw added 5 commits December 16, 2019 17:01

refactor: convert bitswap API to async/await (#2659)

1449103

refactor: convert id API to async/await (#2665)

20cbd6d

refactor: convert version API to async/await (#2679)

ed5e9ef

refactor: convert object API to async/await (#2666)

920e399

* refactor: convert object API to async/await * fix: object export

refactor: convert swarm API to async/await (#2681)

7e932b1

* refactor: convert swarm API to async/await * refactor: use libp2p.connections

alanshaw mentioned this pull request Jan 14, 2020

refactor: async await roundup #2683

Closed

alanshaw closed this Jan 24, 2020

alanshaw deleted the feat/ipfsx-boot-and-add branch January 24, 2020 11:38

alanshaw mentioned this pull request Feb 10, 2020

refactor: remove proxy api object and detect initialisation state #2762

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: async iterables #2547

refactor: async iterables #2547

alanshaw commented Oct 17, 2019 •

edited

Loading

achingbrain commented Oct 17, 2019

alanshaw commented Oct 17, 2019

lidel left a comment

lidel Oct 18, 2019

daviddias Oct 22, 2019

daviddias commented Oct 21, 2019

alanshaw commented Oct 21, 2019

achingbrain commented Oct 21, 2019

alanshaw commented Nov 12, 2019

achingbrain left a comment

alanshaw commented Dec 10, 2019 •

edited

Loading

daviddias commented Dec 13, 2019

alanshaw commented Dec 13, 2019 •

edited

Loading

daviddias commented Dec 17, 2019

alanshaw commented Jan 24, 2020

refactor: async iterables #2547

refactor: async iterables #2547

Conversation

alanshaw commented Oct 17, 2019 • edited Loading

achingbrain commented Oct 17, 2019

alanshaw commented Oct 17, 2019

lidel left a comment

Choose a reason for hiding this comment

lidel Oct 18, 2019

Choose a reason for hiding this comment

daviddias Oct 22, 2019

Choose a reason for hiding this comment

daviddias commented Oct 21, 2019

alanshaw commented Oct 21, 2019

achingbrain commented Oct 21, 2019

alanshaw commented Nov 12, 2019

achingbrain left a comment

Choose a reason for hiding this comment

alanshaw commented Dec 10, 2019 • edited Loading

daviddias commented Dec 13, 2019

alanshaw commented Dec 13, 2019 • edited Loading

daviddias commented Dec 17, 2019

alanshaw commented Jan 24, 2020

alanshaw commented Oct 17, 2019 •

edited

Loading

alanshaw commented Dec 10, 2019 •

edited

Loading

alanshaw commented Dec 13, 2019 •

edited

Loading