I'm really not happy baking this into IPLD. `{'/': ...}` was a hack to get JSON working.
This will need some strong arguments/motivations.
Links.md
Outdated
A codec may represent object types and tree structures any way it wishes.
"s/codec/format"?
Links.md
Outdated
etc) or even new custom serializations. We will refer to this as the
**representation**.

Therefore, a **format** is the standardized representation of IPLD Links and Paths in a given **representation**.
Currently, the "format" describes how to translate between structured data and binary.
# Canonical Link Representation

Codec **serializers** MUST reserve the following canonical
representation of link encoding. The canonical representation is an object with a single key of `"/"` and a base encoded string of the link's CID.
We originally said that no objects can have slashes in keys (keys must be valid path components) but backed off when we realized that wasn't going to work. At this point, I'm not sure if we can introduce a restriction like this. CBOR objects definitely can have a single "/" key.
We really do need to sit down and think through what can and can't go into an IPLD object because I think we're getting closer and closer to "everything goes". That might be fine but we need to address it explicitly...
Also note, this was never intended to be the canonical representation. It was a hack to get JSON working.
I was somewhat aware of the history. But I think that we do need some form of canonical representation that can be represented in pure JSON in order to open a path for people to encode objects from one format to another.
However, I don't like, and am actively trying to change in `dag-cbor`, the default use of the canonical representation in the deserializer. It's a horrible pain to work with and, while I want to reserve it for interop, I don't want it to be in common use but instead buried in the implementations.
> We originally said that no objects can have slashes in keys (keys must be valid path components) but backed off when we realized that wasn't going to work

It is worth noting that CIDs encoded in base64 will have slashes in them. I am wondering whether it was really a good idea to allow that encoding; it will mess up paths like `/ipfs/<base 64 cid>/file.txt`.
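To make the concern about base64 and paths concrete, here is a small sketch. The bytes are not a real CID; they are made-up values chosen so that the standard base64 alphabet emits `/`, and the `segments` helper is a hypothetical stand-in for whatever path splitting a resolver does. It assumes a Node.js version whose `Buffer` supports the `base64url` encoding.

```javascript
// Made-up bytes (NOT a real CID), picked so standard base64 contains "/".
const bytes = Buffer.from([0xff, 0xff, 0xff, 0x01, 0x02, 0x03])

const b64 = bytes.toString('base64')       // standard alphabet: may contain "/" and "+"
const b64url = bytes.toString('base64url') // URL-safe alphabet: uses "_" and "-" instead

// Hypothetical path splitter: break on "/" and drop empty components.
const segments = (path) => path.split('/').filter(Boolean)

const brokenPath = `/ipfs/${b64}/file.txt`
const safePath = `/ipfs/${b64url}/file.txt`

// With standard base64 the identifier itself is split apart,
// so the second component no longer equals the encoded value.
console.log(b64, segments(brokenPath))
console.log(b64url, segments(safePath))
```

With the URL-safe alphabet the identifier survives as a single path component, which is the property a `/ipfs/<cid>/file.txt` style path relies on.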
So, it feels like this is trying to work around the fact that js-ipld has a single In Go, we have typed nodes.
I'm not actually thinking much about the What I'm mostly thinking about is how to define interop between implementations, specifically
What I've tried to do is leave the door totally open to people doing this kind of stuff in the serializer/deserializer. What I'm not comfortable with is defining the types that must be used in a particular language or serializer/deserializer. There are a whole lot of preferences and opinions I'd prefer to just not step on or potentially exclude. I really don't like working with the

For instance:

`let node = dagJSON.from(block || buffer)`
`let transcoded = dagSomeFormat.serialize(JSON.parse(dagJSON.stringify(node)))`

That only works if we have some canonical representation each serializer/deserializer has reserved. If we want to just completely give up on that, we can, but we won't have a good way to transcode nodes.
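A minimal sketch of the transcoding flow described above, under stated assumptions: `Link`, `stringify`, `toLinks`, and `otherCodec` are hypothetical names invented for illustration and are not the actual `dag-json` or `dag-cbor` APIs. The point is only that the target serializer can recover links from plain JSON types because it has reserved the single-key `"/"` form.

```javascript
// Hypothetical in-memory link type; real implementations use their own CID class.
class Link {
  constructor (cid) { this.cid = cid }
}

// The reserved form: an object whose only key is "/" with a string value.
const isCanonicalLink = (v) =>
  v !== null && typeof v === 'object' && !Array.isArray(v) &&
  Object.keys(v).length === 1 && typeof v['/'] === 'string'

// Walk plain data and replace every reserved form with a Link instance.
function toLinks (value) {
  if (isCanonicalLink(value)) return new Link(value['/'])
  if (Array.isArray(value)) return value.map(toLinks)
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, toLinks(v)])
    )
  }
  return value
}

// Source side (hypothetical): emit links in the reserved form as plain JSON.
const stringify = (node) =>
  JSON.stringify(node, (key, v) => (v instanceof Link ? { '/': v.cid } : v))

// Target side (hypothetical): the serializer accepts the reserved form as input.
// A real codec would go on to produce bytes; here we stop at the typed node.
const otherCodec = { serialize: (plain) => toLinks(plain) }

const node = { name: 'example', child: new Link('<cid string>') }
const transcoded = otherCodec.serialize(JSON.parse(stringify(node)))
console.log(transcoded.child instanceof Link)
```

Nothing here dictates which in-memory type a codec uses for links; the only shared contract in the sketch is the shape that passes through `JSON.parse`.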
Pushed some fixes for the other comments. I also removed the yaml example because I find that it just complicates the messaging. The purpose of the form reservation isn't for expressing in the DSL but for expression in code between codecs.
FWIW, on that front: I've been playing with some fresh takes on go-ipld APIs in a little sandbox off to the side, and one of the ideas I'm playing with that might have merit turned up these ideas:
Now, like I said, that's just in a little toy experiment somewhere, and I don't actually know if it's a good idea. But maybe it's interesting food for thought, as another example of how {the way we operate on the data} versus {the codec we use for the data and hashing it} can be distinct. /2c, I'll go back to lurking now :)
I don't think that @Stebalien and @diasdavid are in alignment about the future of the canonical JSON representation. @diasdavid could you please weigh in so that we can move forward?
Overall I'm on board, but I would like to see some examples to ensure that we and our future selves are on the same page.
implementation of `dag-json` includes a method called `stringify()` which
returns a standard JSON string with links encoded in the canonical format.
This makes trans-encoding of nodes into other formats much easier since
they are required to accept the canonical format.
@mikeal can you add a few examples to this RFC that show how objects with links will be serialized and deserialized (and then again serialized and deserialized) by the dag-json and dag-cbor formats?
It will provide a ton of clarity to implementers and users on what the expected behavior is and how dag-json differs from dag-cbor and from plain JavaScript objects (!== JSON).
Ok, I think we're closer to alignment now, but after some recent conversations I'm thinking about re-naming/re-scoping this document. Essentially, what we care about here is a JSON representation that can be used to convert between implementations. It's not just about links; we may want to reserve space for converting between other types in the future. To that end, I'd like to re-name to something like "Canonical JSON Representation" and also take a crack at standardizing a binary form, possibly something along the lines of
The issue here is that you're using the normal JSON parser. A dagJSON deserializer should, IMO, turn the CIDs into a special link type that the One could write:
(Cid could even have a toJSON method that converts it to
I see. So you're not saying that the format necessarily needs to use this, just that if I hand This is really looking like a JavaScript UX issue, not an IPLD format issue. We do need a consistent way to represent IPLD objects in memory in JavaScript, but that doesn't have to conform to DagJSON.
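The suggestion of a dedicated link type with a `toJSON` method can be sketched roughly as follows. This is an illustration only: `Link` and `parse` are hypothetical names, not the actual js-cid or `dag-json` API, and the reviver detects links purely by shape without validating the CID string.

```javascript
// Hypothetical link type that knows how to emit the "/" form.
class Link {
  constructor (cid) { this.cid = cid }
  // JSON.stringify calls toJSON automatically when it meets a Link.
  toJSON () { return { '/': this.cid } }
}

// Hypothetical DagJSON-style parser: a JSON.parse reviver that turns
// single-key "/" objects back into Link instances.
const parse = (text) => JSON.parse(text, (key, value) => {
  const isLinkForm =
    value !== null && typeof value === 'object' && !Array.isArray(value) &&
    Object.keys(value).length === 1 && typeof value['/'] === 'string'
  return isLinkForm ? new Link(value['/']) : value
})

const original = { title: 'post', author: new Link('<cid string>') }

const text = JSON.stringify(original) // links become {"/": "..."} in the text
const decoded = parse(text)           // links come back as typed Link objects
const again = JSON.stringify(decoded) // a second round trip yields the same text

console.log(text)
console.log(decoded.author instanceof Link, again === text)
```

User code in this sketch only ever sees `Link` instances; the `{"/": ...}` shape exists in the serialized text and nowhere else.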
At the end of the day, my objection to this is in the motivation. If we had an "IPLD needs this" motivation and we couldn't find a reasonable alternative, I'd be fine with it (albeit really unhappy as it (the
The catch is that we're working with DagJSON, not JSON. Brainstorming solutions:
Personally, I prefer option 1 but there are probably more we haven't considered. FYI:
In general, we still won't be able to transcode between formats until we get a type system. There was an endeavor to try to find a set of primitives to allow for this (see: #56) but this hit a dead-end (see the comment I just added). Basically, we agreed on a set of primitives and then realized that they wouldn't quite cut it, rinse, repeat, until we realized it just wasn't going to work. Unfortunately, without a concrete set of primitives, translating between formats isn't going to happen.
You're right, this is my mistake and I shouldn't have done this. Rather, what this should be is something close to a standard
Exactly. How the codec decides to encode links is completely at the codec's discretion. The codec is also free to take any object it could interpret as a Link and encode it into the Link format it chooses. All we're asking is that, if any codec serializer sees this representation To recap:
However, because we are reserving the interpretation of this form in the serializer it will necessarily make it impossible to use the same form to represent something that is not a Link.
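The cost of that reservation can be shown in a few lines. This sketch assumes a serializer that recognizes the reserved form by shape alone (a single `"/"` key with a string value); `isReservedLinkForm` is a hypothetical helper and the route-table objects are invented sample data.

```javascript
// Hypothetical shape check a serializer might apply to plain data.
const isReservedLinkForm = (v) =>
  v !== null && typeof v === 'object' && !Array.isArray(v) &&
  Object.keys(v).length === 1 && typeof v['/'] === 'string'

// An intended link in the reserved form.
const link = { '/': '<cid string>' }

// Ordinary user data that happens to have the same shape,
// e.g. a route table with only one entry.
const oneRoute = { '/': 'home page' }

// The same kind of data with a second key is no longer ambiguous.
const twoRoutes = { '/': 'home page', '/about': 'about page' }

console.log(isReservedLinkForm(link))      // treated as a link
console.log(isReservedLinkForm(oneRoute))  // also treated as a link
console.log(isReservedLinkForm(twoRoutes)) // treated as a plain map
```

A serializer that also validated the string as a CID would reject `oneRoute` in this particular case, but a single-key map whose value is a valid CID string would still be indistinguishable from a link.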
Again, my apologies for relying on the standard parser in my example.

I think that this use case, transcoding nodes from one codec to another, is a broader need than just in JS. The closest thing we have to a cross-language basic type system is JSON. Every language supports JSON and has a way to represent JSON types as types native in that language and encode those same types back into JSON. In a way, this isn't actually a "canonical JSON representation"; it's a "canonical simple types representation." We're saying, "the language you write a serializer in will support these basic types, please interpret this encoding of links in those simple types as a link." If we said that "IPLD Types: Level 0" is just the types that are in JSON, we would describe this with language along the lines of "this is how you describe a Link in strictly L0 types."

I hope that clears things up. This conversation makes it clear that this particular document needs to be scrapped, as the approach the document has taken is confusing. Instead, I'm going to define IPLD terminology generally in a document, which will include the definitions at the top of this document related to codec serializers and formats. Once that lands I'll take a pass at an RFC for how to support transcoding links (and maybe binary).
Closing. Canonical representations are out. ipld/ipld#50
This is a bit different than what we initially discussed in ipld/ipld#44
After implementing `dag-json` I felt comfortable enough writing up a solid set of recommendations for codec implementations. I think this strikes the right balance of flexibility and interoperability. It avoids restricting a developer's ability to use language and encoding features but still requires enough support for a canonical serialization that we can trans-encode nodes between codecs.