-
Notifications
You must be signed in to change notification settings - Fork 6
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5 +/- ##
==========================================
- Coverage 23.22% 16.20% -7.01%
==========================================
Files 7 11 +4
Lines 797 1945 +1148
==========================================
+ Hits 185 315 +130
- Misses 596 1591 +995
- Partials 16 39 +23
Continue to review full report at Codecov.
|
Nice. I'm currently trying to finalise the new JS dag-pb implementation adhering strictly to the schema and I'm keen to see if we can make the same happen here and I don't think this is quite there--I don't quite understand how ipld-prime deals with "absent" elements though, so maybe it is even with the "AssignX" calls. Specifically, when decoding a block using go-merkledag the following should happen:
Further, I'm adding all of these details into the spec notes under the schema @ ipld/specs#297 and pointing to this repo as the new-generation DAG-PB implementation (even though it doesn't have a full builder impl yet). I'd really like to make sure that the |
if link.d.Name.Maybe == schema.Maybe_Value { | ||
tmp := link.d.Name.Value.x | ||
if link.FieldName().Exists() { | ||
tmp := link.FieldName().Must().String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd write a review about how all this causes unfortunate allocations, but... it's unavoidable in interacting with protobufs, as I understand it. Ah well.
(Fun fact if we ever come back to optimize this though, without just tearing deep through and doing the parse protobuf ourselves completely: I think we could bump these tmp
things into a struct and make sure the whole thing escapes to heap at once, and it'll shave a constant factor off.)
if link.d.Hash.Maybe == schema.Maybe_Value { | ||
cid := link.d.Hash.Value.x.(cidlink.Link).Cid | ||
if link.FieldHash().Exists() { | ||
cid := link.FieldHash().Must().Link().(cidlink.Link).Cid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this field really is This is fine. optional
according to the schema, a Must
here is fragile, isn't it?Exists
is checked right above the Must
. (Silly pinhole-optimizer brain of mine...) Leaving the comment here to remind myself I already checked this.
To @rvagg 's comments about ensuring we understand and standardize some of this stuff so that it's behaviorally consistent with other IPLD libraries that handle dag-pb --
I'll understand if @hannahhoward doesn't want to follow us on these details in this PR today though. Getting this repo this much closer to master on these interfaces is a big step forward and nice to do as quickly as possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey this is kind of awesome.
Is it true that the generated files now have no manual patches inside them? If so that's super great -- means this'll be way easier to maintain going forward.
It also looks like the Encode*
and Decode*
methods attached to the types here are... not actually required to be attached to the types anymore: we could apply them on almost any ol' ipld.Node
(except that a few shortcut methods are used that are specialized to the codegen types). So it would be pretty trivial to hoist this into a general codec function that people could use with other Node implementations if they really wanted to! That's cool. Not important to do today, of course. But super super cool that it's now just about trivial to take that step.
// from RawNode__Builders | ||
func (nb *_RawNode__Builder) DecodeDagRaw(r io.Reader) error { | ||
return fluent.Recover(func() { | ||
fnb := fluent.WrapAssembler(nb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will definitely cause at least one extra alloc, and idk if it's paying for itself in simplicity added in this case. But I'm not complaining enough to demand a change, either. Probably not a big issue unless we profile it and find out that it is.
return fluent.Recover(func() { | ||
fb := fluent.WrapAssembler(nb) | ||
fb.CreateMap(0, func(fmb fluent.MapAssembler) { | ||
fmb.AssembleEntry("Links").CreateList(len(pbn.Links), func(flb fluent.ListAssembler) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think @rvagg 's comment about the links list here suggests there should be a branch on if len(pbn.Link) == 0
where we then skip assembling a data-model Links
entry at all.
I can't decide if I think that's weird or not, though, now that I think about it. I'm hedging on if I think we should update our IPLD Schema describing this to just say that the list is required... and then if it's zero-length, sure, we can encode it to the proto wire format as missing.
That's a discussion that probably @rvagg and I should hash out on our specs documents about it though. If this is the behavior this code has had already, I think it's fine to keep it today. (I don't want to force you through testing a change like this in either direction today...)
hash, err := cid.Cast(link.GetHash()) | ||
return fluent.Recover(func() { | ||
fb := fluent.WrapAssembler(nb) | ||
fb.CreateMap(0, func(fmb fluent.MapAssembler) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd mildly rather that these be -1 when they're saying "shrug idk deal with it". The contract about this isn't strict, but there are some logic branches that look at 0 as a "ah, expect 0" and look at -1 as "ah, take a guess". It's also helpful for reading.
... but none of those codepaths happen apply here in a codegen'd struct, so, this is a nit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to improve the godoc on this, I've realized. Mea culpa.
}, | ||
}, | ||
if err != nil { | ||
panic(fluent.Error{Err: fmt.Errorf("unmarshal failed. %v", err)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rvagg : I went deep on exactly what happens here if the protobuf has a nil value for hash, because of its relevance to our other discussions about documenting this behavior:
tl;dr: a lack of hash here is not actually tolerated by this code: we get an error returned here.
(In detail: cid.Cast
passes the nil down until some varint code tries to read from it, and at that point we get cid.ErrVarintBuffSmall
and that bubbles back up from here.)
So I think this is a pretty solid argument that we don't have to consider Hash
an optional
in our IPLD Schema describing this.
} | ||
flb.AssembleValue().CreateMap(0, func(fmb fluent.MapAssembler) { | ||
fmb.AssembleEntry("Hash").AssignLink(cidlink.Link{Cid: hash}) | ||
fmb.AssembleEntry("Name").AssignString(link.GetName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm moderately confused what the protobuf library is doing here -- we had to hand the protobuf lib a pointer to the name string elsewhere, because it's option in protobuf's mind, right? But here it's returning a string (no pointer). Does this panic if the string is missing? Or silently coerce it to an empty string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea it silently coerces it tp empty string. Which is weird behavior except -- that's exactly what go-merkledag uses in in IPFS. My basic approach is follow as close as possible so as to not break things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, cool. I appreciate that, thanks for talking through it with us so we have it also as input to the other standardization quests @rvagg mentioned :)
flb.AssembleValue().CreateMap(0, func(fmb fluent.MapAssembler) { | ||
fmb.AssembleEntry("Hash").AssignLink(cidlink.Link{Cid: hash}) | ||
fmb.AssembleEntry("Name").AssignString(link.GetName()) | ||
fmb.AssembleEntry("Tsize").AssignInt(int(link.GetTsize())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above about name strings. Does this panic if the field isn't in the protobuf? Or silently coerce it to a zero?
btw here's a deep-dive into the forms DAG-PB can take encoded and how go-merkledag represents it in the structs: ipfs/go-merkledag#58 So, there's an unfortunate behaviour with the getters in go-merkledag, they handle
I see no good reason to avoid using the getters here unless there's a chance your |
re: detaching from types -- Encode does still use some of the Fields$$$ methods, so not quite -- but still very close. Probably it'd just be a bit more verbose. |
Goals
Get ipld-prime proto to a version of ipld-prime near master (specifically including byte buffer fix)
Implementation
The changes to code gen seemed to have gotten to the point where a more significant change to the generation script was needed, so I went ahead and implemented it, matching the styles used by the tests in node/gendemo of go-ipld-prime.
Then, since I was relying on private members of the node structures that had changed (#oops) I switched over to only accessing data through public methods.
The assembling of nodes for the PBNode decoder was EXTREMELY verbose without fluent, so I went ahead and used it along with fluent.Recover, hopefully in a way that wasn't too out of the design goals.