
src: include Environment in snapshot #32761

Closed
wants to merge 8 commits into master from env-snapshot

Conversation

addaleax (Member) commented Apr 10, 2020:

This PR adds snapshotting support for everything that currently runs as part of our bootstrapping, for improved startup performance (currently about +36 % up to the point where the main script starts). It should also enable us to include more data in the snapshot in the future.

/cc @joyeecheung (and thanks to her for a bunch of help looking into this!)


src: introduce BaseObject base FunctionTemplate

This enables us to tell whether a JS bindings object is associated
with a BaseObject or not.

src: collect external references for snapshot

Gather external references from the different native modules that
are loaded during bootstrapping, for inclusion in the snapshot.

lib,src: fix process.features.cached_builtins when snapshotted

This value is not a constant, in the sense that its value when
running node_mksnapshot and its value when running the default node
binary are different.

buffer: refactor zero-fill-field passing

Instead of storing the zero-fill-field as a fixed ArrayBuffer
on the binding, only load it via a function call once bootstrapping
is complete. This makes sure that the value is set correctly
after loading from a snapshot (i.e. that the accessor for the
field is stored in the snapshot, not the field itself).

deps: V8: cherry-pick bb9f0c2b2fe9

Original commit message:

[snapshot] Improve snapshot docs and error printing

- Minor improvements to the documentation for snapshotting.
- Add newlines to printed errors where necessary.

Change-Id: I822e7e850adb67eae73b51c23cf34e40ba3106f0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2144954
Reviewed-by: Toon Verwaest <[email protected]>
Commit-Queue: Toon Verwaest <[email protected]>
Cr-Commit-Position: refs/heads/master@{#67111}

Refs: v8/v8@bb9f0c2

deps: V8: cherry-pick ea0719b8ed08

Original commit message:

[snapshot] Do not defer ArrayBuffers during snapshotting

ArrayBuffer instances are serialized by first re-assigning a index
to the backing store field, then serializing the object, and then
storing the actual backing store address again (and the same for the
ArrayBufferExtension). If serialization of the object itself is deferred,
the real backing store address is written into the snapshot, which cannot be
processed when deserializing, leading to a crash.

This fixes this by not deferring ArrayBuffer serialization and adding a DCHECK
for the crash that previously occurred.

Change-Id: Id9bea8268061bd0770cde7bfeb6695248978f994
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2144123
Commit-Queue: Jakob Gruber <[email protected]>
Reviewed-by: Dan Elphick <[email protected]>
Cr-Commit-Position: refs/heads/master@{#67114}

Refs: v8/v8@ea0719b

deps: V8: cherry-pick 9baf2865671c

Original commit message:

rehash JSMap and JSSet during deserialization

To rehash JSMap and JSSet, we simply replace the backing store
with a new one created with the new hash.

Bug: v8:9187

Refs: joyeecheung/v8@9baf286

src: include Environment in snapshot

This snapshots a lot more data for startup than we did previously,
for increased startup performance.

                                                                                   confidence improvement accuracy (*)   (**)  (***)
misc/startup.js mode='process' script='benchmark/fixtures/require-cachable' dur=1        ***      9.73 %       ±3.78% ±5.03% ±6.54%
misc/startup.js mode='process' script='test/fixtures/semicolon' dur=1                    ***     36.11 %       ±3.91% ±5.23% ±6.86%
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@nodejs-github-bot added the lib / src label (Issues and PRs related to general changes in the lib or src directory) on Apr 10, 2020
@addaleax added the blocked label (PRs that are blocked by other issues or PRs) on Apr 10, 2020
src/env.h (outdated)
CHECK(blob.CanBeRehashed());
snapshot_data.PrintErrorsAndAbortIfAny();
// XXX(addaleax) disabling this check is not good
// CHECK(blob.CanBeRehashed());

addaleax (Member Author):

Cool! I’ll make sure to try that out as soon as I get to it.

addaleax (Member Author):

I don’t know if this is helpful or something that you already know, but for, e.g., builtInObjects from lib/internal/util/inspect.js, this restores the contents of the Set, but .has('Array') returns false after deserialization even though it is listed in the Set.

Member:

Hm, I can't reproduce this with V8 directly; I will try to get a smaller reproduction from this branch.

joyeecheung (Member) commented Apr 13, 2020:

I have not yet gotten to the bottom of this, but it seems to be related to the fact that we don't actually use the native Set/Map in core, but the primordial ones. Somehow, adding a line of NativeModule.dummy = new Set([1]); (that is, including a random non-empty native Set in the snapshot) in the bootstrap script does the trick for Sets.

addaleax (Member Author):

Okay, let me know if this is something that you could use any help with. 👍 In any case, I’ve included your WIP patch here to get an overview of what works and what doesn’t.

joyeecheung (Member) commented Apr 16, 2020:

I did some additional digging, and it seems that:

  1. This does not reproduce if only loaders.js is in the snapshot.
  2. This does not reproduce if loaders.js is in the snapshot while nativeModuleRequire('internal/util/inspect') is also performed early there.
  3. It goes away with adding NativeModule.internalBindingWhitelist = internalBindingWhitelist; in loaders.js (while including node.js in the snapshot as well).
  4. Something done below the point of process.config = JSONParse(internalBinding('native_module').config); in node.js caused this - I am not able to go further at the moment, because moving the rest into loaders.js results in uninitialized performance_states buffers. Not sure what caused this yet (the section below that point creates unaliased TypedArrays in closures, though).

addaleax (Member Author):

Hm, it probably has to be an issue in the V8 snapshotting implementation, though? I can also try to dig a bit deeper later…

joyeecheung (Member) commented Apr 17, 2020:

So far the smallest repro I have is:

  1. Change to snapshot loaders only
  2. Add this to the loaders:

const EventEmitter = nativeModuleRequire('events');
const origProcProto = primordials.ObjectGetPrototypeOf(process);
primordials.ObjectSetPrototypeOf(origProcProto, EventEmitter.prototype);
const timers = nativeModuleRequire('timers');

This (and the NativeModule.internalBindingWhitelist = internalBindingWhitelist; fix) seems somewhat random; my guess is that this could be GC-dependent, and/or some map transition needs to be taken care of. I could try doing in-place rehashing and see if this goes away, but no matter which approach is taken, a smaller repro would be useful for leaving a test against this.

Member:

Um, I think I've found one of the Sets that wasn't properly rehashed during deserialization, so something in the rehashing logic is probably wrong. Still need to debug a bit deeper to create a repro for it, though.

src/env.cc (outdated)
snapshot_data->WriteBool(has_run_bootstrapping_code());
snapshot_data->WriteBool(can_call_into_js());

snapshot_data->WriteUint32(module_id_counter_);
joyeecheung (Member) commented Apr 11, 2020:

What's the rationale behind doing WriteBool/WriteUint32 etc. instead of serializing C++ structures into the code being emitted? Low-level writers make sense for V8 because they emit assembly for the blob, but we generate C++ files with static data for these blobs and I think we should take advantage of that. These kinds of low-level writers and readers can be quite annoying and make it more difficult for new contributors to understand and debug crashes/mismatches, from my experience.

addaleax (Member Author):

What's the rationale behind doing WriteBool/WriteUint32 etc. instead of serializing C++ structures into the code being emitted? Low-level writers make sense for V8 because they emit assembly for the blob, but we generate C++ files with static data for these blobs and I think we should take advantage of that.

I’m not entirely sure if that’s what you mean, but if you suggest emitting code in the snapshot file that creates these objects on process startup, then that’s something I’d really like to avoid. I don’t think it would be significantly easier to do that. But more importantly, with this approach things like userland snapshotting can eventually become possible, and that’s definitely a major goal here for me.

These kinds of low-level writers and readers can be quite annoying and make it more difficult for new contributors to understand and debug crashes/mismatch, from my experience.

I’ve thought about JSON as an alternative, but that ultimately seemed like it would come with overhead that we’d want to avoid. I’ve done my best here to provide a good debugging experience, though, and while I obviously can’t speak for new contributors, it enabled me to rather easily figure out mismatches. And as far as crashes go, I think the only ones that I experienced came from reading empty handles from V8 and then accessing them, but the API here should be good enough to handle that by now, imo.

I think writing out C++ structures into the snapshot would provide a much worse debugging experience, too.

joyeecheung (Member) commented Apr 13, 2020:

if you suggest emitting code in the snapshot file that creates these objects on process startup

Sort of, but not really - I am talking about emitting the data as static, plain objects being passed into initializers, like what we do for generating the map of builtin JS code. Then there will be an intermediate representation for a static graph of data that will be further unfolded into the runtime during startup. Then instead of wondering "whether the env->some_count() is X", you'll just see in the generated code that X is placed in the initializer of a structure, which will later be unfolded into the Environment - first static_data = { X... } in the generated code, and then env->set_some_count(static_data.count) - where the shape of static_data is predefined and can be understood by the debugger automatically.
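As a rough illustration of what I mean (all names and values below are hypothetical, not actual generated code):

    #include <cstdint>

    // Hypothetical stand-in for the real Environment (src/env.h), just to keep
    // this sketch self-contained.
    class Environment {
     public:
      void set_has_run_bootstrapping_code(bool v) { has_run_bootstrapping_code_ = v; }
      void set_can_call_into_js(bool v) { can_call_into_js_ = v; }
      void set_module_id_counter(uint32_t v) { module_id_counter_ = v; }
     private:
      bool has_run_bootstrapping_code_ = false;
      bool can_call_into_js_ = false;
      uint32_t module_id_counter_ = 0;
    };

    // What node_mksnapshot could emit: a plain initializer whose shape is known
    // to the debugger, instead of an opaque byte stream.
    struct EnvSerializeInfo {
      bool has_run_bootstrapping_code;
      bool can_call_into_js;
      uint32_t module_id_counter;
    };
    static constexpr EnvSerializeInfo static_data = { true, true, 64 };

    // At startup the static data is unfolded into the Environment via plain
    // field assignments that can be inspected by name.
    void DeserializeEnvironment(Environment* env) {
      env->set_has_run_bootstrapping_code(static_data.has_run_bootstrapping_code);
      env->set_can_call_into_js(static_data.can_call_into_js);
      env->set_module_id_counter(static_data.module_id_counter);
    }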

But more importantly, with this approach things like userland snapshotting can eventually become possible, and that’s definitely a major goal here for me.

I don't see how those contradict each other. We could have, say, a static instance of a well-defined class whose fields are initialized in a human-readable way in the generated C++ code for bootstrapping the built-in instance, as well as dynamic instances of that class that can be mapped from contiguous blobs at run time. Then, at least when debugging the built-in blob, what's in there is already visible in the C++ file, and can be displayed nicely in a human-readable format with a native debugger (since they'll be associated with the right symbols at program startup).

If you can only see whether there's something wrong as you unfold these bytes into the Environment, it could be quite confusing once there are loops/conditionals for optional/variable-sized data - then you'd need to follow through the unfolding to figure out what's wrong, and remember what bytes come after what - because the place where it crashes can be far away from where it starts to go wrong, and having side-effects while you unfold from that raw, contiguous blob would make that even more difficult to figure out. Since we are adding support for built-in snapshots first, why not just start with static initializers, and add dynamic initializers later?

addaleax (Member Author):

If you can only see whether there's something wrong as you unfold these bytes into the Environment, it could be quite confusing once there are loops/conditionals for optional/variable-sized data - then you'd need to follow through the unfolding to figure out what's wrong, and remember what bytes come after what - because the place where it crashes can be far away from where it starts to go wrong, and having side-effects while you unfold from that raw, contiguous blob would make that even more difficult to figure out.

Just in case you missed it, there are typechecking and scoping tags in the generated snapshot data, which have been very effective for me in avoiding situations like that – while working on this, whenever I had an issue like this, it was always caught very early when serialization and deserialization were out of sync in some way.
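To illustrate the idea (heavily simplified – only the tag names below match the actual format, the rest is made up for this sketch):

    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Sketch of tagged writes/reads; a tag mismatch is reported at the exact
    // index where serializer and deserializer went out of sync.
    enum Tag : uint8_t { kEntryStart, kEntryEnd, kBool, kUint32, kUint64 };

    struct SnapshotBlob {
      std::vector<uint8_t> bytes;
      std::size_t pos = 0;

      void WriteBool(bool v) {
        bytes.push_back(kBool);
        bytes.push_back(v ? 1 : 0);
      }
      bool ReadBool() {
        // Fails right here instead of silently corrupting later reads.
        assert(bytes[pos] == kBool && "Unexpected tag");
        pos++;
        return bytes[pos++] != 0;
      }
    };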

I like the idea of having a human-readable format of the snapshot data. I’ve added a --dump-snapshot flag here; that seems like it could be helpful. When deserialization doesn’t match the expected types, both a description of the mismatch and the index where the error occurred are given, and the index can be looked up pretty easily in the node --dump-snapshot output.

For example, if I “accidentally” forget to write the base object count, this is the output when trying to deserialize:

$ ./node -p 42
Snapshot error: At [5198] Environment:BaseObjects: Unexpected tag kEntryStart (expected kUint64)
Snapshot error: At [5198] Environment:BaseObjects: Entries left on snapshot stack
 1: 0x564833e5d9df node::DumpBacktrace(_IO_FILE*) [./node]
 2: 0x564833f26efa node::Abort() [./node]
 3: 0x5648340698cc node::SnapshotDataBase::PrintErrorsAndAbortIfAny() [./node]
 4: 0x564833f98bd0 node::NodeMainInstance::CreateMainEnvironment(int*) [./node]
[…]
Aborted (core dumped)
$ ./node --dump-snapshot
[…]
5115     String: "internal/validators"
5144     String: "path"
5158     String: "timers"
5174     EndEntry
5175   StartEntry: [BaseObjects]
5197     StartEntry: [NoBindingData]
5221       Object index: 32
5231       EndEntry
5232     EndEntry
5233   Object index: 33
5243   EndEntry

I don’t think you’d get that kind of debugging from C++ compiler errors, tbh.

Since we are adding support for built-in snapshots first, why not just start with static initializers, and add dynamic initializers later?

That seems like a lot of extra complexity to me, especially since dynamic storage is something that I’d implement anyway and I still don’t quite see concrete advantages to writing static objects.

joyeecheung (Member) commented Apr 13, 2020:

there are typechecking and scoping tags in the generated snapshot data

Yes, I've noticed them, but it still requires unfolding the data on the fly to make use of them (or I guess you can try to search for tags in a blob of bytes and map them mentally with prior knowledge about the encoding, but that sounds like a poor man's DWARF and is not quite straightforward).

I don’t think you’d get that kind of debugging from C++ compiler errors, tbh.

Why not? I think the equivalent would just be errors for forgetting to initialize one of the fields in an initializer, and other uncompilable things. If there is a predefined structure that describes a set of plain data that needs to be unfolded into the Environment, in the debugger you can easily just inspect instances of it and do evaluations on them (and that will work with whatever GUI debugger, with nesting, for that matter). That's what you can do with node::native_module::NativeModuleLoader::instance_.source_, for example.

The --dump-snapshot sounds like a good idea, but I can't see why it makes these direct low-level readers/writers a must. I think it'd be even easier to implement it with a predefined structure. Then you can either use the debugger to print a structure on the fly, because that encoding is already built into the binary's debug info, or just decode each field of that structure in one go to stdout for a command line option.

addaleax (Member Author):

@joyeecheung Okay, yeah, the class names are included in the snapshot currently, both for debugging and for knowing which kind of object we’re supposed to serialize.

joyeecheung (Member) commented Apr 17, 2020:

I think the issue is that these class names are not shared in the current format, and hence they will be repeated for every object in the snapshot, even though this is just structural information for debugging, and IMO this information should not be proportional to the number of instances in the snapshot. Setting aside whether the builtin snapshot should be emitted as C++ code, having pre-defined structures would reduce the overhead for each C++ object significantly (it's probably 1 byte for the entire object, vs. number of fields + number of bytes of the class name string + 1 byte for start + 1 byte for end; the latter could be more if we want to nest C++ types and make that nesting debuggable).

addaleax (Member Author):

Generally, I don’t think size is currently an issue for this blob, considering that the corresponding V8 snapshot is ~250 times larger.

If we do think that size becomes an issue eventually, we could probably push the size down quite a bit by optionally encoding 64-bit and 32-bit integers using smaller sizes if they are available, and if necessary use backreferences for things like class names.

it's probably 1 byte for the entire object

That sounds like an approach where the type information would be encoded as an enum, which I’d definitely want to avoid if at all possible.

Member:

That sounds like an approach where the type information would be encoded as an enum, which I’d definitely want to avoid if at all possible.

Why would that be undesirable? (Referencing my last comments in #32761 (comment))

addaleax (Member Author):

@joyeecheung To be clear, I think it’s generally a bad idea to keep a global list of all possible cases of something, whether that’s BaseObjects or something else. It’s okay if nothing else works, but given that we don’t have to do this, I would not do it at all.

@addaleax force-pushed the env-snapshot branch 2 times, most recently from 2055b0a to e34dcf2 on April 13, 2020 17:00
src/env-inl.h (outdated)
}
T* data = new T(this, obj);
if (allocation == nullptr) allocation = operator new(sizeof(T));
Member:

Why do we do this here instead of using DeserializeInternalFieldsCallback?

addaleax (Member Author):

@joyeecheung What would that look like?

joyeecheung (Member) commented Apr 14, 2020:

IIUC, BaseObject::SerializeInternalFields here is just a no-op (other than performing some checks... in addition to the checks that V8 already performs), even though the V8 API is designed to save the association between the JS value (externals here) and the embedder fields. I think we are bypassing the entire mechanism here by assuming that the index returned by V8 starts from zero incrementally and thus matches how we unfold the bytes from the SnapshotCreateData here? Although if we play along with the API, the payload should be retrieved and reinterpreted from the DeserializeInternalFieldsCallback as serialized in the SerializeInternalFieldsCallback passed into AddContext, and what you need to save additionally to the snapshot data are just indices, instead of payloads being read/written using the low-level readers/writers here (V8 will just find the payload in the blob for you when given indices) - so ideally we should only use readers/writers for ints when serializing/deserializing these values (and reserve the other types for special data serialized along with the Environment itself that are not associated with any JS values)
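For reference, a minimal sketch of how these V8 hooks are meant to be wired up (the payload struct here is made up; only the callback signatures and the AddContext/FromSnapshot calls are V8's):

    #include <cstdint>
    #include <v8.h>

    // Made-up payload describing the native state behind one internal field.
    struct FieldPayload { uint32_t some_state; };

    static v8::StartupData SerializeInternalFields(v8::Local<v8::Object> holder,
                                                   int index, void* data) {
      // Return the bytes V8 should embed in the blob for this internal field.
      static FieldPayload payload{42};
      return {reinterpret_cast<const char*>(&payload),
              static_cast<int>(sizeof(payload))};
    }

    static void DeserializeInternalFields(v8::Local<v8::Object> holder, int index,
                                          v8::StartupData payload, void* data) {
      // V8 hands the exact bytes back; reinterpret them and restore native state.
      const FieldPayload* p = reinterpret_cast<const FieldPayload*>(payload.data);
      (void)p;  // ... re-create the native object for `holder` from *p ...
    }

    // Snapshot creation:
    //   creator.AddContext(context,
    //       v8::SerializeInternalFieldsCallback(SerializeInternalFields));
    // Deserialization:
    //   v8::Context::FromSnapshot(isolate, /* index */ 0,
    //       v8::DeserializeInternalFieldsCallback(DeserializeInternalFields));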

addaleax (Member Author):

IIUC, BaseObject::SerializeInternalFields here is just a no-op (other than performing some checks... in addition to the checks that V8 already performs),

Yes, that’s correct.

even though the V8 API is designed to save the association between the JS value (externals here) and the embedder fields.

It’s not really clear to me what this means. I thought the V8 internal fields serialization/deserialization hooks were designed to save the association between the JS object and the native object(s) that the internal fields refer to?

I think we are bypassing the entire mechanism here by assuming that the index returned by V8 starts from zero and thus matching our internal indices of the referenced payload here?

No, nothing in this code makes such assumptions.

Although if we play along with the API, the payload should be retrieved and reinterpreted from the DeserializeInternalFieldsCallback as serialized in the SerializeInternalFieldsCallback passed into AddContext,

Right, we could add a callback for that if we ever run into a case in which the internal fields cannot be set properly otherwise.

Currently, there is no need to do so, because all objects with internal fields that are being serialized (and all objects that I could imagine will be added to the snapshot in the future) are BaseObjects. Since BaseObjects are V8 persistent handles tied together with C++ objects, and we have to serialize the persistent handles anyway, we already have one serialization step that performs the necessary storage of data during the “serialize all global handles” phase. When deserializing, we know how to re-create the BaseObject based on the deserialized data, and the internal fields will be set as they usually would during regular program execution.

This is why the BaseObject internal fields callback serializer is only a direct no-op for the first internal field – the first one is one where we know that the BaseObject constructor resets it to a proper value, and where we know it is semantically valid to do so. For other potential fields, that’s not a given.
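Roughly, the deserialization side looks like a registry keyed by the class name stored in the snapshot (the declarations below are just illustrative, not the actual ones from this PR):

    #include <functional>
    #include <map>
    #include <string>

    class Environment;        // stand-ins for the real classes
    class SnapshotReadData;

    using BaseObjectDeserializer =
        std::function<void(Environment* env, SnapshotReadData* data)>;

    std::map<std::string, BaseObjectDeserializer>& deserializers() {
      static std::map<std::string, BaseObjectDeserializer> map;
      return map;
    }

    void DeserializeBaseObject(Environment* env,
                               const std::string& class_name,
                               SnapshotReadData* data) {
      // The factory registered under this class name re-creates the BaseObject;
      // its constructor sets the internal fields of the associated JS object
      // just as it would during regular program execution.
      deserializers().at(class_name)(env, data);
    }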

and what you need to save additionally to the snapshot data are just indices, instead of payloads being read/written using the low-level readers/writers here (V8 will just find the payload in the blob for you when given indices) - so ideally we should only use readers/writers for ints when serializing/deserializing these values (and reserve the other types for special data serialized along with the Environment itself that are not associated with any JS values)

Tbh, I don’t really understand what you are referring to here. It would be good if you could expand on that.

addaleax (Member Author):

and what you need to save additionally to the snapshot data are just indices, instead of payloads being read/written using the low-level readers/writers here (V8 will just find the payload in the blob for you when given indices) - so ideally we should only use readers/writers for ints when serializing/deserializing these values (and reserve the other types for special data serialized along with the Environment itself that are not associated with any JS values)

Tbh, I don’t really understand what you are referring to here. It would be good if you could expand on that.

It just occurred to me that you’re probably referring to the kind of index returned by SnapshotCreator::AddData()/passed to GetDataFromSnapshotOnce(). 🙂 For those, the classes that are being serialized, like Environment, don’t currently have to deal with passing the index around, and instead leave that up to the snapshot writer/reader API in this PR (look around for V8SnapshotIndex, I’ve added a separate typedef for that in the last fixups commit to be more explicit about what kind of indices those are).

joyeecheung (Member) commented Apr 16, 2020:

Right, but it makes the V8 snapshot larger by the same amount.

BTW, I don't think it does - with the current approach, we need one tag per field, plus start and end entries, as well as class name strings that aren't shared across these blobs. With predefined structures, we just need one tag per StartupData (which can include many fields, and there is no need to further tag these fields), and the structural information won't be proportional to the number of objects in the snapshots (they'll be in e.g. DWARF, shared among objects of the same type).

addaleax (Member Author):

I don’t see how this would affect the amount of serialization code that we use, tbh.

Because we then process things by value, instead of by procedures? I find it easier to understand "what should I assign to in blob.index, which will come back the same way when I reinterpret the blob at deserialization", instead of wondering "what should I write into a blob as an index, which I will need to remember to read again as an index later in the same order I write them".

I’m really not sure what you mean here. You have to write the v8::Global’s index either way, and you don’t need to maintain a specific order of deserialization manually, either.

Given that this would make debugging harder

How would this make debugging harder, though? These blobs will have pre-defined structures that can be inspected by name (either in the debugger, or in the callback). If it's talking about the ability to inspect all the native states in the entire snapshot, I am not sure how much saving them all in our own blob would help, since the corresponding JS states are always buried in V8's blob and most things we see are indices.

It would remove the BaseObject serializations from the serialization printout, for example – both with what this PR suggests and your plain C++ object suggestion. Yes, the corresponding JS values are stored on V8’s side and not easily accessible, but even knowing what types of BaseObjects we have stored, how many, and what they look like on the native side seems very useful to me.

I’m inclined to keep the code simple instead

Why is this simpler than letting V8 read/write the values and invoke the callbacks for us?

I could ask the converse question – why does it seem simpler for you to create a serialization structure, then write that/read that from the V8 snapshot?

For example, we are maintaining our own maps of BaseObjectDeserializers as a result?

What else would you do? Maintain a list of all possible BaseObject types? That’s a code smell, imo. We should not be maintaining a global list of those.

joyeecheung (Member) commented Apr 17, 2020:

I’m really not sure what you mean here. You have to write the v8::Global’s index either way, and you don’t need to maintain a specific order of deserialization manually, either.

I am talking about the native state of a C++ object wrapped inside a JS object (i.e. the internal fields). For example, if the C++ object (likely a BaseObject) contains a bool, a uint32_t, and a string, with the current approach we need to invoke StartWriteEntry, WriteBool, WriteUint32, WriteString and EndWriteEntry, and during deserialization we need to do the Reads in the exact same order to restore the state of this C++ object. Using V8's callback, we would predefine a structure, assign fields to an instance of it accordingly (the order would not matter) and return it as a StartupData during serialization. At deserialization we just need to reinterpret the StartupData passed into the callback, and restore the state into the C++ object with the information in that StartupData, mostly by assigning fields - again the order does not matter: no error should occur if we assign the string first (probably by invoking the string constructor with a length and a pointer into that blob), and then the uint32 later.
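For illustration, a sketch of that contrast – a predefined, named layout instead of an ordered sequence of Write*/Read* calls (the struct here is made up):

    #include <cstdint>
    #include <string>

    struct WrapSerializeInfo {
      bool flag;
      uint32_t count;
      uint32_t name_length;  // string bytes are stored right after this struct
    };

    // Deserialization: reinterpret the payload and pick fields by name, so the
    // order of the assignments below does not matter.
    void Restore(const char* payload_data) {
      const WrapSerializeInfo* info =
          reinterpret_cast<const WrapSerializeInfo*>(payload_data);
      std::string name(payload_data + sizeof(WrapSerializeInfo),
                       info->name_length);
      bool flag = info->flag;        // could equally be read after `name`
      uint32_t count = info->count;
      // ... assign flag/count/name into the C++ object being restored ...
      (void)flag; (void)count; (void)name;
    }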

I could ask the converse question – why does it seem simpler for you to create a serialization structure, then write that/read that from the V8 snapshot?

As explained above, the order would not matter - the fields are identified by name inside that blob, and they will be put into place during deserialization; we just need to cast them, instead of reading them one by one from the blob in a specific order.

Maintain a list of all possible BaseObject types? That’s a code smell, imo. We should not be maintaining a global list of those.

Why would that be a code smell? We already do that for AsyncWraps, it doesn't seem so bad to extend that to other subclasses of BaseObjects. We could return this in MemoryInfoName(), which gets converted to strings by the MemoryTracker - this helps us make sure that all names like this have a proper declaration (ideally, if we put documentation about all existing BaseObject types above that declaration of types, I think it would be tremendously helpful for people to understand the structure of the code base), and unifies the class names in both the heap snapshot and the startup snapshot. Also, theoretically, if we put that type directly into the BaseObject, tools like llnode could also reuse that field for interpreting internal fields inside objects associated with BaseObjects - llnode was made possible because V8 has such a global list of heap object types.

joyeecheung (Member) commented Apr 17, 2020:

Also to be clear, I don't think we have to use the V8 callbacks as-is, but I think we should go over why they can't be used for our use cases, and ideally, if they don't work for us, we can try improving the API to make them more feasible, instead of bypassing the entire machinery and using our home-grown ones in an uncoordinated way.

addaleax (Member Author):

Maintain a list of all possible BaseObject types? That’s a code smell, imo. We should not be maintaining a global list of those.

Why would that be a code smell? We already do that for AsyncWraps, it doesn't seem so bad to extend that to other subclasses of BaseObjects.

Yes, also a code smell for AsyncWraps, imo – it’s just a bit hard to get rid of that given that we store the names for them in IsolateData.

We could return this in MemoryInfoName()

I don’t think it’s a good idea to tie identifiers and debugging names together, tbh. There’s always a possibility that we want to provide extra information in MemoryInfoName() that we wouldn’t need for identifying the type.

addaleax (Member Author):

Fwiw, regarding the v8::External performance issue … ugh. The internal implementation of v8::External::Value doesn’t quite look like something that would be turned into an inline function easily, so I’m not sure whether there’s a lot that we can do about it.

The example binding function in #26382 was obviously not a real-world one, and I think it’s quite possible that the relative impact for our actual binding functions is not all that large either way. My primary motivation behind that PR wasn’t performance anyway, but rather enabling something like #32538 as a second step.

@joyeecheung Any thoughts?

@addaleax force-pushed the env-snapshot branch 2 times, most recently from a8f306f to 2ce4aeb on April 16, 2020 04:41
joyeecheung (Member) commented Apr 16, 2020:

The internal implementation of v8::External::Value doesn’t quite look like something that would be turned into an inline function easily, so I’m not sure whether there’s a lot that we can do about it.

I have looked into the issue before, and doing what #26382 (review) suggests did not show any performance improvement locally - I think the performance difference does not come from boxing as described in #26382 (review), but instead comes from the fact that the way v8::External stores that pointer involves two atomic loads, whereas v8::Object makes use of the fact that access from the API is always synced with GC, so it just reads normally from an offset - v8::External can be turned into that as well, though it involves specializing the layout for v8::External, which is now complicated by the work on pointer compression in V8. I'd be happy to continue working on my fix for it to reduce the perf impact of this PR, though I don't see the perf issue of v8::External as a blocker for snapshot support, for the reason you mentioned above - the rehashing issue is more important IMO. (On the other hand, if #32761 (comment) is implemented, the performance of v8::External would not have mattered here.)

virtual void Serialize(SnapshotCreateData* snapshot_data) const;
};

class ExternalReferences {
joyeecheung (Member) commented Apr 17, 2020:

I have been thinking: for static references, can we do this via the same kind of machinery we use to register module bindings? Like adding a variant similar to NODE_MODULE_CONTEXT_AWARE_INTERNAL, which can later be exposed to userland somehow. So in addition to an initialize function, modules can register a register function for these static references, which will be invoked eagerly at program startup in the snapshot builder. Ideally some checks can also be added in the snapshot builder to make sure that all the foreign references created later in the initializer are registered in the register function (like right after the initializer is called, when we have the context of the internalBinding/linkedBinding/dlopen call, instead of waiting until V8 tells us that something is not registered while everything is out of scope).
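A hypothetical sketch of what such a registration hook could look like (the registry class and macro name here are made up, not existing APIs):

    #include <cstdint>
    #include <vector>
    #include <v8.h>

    // Made-up registry; in practice its contents would be handed to
    // v8::SnapshotCreator as the external-references list.
    class ExternalReferenceRegistry {
     public:
      void Register(v8::FunctionCallback fn) {
        refs_.push_back(reinterpret_cast<intptr_t>(fn));
      }
      const std::vector<intptr_t>& refs() const { return refs_; }
     private:
      std::vector<intptr_t> refs_;
    };

    // A binding would pair its context-aware initializer with a registration
    // function that the snapshot builder invokes eagerly at startup:
    static void SetBufferPrototype(const v8::FunctionCallbackInfo<v8::Value>&) {}

    void RegisterBufferExternalReferences(ExternalReferenceRegistry* registry) {
      registry->Register(SetBufferPrototype);
    }
    // e.g. NODE_MODULE_EXTERNAL_REFERENCE(buffer, RegisterBufferExternalReferences)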

addaleax (Member Author):

I have been thinking, for static references, can we do this via the same kind of machinery we use to register module bindings?

@joyeecheung Yes – what’s the advantage of that?

(Aside, dlopen() is not going to work with the current V8 snapshot API – we’d need to know about external references before they are even loaded in memory.)

Ideally some checks can also be added in the snapshot builder to make sure that all the foreign references created later in the initializer are registered in the register function

Yeah, I guess that would be pretty straightforward. This PR only adds external references for the bindings that are loaded during bootstrap, so I guess we’d first have to extend that for all external references before we could do that?

addaleax (Member Author):

@joyeecheung To be honest, this review process is, once again, quite frustrating. A lot here boils down to you not liking the approaches in this PR, which is fine – expecting people to like what I suggest is clearly not reasonable – but it’s just not very actionable for me. Your comments tend to set different priorities ad-hoc – sometimes you oppose the code in this PR because you think it’s not debugging-friendly enough, while elsewhere you suggest changes that remove debugging abilities. I’ve done my best to address the concrete issues that were brought up so far.

Ultimately, this is taking up a lot of energy, and I’ll take a break from this for a week or so, and probably most other Node.js core stuff too.

joyeecheung (Member) commented Apr 17, 2020:

@addaleax Sorry about the frustration, I did not mean to force any design decisions into this PR.

Your comments tend to set different priorities ad-hoc – sometimes you oppose the code in this PR because you think it’s not debugging-friendly enough, while elsewhere you suggest changes that remove debugging abilities.

I think I was primarily basing my expectations on the debugging experience I had with my prototype, as well as on dealing with previous crashes during deserialization when I changed object layouts for unrelated projects in V8 - perhaps this is why I came across as intrusive and ad-hoc. I would not mind throwing my previous work away if a better implementation comes around, but I had given some thought to the points I raised questions about here when I worked on my prototype, so I did not want to see the possibility of alternatives to the current design go ignored.

In hindsight, the clash was pretty unfortunate and could have been avoided if I had communicated more clearly about my progress in e.g. the TSC meetings (my last update was last December). I did fail to post my prototype as promised in #26382 (comment), since a lot of things happened at the beginning of this year, and I wasn't rebasing my prototype as often (my last rebase was in mid-March after I landed #31884 upstream), since I shifted focus to address the blocking upstream issues described in #17058 (comment), which led to conflicts with more changes recently landed on master (and probably led to my preference to go back to the old status quo). I am grateful for the work you did here that addressed some issues that had bothered me (for instance https://chromium-review.googlesource.com/c/v8/v8/+/2144954 addressed #17058 (comment)) as well as uncovered issues that I was not aware of (for instance #32761 (comment) for https://chromium-review.googlesource.com/c/v8/v8/+/2143983).

This review process has been taxing for me as well, as I was not expecting to review a competing PR (and one that does things quite differently in a few aspects that I had given a lot of thought about). Moving forward, maybe we could lay out our differences about the design decisions in a doc, listing the pros and cons, possibly inviting others to give more high-level opinions before making a pick? This should also help posterity in understanding the design of the snapshot process. I will work on this while waiting for you to get back - it's OK and totally reasonable if you think this would not help settle the differences we have and that engaging would lead to more harm than good. But I think this would at least help me look at these options from a higher-level perspective instead of being blinded by my own prejudice, and come back with a clearer head. Meanwhile I'll also continue working on https://chromium-review.googlesource.com/c/v8/v8/+/2143983 to unblock the snapshotting of JSMap/JSSet.

joyeecheung (Member):

As promised, I wrote down the differences in design in https://docs.google.com/document/d/15bu038I36oILq5t4Qju1sS2nKudVB6NSGWz00oD48Q8/edit# (as well as exploring some other options that came to mind there; I invited @addaleax to edit, but anyone who wants to contribute, feel free to ping me to get permissions). I also opened #32984, which contains the WIP I had been working on before this came along. That WIP is still immature - apart from missing a rehashing fix that is also blocking this, it also works around some checks in the Environment cleanup routine to make it play well with V8, but at least it should serve as a proof of concept for the suggestions I made here.

A few more observations I've made:

As it turns out, my suggestion in #32761 (comment) is not quite practical - we still need something to keep the binding data alive, and the easiest way is to just have Global handles to them, which is what BaseObject is about. Although instead of dummy objects (maybe just for NoBindingData), we could associate other bindings with the JS object returned by internalBinding() - theoretically they should have a closely-related life cycle, and most of the time the native states inside those C++ bindings are also connected to the JS ones (take those AliasedBuffers in HTTP2 and FS for example). I implemented my idea from #32761 (comment) (moving the bindings to a list referenced by context slots) in a commit in #32984, and I think it's less hacky to get rid of the template context-dependency issue this way; we might consider just landing this first in master to get the blocker out of the way (and it also unblocks vm context builtins to some extent).

addaleax (Member Author):

As promised I wrote down the differences in design in https://docs.google.com/document/d/15bu038I36oILq5t4Qju1sS2nKudVB6NSGWz00oD48Q8/edit# ( as well as exploring some other options that came to mind there. I invited @addaleax to edit but anyone who wants to contribute feel free to ping me to get permissions).

I’ve left a number of comments there, but I think mostly they are just re-hashing the conversation above. Based on that document, I don’t see any disadvantages to the choices made in this PR compared to the alternatives.

A few more observations I've made:

  • I pondered some more issues that might come with C++ addons, and I think it will just be quite difficult to include them in the snapshots… unless we don't attempt to serialize them at all

I came to the same conclusion 👍

Right, I thought about that too, but the only real alternatives seem to be a) tooling that generates the list of external references from the source code and generates a .cc file with the information or b) dynamically iterating over symbols in the current process (likely too slow to be practical).

I implemented my idea in #32761 (comment) (moving the bindings to a list referenced by context slots) in a commit in #32984 and I think it's less hacky to get rid of the template context-dependency issue this way, we might consider just land this first in master to get the blocker out of the way (and it also unblocks vm context builtins to some extent).

Right, this is something I still consider to be independent from both PRs, but that I’m in favour of landing in master separately.

addaleax (Member Author):

@joyeecheung Also, maybe to give an overview over my priorities here: The points that I feel most strongly about are the ones labelled “Dynamic External Reference Registration” and “Serialization format for Node.js’s own states” in the Google doc.

I still think that this PR makes the better tradeoff in the other cases as well (setting the context-dependency of bindings aside), but those aren’t hills that I’m willing to die on.

addaleax (Member Author) commented Apr 29, 2020:

@joyeecheung Fyi, I’ve updated this with your patch for binding data with a few modifications, but I still believe that that is largely independent from this PR and will PR that separately (edit: #33139). I’ll also look into using the internal fields serialization hooks now; you made a good point in the Google doc that we might run into the danger of accidentally attempting to serialize unreachable BaseObjects.

@addaleax force-pushed the env-snapshot branch 2 times, most recently from 9352ff3 to bfdb5ad on May 6, 2020 04:45
addaleax and others added 8 commits May 8, 2020 05:23
@addaleax removed the blocked label (PRs that are blocked by other issues or PRs) on May 8, 2020
@addaleax marked this pull request as ready for review on May 8, 2020 03:24
joyeecheung (Member) left a review comment:

As much as I hate to do this, I'll have to block this PR on the grounds that #32984 is also blocked while the design disagreement between the two remains unsettled. Maybe a more viable way to move forward is to split the snapshot integration into smaller steps that we can agree on (or that the TSC can agree on if we can't reach consensus) and get them landed one step at a time. I think we'll need input from at least one other party to break the tie.

addaleax (Member Author):

See #32984 (review). Sigh.

@addaleax closed this on May 15, 2020
@addaleax deleted the env-snapshot branch on May 15, 2020 00:48
@addaleax restored the env-snapshot branch on May 15, 2020 00:48
jasnell (Member) commented May 15, 2020:

@addaleax ... Before this is abandoned, let's put the issue on next week's @nodejs/tsc agenda to discuss. I was also hoping that more folks would get involved in the discussion, and I'm disappointed but not surprised that others didn't jump in; but given that we're both NearFormers, I didn't want to risk a one-sided pile-on while technical issues were being worked through. Fwiw, I'm feeling the same frustration about the lack of engagement on the QUIC PRs... which have sat for months with only two TSC members outside of you and me showing much interest.

I understand if you're too personally frustrated with the process to move this forward; I can step in and take it on, because I think this work is definitely important. It would just be good to raise the visibility directly with the TSC.

addaleax (Member Author):

@jasnell Yeah, feel free to re-open this and take over if you like 👍

joyeecheung (Member):

@addaleax @jasnell Thanks for being so patient and understanding! I agree we need to think of a better process to prevent the same frustration from happening again - one thing to note is that the TSC isn't the best body to assign the task of sorting out technical reviews to, since the codebase is diverse enough that different TSC members can be interested in, and have expertise in, different things. I will try to be in the next meeting; if I can't, I'll type something into the meeting issue.
