build: speed up startup with V8 code cache #21405

joyeecheung · 2018-06-19T14:32:18Z

build: speed up startup with V8 code cache

This patch speeds up startup and reduces the startup memory
footprint by using the V8 code cache when compiling builtin modules.

The current approach is demonstrated in the with-code-cache
Makefile target (no corresponding Windows target at the moment).

1. Build the binary normally (src/node_code_cache_stub.cc is used).
   At this point internalBinding('code_cache') is an empty object.
2. Run tools/generate_code_cache.js with that binary. The script reads
   the source code of builtin modules from
   require('internal/bootstrap/cache').builtinSource, generates the
   code caches, and then writes a C++ file containing static char arrays
   of the code cache, using a format similar to node_javascript.cc.
3. Run configure with the --code-cache-path option so that
   the newly generated C++ file is used when compiling the
   new binary. The generated C++ file puts the cache into
   the internalBinding('code_cache') object with the module
   ids as keys.
4. The new binary tries to read the code cache from
   internalBinding('code_cache') and use it to compile
   builtin modules. If the cache is used, the module id is put
   into require('internal/bootstrap/cache').compiledWithCache
   for bookkeeping; otherwise the id is pushed into
   require('internal/bootstrap/cache').compiledWithoutCache.
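
The bookkeeping in the last step can be approximated from user land with the public vm API. This is only a sketch under assumptions: `sources`, `codeCache`, `compiledWithCache` and `compiledWithoutCache` are plain local stand-ins for the internal objects, and `compileBuiltin` is a hypothetical helper, not the loader's actual code.

```javascript
const vm = require('vm');

// Hypothetical stand-ins for builtin sources, keyed by module id.
const sources = {
  a: '(function (exports) { exports.value = 1; });',
  b: '(function (exports) { exports.value = 2; });',
};

// "First pass": generate a cache for module "a" only, mimicking a binary
// in which one module has no cache entry.
const codeCache = {
  a: new vm.Script(sources.a, { filename: 'a.js' }).createCachedData(),
};

const compiledWithCache = [];
const compiledWithoutCache = [];

// "Second pass": compile a module, consume the cache when one exists,
// and record which path was taken.
function compileBuiltin(id) {
  const options = { filename: `${id}.js` };
  if (codeCache[id] !== undefined) {
    options.cachedData = codeCache[id];
  }
  const script = new vm.Script(sources[id], options);
  if (codeCache[id] === undefined || script.cachedDataRejected) {
    compiledWithoutCache.push(id);
  } else {
    compiledWithCache.push(id);
  }
  return script;
}

for (const id of Object.keys(sources)) {
  compileBuiltin(id);
}
console.log({ compiledWithCache, compiledWithoutCache });
```

In this sketch the cache is produced and consumed in the same process, so it should be accepted; a real two-pass build produces it with one binary and consumes it in another.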

This patch also added tests that verify the code cache is
generated and used when compiling builtin modules.

The binary with code cache:

- Is ~1MB bigger than the binary without code cache
- Consumes ~1MB less memory during startup
- Starts up about 60% faster (in other words, ~2x faster). On a low-end Linux VPS with one core, perf stat ./node -e ";" took ~80ms before and takes ~40ms now.

Performance stats

OS: Darwin Kernel Version 17.5.0
Compiler: Apple LLVM version 9.1.0 (clang-902.0.39.1)
CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Benchmark result:

                       confidence improvement accuracy (*)   (**)  (***)
 misc/startup.js dur=1        ***     66.19 %       ±1.84% ±2.45% ±3.20%

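|                                 | Without code cache | With code cache |
|---------------------------------|--------------------|-----------------|
| Time to finish JS tests         | 3:00               | 2:25            |
| Binary size (bytes)             | 37863108           | 38914164        |
| RSS after startup (bytes)       | 20365312           | 19210240        |
| heapTotal after startup (bytes) | 6062080            | 5537792         |
| heapUsed after startup (bytes)  | 3781136            | 3227328         |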
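
For reference, the "~1MB" figures can be checked against the table with a few lines of arithmetic. This is a small sketch that only re-derives the differences from the numbers quoted above; the variable names are made up for illustration.

```javascript
// Figures copied from the table above (sizes in bytes).
const withoutCache = {
  binarySize: 37863108, rss: 20365312, heapTotal: 6062080, heapUsed: 3781136,
};
const withCache = {
  binarySize: 38914164, rss: 19210240, heapTotal: 5537792, heapUsed: 3227328,
};

const MiB = 1024 * 1024;
const delta = {};
for (const key of Object.keys(withoutCache)) {
  // Positive means the binary with code cache uses more.
  delta[key] = withCache[key] - withoutCache[key];
  console.log(`${key}: ${delta[key]} bytes (${(delta[key] / MiB).toFixed(2)} MiB)`);
}

// perf stat figures from the description: ~80ms before, ~40ms after.
const speedup = 80 / 40;            // ratio of old to new startup time
const timeSaved = (80 - 40) / 80;   // fraction of startup time removed
console.log({ speedup, timeSaved });
```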

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
documentation is changed or added
commit message follows commit guidelines

nodejs-github-bot · 2018-06-19T14:32:19Z

@joyeecheung build started: https://ci.nodejs.org/blue/organizations/jenkins/node-test-pull-request-lite-pipeline/detail/node-test-pull-request-lite-pipeline/98/pipeline

joyeecheung · 2018-06-19T14:34:31Z

src/node.cc

@@ -1581,6 +1582,10 @@ static void GetBinding(const FunctionCallbackInfo<Value>& args) {
  } else if (!strcmp(*module_v, "natives")) {
    exports = Object::New(env->isolate());
    DefineJavaScript(env, exports);
+  } else if (!strcmp(*module_v, "code_cache")) {
+    // internalBinding('code_cache')


I meant to add this to the internalBinding but it is now in the legacy binding for debugging purposes.

devsnek · 2018-06-19T14:37:50Z

what are those process properties for exactly

joyeecheung · 2018-06-19T14:39:02Z

Makefile

+CODE_CACHE_FILE ?= $(CODE_CACHE_DIR)/node_code_cache.cc
+
+.PHONY: with-code-cache
+with-code-cache:


It may be possible to put this into the gyp file, but given that gyp has not been supported by V8 and we are trying to migrate into a new build system, it's easier to put the two-pass build step into Makefile (but to support that on Windows we would need to port to vcbuild.bat)

It may also be possible to generate the file using an executable compiled from C++ that includes node_javascript.cc for the sources and calls V8 APIs to generate the cache instead of using a JS script, like the mk* targets in V8, but to do that we would still need to refactor the gyp file as well so I picked the easier route.

joyeecheung · 2018-06-19T14:45:07Z

> what are those process properties for exactly

@devsnek Right now there are three things added to the process object, all of them for debugging purposes:

- process.binding('code_cache'): a moduleID-to-cache-buffer map. When building without the configure option it is an empty object. I can move it into internalBinding.
- process.cacheAcceptedList: records builtin modules compiled with code cache
- process.cacheRejectedList: records builtin modules compiled without code cache

I can move the two lists somewhere else, or only expose them in debug builds, or just delete them, suggestions are definitely welcomed. Mind that we should avoid doing the bookkeeping by controlling the behavior with any env variables or flags because if we want to introduce the V8 snapshot in the future, we will need NativeModule.prototype.compile not to depend on those things otherwise there will not be much left that can be snapshotted.

joyeecheung · 2018-06-19T14:51:39Z

tools/generate_code_cache.js

+
+for (const key of Object.keys(natives)) {
+  if (key === 'config') continue;
+  const wrapper = [


Currently there is no way to get the native module wrapper in user land, so this is just copy-pasted. We could put the wrapper in a file, transform that in the loader using macros in js2c, and read it off the disk here, or just put a comment here. In any case, if the source does not match, the module would appear in process.cacheRejectedList

require('module').wrapper?

@benjamingr That's the wrapper we use for user land modules. The native modules use NativeModule.wrapper, which is internal and subject to change. We have tests to make sure the internal loaders and their properties cannot be leaked to user land.
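
To illustrate why the wrapper matters, here is a minimal sketch comparing the public user-land wrapper with a copy-pasted one. The `assumedNativeWrapper` array is only an assumed stand-in for illustration; the real NativeModule.wrapper is internal and may differ.

```javascript
const Module = require('module');
const vm = require('vm');

const source = 'module.exports = 42;';

// Public API: the wrapper applied to user-land modules.
const userWrapped = Module.wrap(source);

// Assumed stand-in for the internal wrapper (hypothetical parameter list).
const assumedNativeWrapper = [
  '(function (exports, require, module, process) {',
  '\n});',
];

function wrapNative(src) {
  return assumedNativeWrapper[0] + src + assumedNativeWrapper[1];
}

const nativeWrapped = wrapNative(source);

// Both wrapped sources evaluate to a function, but the texts differ, so a
// cache generated from one text does not describe the other.
const fn = new vm.Script(nativeWrapped, { filename: 'example.js' })
  .runInThisContext();
console.log(typeof fn, userWrapped.length, nativeWrapped.length);
```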

hashseed · 2018-06-19T15:51:08Z

tools/generate_code_cache.js

+
+  const script = new vm.Script(code, {
+    filename: `${key}.js`,
+    produceCachedData: true


This seems to use the old API to produce the cache right after compilation. This either compiles everything, if specified to compile eagerly, or only the top-level function.

With the new API for which @devsnek is writing the binding, you could precisely include the functions you require for startup.

Maybe something to think about for the future :)

@hashseed Yeah this is also why I think it's a good idea to return error information in that PR. For example, if I run this script with --inspect-brk and open the devtools, I would not be able to create any cache at all. I know it's because it's in debug mode, but that's only because I've looked into the API before. If we want to open up tooling opportunities to users, it would be nice to be able to tell them why the code cache cannot be created.

> This either compiles everything, if specified to compile eagerly, or only the top-level function.

FYI: We do not actually support eager compilation through the vm.Script API, so this is only the top-level function.

@TimothyGu We should probably add support for the full range of CompileOptions (and possibly NoCacheReason) to ContextifyScript/vm.Script (in another PR of course)

hashseed · 2018-06-19T18:11:33Z

lib/internal/bootstrap/loaders.js

+        source, this.filename, 0, 0,
+        codeCache[this.id], false, undefined
+      );
+      if (!codeCache[this.id] || script.cachedDataRejected) {


I don't think this should happen. If it does, something bad must have happened. Probably best to crash here.

@hashseed I assume you are talking about the second condition? The first happens when the cache is not built into the binary. Throwing on the second one (if (codeCache[this.id] && script.cachedDataRejected)) sounds reasonable to me.
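
The stricter behaviour proposed in this thread can be sketched as a small decision function. `classifyCompile` is a hypothetical helper written for illustration only; it is not the code in loaders.js.

```javascript
// Decide what to do after compiling a builtin module.
// hasCache: whether a cache entry was built into the binary for this id.
// cachedDataRejected: the flag reported by the compiled script.
function classifyCompile(id, hasCache, cachedDataRejected) {
  if (!hasCache) {
    // Expected when the binary was built without the code cache.
    return 'compiled-without-cache';
  }
  if (cachedDataRejected) {
    // A cache exists but was refused: treat it as a hard error.
    throw new Error(`Code cache for '${id}' was rejected`);
  }
  return 'compiled-with-cache';
}

console.log(classifyCompile('fs', false, false));
console.log(classifyCompile('fs', true, false));
```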

TimothyGu · 2018-06-19T19:42:18Z

tools/generate_code_cache.js

+  } else {
+    const { def, initializer } = getInitalizer(key, Buffer.alloc(0));
+    cacheDefs.push(def);
+    initializers.push(initializer);


Is there a reason why we are still adding the definition and the initializer to the output source file when the cached data failed to generate?

@TimothyGu I think it makes sense to make the build step more lenient. If the cache data cannot be created it'll just be empty and in the current implementation it's going to be added to cacheRejectedList when the loader fails to compile a module with the cache. If we are going to throw on bad caches we can skip the cache if it's an empty buffer, or just fail at the build phase.

Ah I see. Thanks for the explanation.

TimothyGu · 2018-06-19T19:43:37Z

src/node.cc

+    exports = InitModule(env, mod, module);
+  } else {
+    return ThrowIfNoSuchModule(env, *module_v);
+  }


Is there a functional difference here?

@TimothyGu I was adding the code_cache binding to internalBinding but later moved it to process.binding for debugging purposes. This is the legacy of that cut-and-paste. I think we should move it back in the final incarnation of this PR.

benjamingr · 2018-06-21T17:48:05Z

tools/generate_code_cache.js

+const fs = require('fs');
+let count = 0;
+
+function human(num) {


Can we give this a better name?

benjamingr · 2018-06-21T17:51:55Z

tools/generate_code_cache.js

+
+namespace node {
+
+${cacheDefs.join('\n\n')}


Why the double newline? Just debugging?

The double new line will result in an empty new line between definitions, which is what we generally have (one after the current definition, one empty line). Alternatively we could add the newline into the elements of cacheDefs itself.
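
A minimal sketch of the effect of joining with a double newline follows. `getDefinition` and the `static const uint8_t` layout are assumptions made for illustration and are not taken from the actual generator.

```javascript
// Hypothetical helper: turn a module id and its cache bytes into a C++
// array definition.
function getDefinition(key, cache) {
  const defName = key.replace(/[\/-]/g, '_');
  return `static const uint8_t ${defName}_raw[] = {\n` +
         `${Array.from(cache).join(',')}\n};`;
}

const cacheDefs = [
  getDefinition('fs', Buffer.from([1, 2, 3])),
  getDefinition('internal/util', Buffer.from([4, 5])),
];

// Joining with '\n\n' leaves one empty line between definitions.
const output =
  `namespace node {\n\n${cacheDefs.join('\n\n')}\n\n}  // namespace node\n`;
console.log(output);
```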


joyeecheung · 2018-06-25T22:29:39Z

I have updated this PR based on the feedback:

- The cache related internals are now exposed through a special module internal/bootstrap/cache and used in tools/generate_code_cache.js and the code-cache tests. It's not ideal that we expose them into the user land but being guarded by --expose-internals and being mostly clones should be safe enough. Hopefully in the future when we refactor the bootstrap code and the build files to a point where it's easy to do the two-pass build steps with separate targets, we can get rid of this special module (... or not if we want to keep the tests).
- Added a test that verifies the code cache is generated and used
- I still allow the cache to be rejected because that may happen when any dependency of node_js2c is touched and the node_code_cache.cc is not regenerated or compiled. Fixing that requires more surgery of the build files.

Note that this patch still does not affect normal workflows and releases. To use the code cache by default we'll need to make changes to Makefile and vcbuild.bat so that the two-pass build steps are run in those workflows - either that or we need to refactor the gyp files, which I would like to avoid since we are moving away from them, and we may still need to refactor node.cc which is already blocking too many things right now.

I think it's a good start to land this first and iterate on the build files later (ideally tools/generate_code_cache.js would be an executable like mksnapshot). This is ready for reviews now cc @TimothyGu @benjamingr @hashseed

jdalton

Super neat!

jdalton · 2018-06-25T22:44:02Z

lib/internal/bootstrap/loaders.js

@@ -184,7 +223,7 @@
      if (id === loaderId) {
        return false;
      }
-      return NativeModule.exists(id);
+      return id === cacheId || NativeModule.exists(id);


Can it be handled as any other internal (in its own file)? That way it avoids conditional checks in NativeModule.require and NativeModule.nonInternalExists.

@jdalton Good idea, I'll do that

addaleax · 2018-06-25T22:43:35Z

test/code-cache/code-cache.status

@@ -0,0 +1,21 @@
+prefix v8-updates


prefix code-cache? :)

oops, copy paste error :P

addaleax · 2018-06-25T22:45:33Z

tools/generate_code_cache.js

+  v8::Local<v8::Uint8Array> ${defName}_array =
+    v8::Uint8Array::New(${defName}_ab, 0, ${cache.length});
+  target->Set(context,
+              OneByteString(isolate, "${key}"),


Can we use FIXED_ONE_BYTE_STRING since the length is known at compile time?

addaleax · 2018-06-25T22:49:53Z

tools/generate_code_cache.js

+                     `${cache.join(',')}\n};`;
+  const initializer = `
+  v8::Local<v8::ArrayBuffer> ${defName}_ab =
+    v8::ArrayBuffer::New(isolate, ${defName}_raw, ${cache.length});


Not sure, but maybe as a future optimization we could use a single array buffer for all of these?

@addaleax Do you mean creating one buffer and using offsets into it when creating the Uint8Arrays? I think that's doable but I am not sure how big a performance impact that would have.

Yes, that’s what I mean – I wouldn’t expect it to make a huge difference, and it’s definitely not something that needs to be thought about in this PR :)

@joyeecheung For a reference, whenever it's tackled, you can check out v8-compile-cache which uses a single buffer and an offset map.
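
The single-buffer idea can be sketched in JavaScript as follows. `pack` and `view` are hypothetical helpers for illustration; this is not how v8-compile-cache or the generated C++ file is actually laid out.

```javascript
// Pack several per-module caches into one Buffer plus an offset map.
function pack(caches) {
  const offsets = {};
  const chunks = [];
  let offset = 0;
  for (const [id, buf] of Object.entries(caches)) {
    offsets[id] = { offset, length: buf.length };
    chunks.push(buf);
    offset += buf.length;
  }
  return { blob: Buffer.concat(chunks), offsets };
}

// Return a view of one module's cache without copying the bytes.
function view(packed, id) {
  const { offset, length } = packed.offsets[id];
  return packed.blob.subarray(offset, offset + length);
}

const packed = pack({
  fs: Buffer.from([1, 2, 3]),
  path: Buffer.from([4, 5]),
});
console.log(packed.offsets, view(packed, 'path'));
```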

addaleax · 2018-06-25T22:51:06Z

Makefile

@@ -91,6 +91,22 @@ $(NODE_G_EXE): config.gypi out/Makefile
 	$(MAKE) -C out BUILDTYPE=Debug V=$(V)
 	if [ ! -r $@ -o ! -L $@ ]; then ln -fs out/Debug/$(NODE_EXE) $@; fi

+CODE_CACHE_DIR ?= out/$(BUILDTYPE)/obj.target
+CODE_CACHE_FILE ?= $(CODE_CACHE_DIR)/node_code_cache.cc


I think most all generated code is in out/$(BUILDTYPE)/obj/gen?

@addaleax Yes, not sure why I picked obj.target...thanks for pointing that out, I'll fix it

joyeecheung · 2018-06-25T23:16:52Z

@jdalton @addaleax Thanks for the reviews, updated.

~~CI: https://ci.nodejs.org/job/node-test-pull-request/15619/~~

EDIT: forgot to exclude the code-cache in the default tests suites...new CI: https://ci.nodejs.org/job/node-test-pull-request/15620/

hashseed · 2018-06-26T04:36:25Z

> I still allow the cache to be rejected because that may happen when any dependency of node_js2c is touched and the node_code_cache.cc is not regenerated or compiled. Fixing that requires more surgery of the build files.

Relying on V8's code caching sanity checks to catch this is a bad idea. The only check V8 performs wrt source string is that its length is as expected. You could modify the source but keep the length, and V8 will happily accept the cache data. You will end up with confused developers who cannot figure out why their changes to something in lib/ does not have any effect.

Please perform some more surgery to the build files, with this PR or later :)

joyeecheung · 2018-06-26T10:30:12Z

@hashseed Thanks for the reminder, I'll add a FIXME there. I think one way to solve the mismatch issue is to calculate checksums for the source both in js2c and in tools/generate_code_cache.js and compare them when compiling the modules. Simply fixing build dependencies is not enough since the confusion may still arise if the build dependencies somehow get broken and do not work properly.
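
A minimal sketch of the checksum idea, assuming SHA-256 as the hash (the thread does not name one) and hypothetical helper names:

```javascript
const crypto = require('crypto');

// Hash the module source; the digest would be recorded when the cache is
// generated and recomputed when the module is compiled.
function checksum(source) {
  return crypto.createHash('sha256').update(source, 'utf8').digest('hex');
}

function sourceMatches(source, expectedChecksum) {
  return checksum(source) === expectedChecksum;
}

const original = 'exports.answer = 41;';
const edited = 'exports.answer = 42;';  // same length, different content

const recorded = checksum(original);
console.log(original.length === edited.length);  // a length check cannot tell them apart
console.log(sourceMatches(original, recorded));
console.log(sourceMatches(edited, recorded));
```

Unlike a length-only check, the digest changes when the source is edited without changing its length.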

devsnek · 2018-06-26T21:42:03Z

tools/generate_code_cache.js

+  // NativeModule.prototype.compile
+  const script = new vm.Script(code, {
+    filename: `${key}.js`,
+    produceCachedData: true


Script#createCachedData has landed so i wonder if maybe there's another way to do this so we can snapshot them after initial evaluation. (maybe build node with a --produce-snapshots flag?)

@devsnek Did you mean snapshot or code cache? I am not sure what initial evaluation means, right now to make enough progress on the snapshot front we still need some proper refactoring of node.cc and node.js and pick out the parts that can be snapshotted. Code cache is context-free so it's a lower hanging fruit.

yeah sorry for using some confusing terms. I mean we should see if we can produce code caches right after the js runs but before we run user code or smth similar.

@devsnek Does that make a difference though? The options passed to ScriptCompiler would probably still be the same and the script is still unbound.

@joyeecheung we would make a code cache from after the code first evaluated, which as i understand it is essentially a "fast-forward" to that state when you use the code cache

@devsnek Did you mean creating the code cache out of a Function? Otherwise I cannot see a difference if the unbound scripts are produced by passing the same arguments to CompileUnboundScript? I have not checked but I'd expect the code cache produced here and the code cache produced in the loaders to be the same?

@joyeecheung right now this pr creates the caches before evaluation. i'm suggesting we create them after evaluation.

@devsnek Did you mean we can create the code cache after unbound_script->BindToCurrentContext() and bounded_script->Run()? I am still not sure what difference that makes and how to get the core to spit the cache out to a file without relying on any environment variable or flags in loaders.js...maybe we can leave that for a future PR?

Let me clear up some confusion :)

The current PR creates the code cache right after each script compiles. At that time, V8 has only compiled the top-level function. Any inner functions that are called during bootstrapping need to be compiled as we go. We call this lazy compilation. Lazily compiled functions are not part of the code cache.

Script#createCachedData introduces a way to create the code cache at arbitrary time, as long as you have the v8::UnboundScript object. The benefit is that if you call it after you have executed the script for bootstrapping, lazily compiled functions during that execution are now also included in the code cache. Therefore, when bootstrapping with code cache, you no longer need to compile previously lazily compiled functions.

Neither of the two ways of code caching captures state (to fast-forward to). Please do not confuse code cache with startup snapshot :)
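
The difference between the two points in time can be sketched with the public vm API. The script contents and variable names are made up for illustration.

```javascript
const vm = require('vm');

const script = new vm.Script(`
function add(a, b) { return a + b; }
add(1, 2);
`, { filename: 'example.js' });

// Taken right after compilation: add() has not been called yet, so it is
// still only lazily compiled.
const cacheBeforeRun = script.createCachedData();

// Running the script calls add(), which forces it to be compiled.
const result = script.runInThisContext();

// Taken after execution: functions compiled during the run can be included.
const cacheAfterRun = script.createCachedData();

console.log(result, cacheBeforeRun.length, cacheAfterRun.length);
```

Neither buffer stores the value of `result` or any other runtime state; both hold compiled code only.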

joyeecheung · 2018-06-27T04:52:24Z

CI before landing https://ci.nodejs.org/job/node-test-pull-request/15643/

joyeecheung · 2018-06-27T13:23:03Z

Landed in 4750ce2, thanks!

PR-URL: #21405
Reviewed-By: John-David Dalton
Reviewed-By: Anna Henningsen
Reviewed-By: Gus Caplan

Notable changes:

* build:
  * Node.js should now be about 60% faster to start up than the previous version, thanks to the use of V8's code cache feature for core modules. [#21405](#21405)
* dns:
  * An experimental promisified version of the dns module is now available. Give it a try with `require('dns').promises`. [#21264](#21264)
* fs:
  * `fs.lchown` has been undeprecated now that libuv supports it. [#21498](#21498)
* lib:
  * `Atomics.wake` is being renamed to `Atomics.notify` in the ECMAScript specification ([reference](tc39/ecma262#1220)). Since Node.js now has experimental support for worker threads, we are being proactive and added a `notify` alias, while emitting a warning if `wake` is used. [#21413](#21413) [#21518](#21518)
* n-api:
  * Add API for asynchronous functions. [#17887](#17887)
* util:
  * `util.inspect` is now able to return a result instead of throwing when the maximum call stack size is exceeded during inspection. [#20725](#20725)
* vm:
  * Add `script.createCachedData()`. This API replaces the `produceCachedData` option of the `Script` constructor that is now deprecated. [#20300](#20300)
* worker:
  * Support for relative paths has been added to the `Worker` constructor. Paths are interpreted relative to the current working directory. [#21407](#21407)

PR-URL: #21629

nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Jun 19, 2018

joyeecheung requested a review from hashseed June 19, 2018 14:40

joyeecheung added module Issues and PRs related to the module subsystem. build Issues and PRs related to build files or the CI. process Issues and PRs related to the process subsystem. labels Jun 19, 2018

hashseed reviewed Jun 19, 2018

TimothyGu reviewed Jun 19, 2018

benjamingr reviewed Jun 21, 2018

joyeecheung force-pushed the cache-code branch from 1c9f647 to 5862b13 Compare June 25, 2018 22:10

joyeecheung changed the title ~~WIP: speed up startup with V8 code cache~~ build: speed up startup with V8 code cache Jun 25, 2018

joyeecheung force-pushed the cache-code branch from 5862b13 to 005a274 Compare June 25, 2018 22:14

jdalton approved these changes Jun 25, 2018

jdalton reviewed Jun 25, 2018

addaleax approved these changes Jun 25, 2018

joyeecheung added 2 commits June 26, 2018 07:13

fixup! build: speed up startup with V8 code cache

30a3827

fixup! build: speed up startup with V8 code cache

4cca13e

fixup! build: speed up startup with V8 code cache

29cc730

fixup! build: speed up startup with V8 code cache

7c4b001

joyeecheung added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Jun 26, 2018

devsnek reviewed Jun 26, 2018

devsnek approved these changes Jun 27, 2018

joyeecheung closed this Jun 27, 2018

joyeecheung mentioned this pull request Jun 27, 2018

build: tracking issue for V8 code cache intergration #21563

Closed

7 tasks

targos mentioned this pull request Jul 3, 2018

v10.6.0 proposal #21629

Merged

MylesBorins mentioned this pull request Jul 31, 2018

Development kit and Deployment kit releases nodejs/Release#341

Closed

