Post-mortem out of memory analysis #239
Comments
Not in node core, but I tend to use https://github.com/nodejs/llnode to diagnose coredumps produced during OOM (it probably needs a few tweaks to generate coredumps in docker when the process crashes though). You can at least see the stacktrace (C++ and JS) when the OOM happens, which tends to be useful (unless the process is not running the code that causes the problem when it crashes). You can also scan the heap and find the types with the most instances, which is also helpful most of the time. |
Thanks! I have seen that one in the past but found it to be a bit too low-level for my purposes. |
@paulrutter There is an experimental JS API of llnode https://github.com/nodejs/llnode/blob/master/JSAPI.md which allows you to install it as a normal C++ Node.js addon and use the JS API to access info in the core dump (it's not yet completed, of course). There is also https://github.com/nodejs/node-report which can dump a summary when the process crashes, probably a bit closer to the Java heapdump experience. The problem with heapdumps is that they only include information from the VM's managed heap, but sometimes the cause may come from the native layer, so it's sometimes inevitable to dig into there. |
Also, node-report is being integrated into core (nodejs/node#22712). The current implementation allows you to use command line flags to configure when the report will be generated. |
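For readers on later Node.js releases, where the report functionality is exposed as `process.report`, here is a minimal sketch of reading the heap summary from a report in-process. The helper name `summarizeReport` and the choice of fields are assumptions made for illustration; the report itself contains much more (native stack, libuv handles, environment). Reports can also be generated on a fatal error such as an OOM with the `--report-on-fatalerror` flag.

```javascript
'use strict';

// Hypothetical helper: extracts a few heap-related fields from the
// diagnostic report object returned by process.report.getReport().
function summarizeReport() {
  const report = process.report.getReport();
  return {
    nodeVersion: report.header.nodejsVersion,
    heapTotalBytes: report.javascriptHeap.totalMemory,
    heapUsedBytes: report.javascriptHeap.usedMemory,
    // Names of the V8 heap spaces listed in the report.
    heapSpaces: Object.keys(report.javascriptHeap.heapSpaces),
  };
}

console.log(summarizeReport());
```

As noted elsewhere in the thread, the report gives per-space totals only, not the objects on the heap, so it complements a heap snapshot rather than replacing one.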
Thanks, that's very helpful. I'll take a look into node-report to see if that's helpful 👍 |
Another useful tool is https://github.com/v8/sampling-heap-profiler, which shows allocations per function call. This tool is also better suited to production scenarios because taking heapdumps is expensive (high memory consumption) and slow, although that shouldn't be a problem if the process keeps memory usage around 80MB. |
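The same sampling profiler can also be driven through the built-in `inspector` module, without the external package. The following is a sketch under assumptions: the helper names `sampleAllocations` and `totalSampledBytes` and the 32 KiB sampling interval are made up for illustration, while `HeapProfiler.startSampling` and `HeapProfiler.stopSampling` are the inspector protocol methods.

```javascript
'use strict';
const inspector = require('inspector');

// Runs `work` while V8's sampling heap profiler is active and passes the
// resulting profile (a tree of call frames with sampled sizes) to `done`.
function sampleAllocations(work, done) {
  const session = new inspector.Session();
  session.connect();
  // samplingInterval: average number of bytes between samples (assumed value).
  session.post('HeapProfiler.startSampling', { samplingInterval: 32768 }, (startErr) => {
    if (startErr) {
      session.disconnect();
      return done(startErr);
    }
    work();
    session.post('HeapProfiler.stopSampling', (stopErr, result) => {
      session.disconnect();
      done(stopErr, stopErr ? undefined : result.profile);
    });
  });
}

// Sums the sampled self sizes over the whole profile tree.
function totalSampledBytes(node) {
  return node.children.reduce(
    (sum, child) => sum + totalSampledBytes(child),
    node.selfSize
  );
}

sampleAllocations(
  () => {
    const keep = [];
    for (let i = 0; i < 10000; i++) keep.push(new Array(100).fill(i));
    return keep.length;
  },
  (err, profile) => {
    if (err) throw err;
    console.log('sampled bytes:', totalSampledBytes(profile.head));
  }
);
```

Because only a sample of allocations is recorded, the overhead is much lower than a full heap snapshot, at the cost of an approximate picture.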
Thanks, that's another one not known to me until now. |
@paulrutter please let me know your experiences with node-report. I would like to hear about what worked well as well as other information that might be useful in node-report. In my upcoming session at NodeConfEU, one of the demos uses node-report to help when you have an OOM. |
Hi @mhdawson, I tried node-report yesterday (the fatalerror.js example included in the source), and although this is definitely helpful information in case of an OOM, I still miss information about what the heap contained at the time. |
@paulrutter taking a heap snapshot is also one of the core diagnostics capabilities that I believe should be in core. We have a PR working to get node-report into core itself (in a slightly different form). After that I want to look at the best practice for getting a heapdump. @richardlau at one point I had thought node-report was going to enable triggering a heap dump as well, but maybe that was https://github.com/RuntimeTools/appmetrics instead. You can get a heapdump with the https://www.npmjs.com/package/heapdump module and also through the inspector APIs (I've not yet had time to validate the latter myself). |
Thanks! For the latter, we have now created our own module called node-oom-heapdump, which works very well for us. But it would be nice to have this functionality in core, like you mentioned. Just a matter of time, I guess? I could help with a pull request for node-report, if that would help? |
@mhdawson I have a feeling it was appmetrics and that ended up using the heapdump module. |
@paulrutter you mean a PR to add heapdump trigger support in node-report or something else? |
By the way, the heap snapshot doesn't seem to work on larger heaps: https://bugs.chromium.org/p/chromium/issues/detail?id=826697 |
@paulrutter if you are interested we might collaborate on the path of having a heapdump solution in core. I'm quite interested in that but have just not had time to make progress. One thing people have mentioned is that it is possible through the inspector API, so the first step is to understand that. If it's sufficient, then we document it as part of the diagnostics group's best practices, and if not, we make the case for what we think should be in core. Adding @gireeshpunathil as another interested party. |
@mhdawson That is certainly possible. This is the snippet I used to generate the heap snapshots for the screenshots in nodejs/node#23072:

'use strict';
const http2 = require('http2');
const inspector = require('inspector');
const session = new inspector.Session();
session.connect();
const fs = require('fs');

function createSnapshot() {
  let buf = '';
  // The snapshot is delivered in chunks; accumulate them in memory.
  session.on('HeapProfiler.addHeapSnapshotChunk', ({
    method, params: { chunk }
  }) => {
    console.log('addHeapSnapshotChunk', chunk.length);
    buf += chunk;
  });
  session.on('HeapProfiler.reportHeapSnapshotProgress', (progress) => {
    console.log('reportHeapSnapshotProgress', progress);
  });
  // The callback runs once the whole snapshot has been sent.
  session.post('HeapProfiler.takeHeapSnapshot', {
    reportProgress: true
  }, () => {
    console.log('Writing snapshot');
    fs.writeFileSync('./heap.heapsnapshot', buf);
  });
}

const server = http2.createServer();
server.on('stream', (stream) => {
  stream.respondWithFile(__filename);
});
server.listen(0, () => {
  const client = http2.connect(`http://localhost:${server.address().port}`);
  const req = client.request();
  req.on('response', () => {
    createSnapshot();
  });
  req.resume();
  req.on('end', () => {
    client.close();
    server.close();
  });
  req.end();
}); |
@joyeecheung the key thing for me is that you should be able to trigger heapdumps without having to change your code. What I thought was being suggested was to connect through the external inspector interface and request a heapdump. The potential issue with that is whether it is acceptable/practical, from a security standpoint, to have that enabled/accessible in production. One other question: for your example above, did you need to enable the inspector through a command line option, or is simply requiring it like you show enough? |
@mhdawson Without code changes we will probably have to achieve this through certain IPC mechanisms. AFAIK userland tends to use signals for this (so does node-report). The snippet above just needs to be run with the node executable, without any flags, to produce a heapdump. |
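To illustrate the signal-based approach mentioned above, here is a minimal sketch. It assumes a later Node.js release in which `v8.writeHeapSnapshot()` is available; the helper name `installSnapshotSignal`, the choice of `SIGUSR2` and the injectable `write` parameter are assumptions for illustration. It still has to be loaded into the process somehow, for example preloaded with `node -r`, so it is not entirely free of code or configuration changes.

```javascript
'use strict';
const v8 = require('v8');

// Hypothetical helper: writes a heap snapshot whenever `signal` is received.
// Returns a function that removes the handler again.
function installSnapshotSignal(signal = 'SIGUSR2', write = v8.writeHeapSnapshot) {
  const handler = () => {
    // Blocks the event loop until the snapshot has been written.
    const file = write();
    console.error(`heap snapshot written to ${file}`);
  };
  process.on(signal, handler);
  return () => process.removeListener(signal, handler);
}

// Usage: install the handler, then run `kill -USR2 <pid>` from a shell.
// The returned function removes the handler again.
const uninstall = installSnapshotSignal();
uninstall();
```

Signals are not available in the same way on Windows, so this sketch targets POSIX platforms only.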
To be clear, we're talking about two separate things here: I would be happy to collaborate on both topics. I could possibly get some time for this within the company I work for, because it benefits them as well. Let me know if my summary checks out for you as well, and if so, we can go from there. |
I like these ideas but I'm still concerned about performance issues and memory footprint while generating heap dumps. AFAIK, taking a heapdump of a large heap takes too much time, uses at least the same amount of memory as is already allocated for the heap, and can't be done in a separate thread (i.e. it is a blocking operation which will stop the event loop). These issues won't matter when dealing with smaller heaps, but most Node.js users will experience OOM only after they're using more than 1GB (which is the default max-old-space-size setting). I think we need to solve these problems before integrating this into core. |
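One way to act on this concern is to check how full the heap is before deciding to take a snapshot. The sketch below uses `v8.getHeapStatistics()`; the helper names `heapPressure` and `isSnapshotAffordable` and the 50% threshold are assumptions for illustration, not a recommendation from this thread.

```javascript
'use strict';
const v8 = require('v8');

// Reports how much of the configured V8 heap limit is currently in use.
function heapPressure() {
  const stats = v8.getHeapStatistics();
  return {
    usedBytes: stats.used_heap_size,
    limitBytes: stats.heap_size_limit,
    ratio: stats.used_heap_size / stats.heap_size_limit,
  };
}

// Hypothetical policy: since a snapshot may need roughly as much extra
// memory as the heap already uses, only allow it below `maxRatio`.
function isSnapshotAffordable(maxRatio = 0.5) {
  return heapPressure().ratio < maxRatio;
}

console.log(heapPressure(), isSnapshotAffordable());
```

This only looks at the V8 heap limit; in a container the memory limit of the container itself may be hit first.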
That certainly is an issue for bullet 1, but not for 2. In case of OoM, blocking the event loop for a heapdump doesn't sound like an issue to me? The process is dead anyway. |
This is already possible in JS land through the inspector API. If the request is for something that directly calls into For bullet 2 I would prefer something more general that allows users to configure actions on OOM other than just taking a heapdump - but again we can only find out how comfortable collaborators feel about this with an issue in the core repo, since that's where the code will land. |
OK, that clarifies things! That is indeed what I mean for bullet 1: just an API for creating heapdumps. I will test next week and create an issue to move that to a more public API, or document it somewhere. For bullet 2, making it configurable would make it more flexible, I agree. Maybe the ability to hook into SetOOMErrorHandler from JS code would then be enough? Something like process.on('fatal_error') or similar? |
It’s an internal API for testing so I believe documenting it (without moving it to the public API surface) is not really on the table. |
I created nodejs/node#23328 for creating heapdumps in core. |
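For reference, later Node.js releases expose heap snapshots directly through the `v8` module. The following is a minimal sketch assuming such a release; the helper name `dumpHeap` and the file naming scheme are assumptions for illustration.

```javascript
'use strict';
const os = require('os');
const path = require('path');
const v8 = require('v8');

// Hypothetical helper: writes a heap snapshot of the current process into
// `dir` and returns the path of the file that was written.
function dumpHeap(dir = os.tmpdir()) {
  const target = path.join(dir, `heap-${process.pid}-${Date.now()}.heapsnapshot`);
  // Synchronous: blocks the event loop until the snapshot is on disk.
  return v8.writeHeapSnapshot(target);
}

console.log('snapshot written to', dumpHeap());
```

The resulting file can be loaded in the Memory tab of Chrome DevTools, like the snapshots produced through the inspector API.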
And the follow up issue here: nodejs/node#27552 |
Hi,
While reading the readme on https://github.com/nodejs/diagnostics, I saw that one of the domains is "Heap and Memory Analysis", but I cannot find anything about post-mortem out of memory analysis.
In our use case, we run several hundred Node.js processes (in a Docker container), which are restricted to run with very low memory (about 80MB). When these go out of memory, we want to be able to analyze what was on the heap when it happened.
For that purpose, we created "node-oom-heapdump" (https://github.com/blueconic/node-oom-heapdump), which does exactly that.
Is post-mortem out of memory analysis planned to be included in the Node.js core?