stdout/stderr buffering considerations #6379
Developers are a smart bunch. With the proposed
There is, every option comes at a cost from both sides. That option would need to be supported, and it would actually just toggle the behaviour between two states that are far from ideal in many cases. As for the option here, the said cyclic buffer maximum size could perhaps be configured using a command-line argument, with a buffer size of 0 turning it into a fully blocking API.
The proposed API: on UNIX it is a trivial system call on an fd. It's pretty much the same thing on Windows, as far as I recall. Edit: see #6297 (comment)
We should not be documenting underscore-prefixed properties. This would need to be a separate, true public API.
In CLI tools, I don't want stdout/stderr to be blocking though; rather, I want to be able to explicitly exit the process and have stdout/stderr flushed before exiting. Has there been any discussion about introducing a way to explicitly exit the process gracefully, meaning flushing stdout/stderr and then exiting? Maybe something like a
The exit module is a drop-in replacement for

Edit: upon further analysis, the
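For illustration, the graceful-exit idea floated above could be sketched roughly like this. This is a minimal sketch, not an existing or proposed Node.js API: `flushed` and `gracefulExit` are made-up names, and the empty-write trick is just one way to detect that queued data has been handed off.

```js
// Sketch of a graceful exit that flushes stdout/stderr first.
// `flushed` and `gracefulExit` are hypothetical helpers, NOT Node APIs.
function flushed(stream) {
  // The callback of a write fires only after all previously queued
  // chunks have been flushed, so an empty write acts as a barrier.
  return new Promise((resolve) => stream.write('', resolve));
}

async function gracefulExit(code = 0) {
  // Wait until both stdio streams have drained, then exit for real.
  await Promise.all([flushed(process.stdout), flushed(process.stderr)]);
  process.exit(code);
}
```

A caller would then use `gracefulExit(0)` in place of `process.exit(0)` wherever truncated output is a concern.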
I think there's a slight discrepancy in interpretation here. Are we talking about console blocking, or console buffering? Albeit similar, these are two different considerations.

Further, I'm not entirely sure what the problem is you're trying to fix, @ChALkeR. Could you elaborate on a problem that is currently being faced? I'm not grokking what it is that's broken here.

One thing to note is that on most systems (usually UNIX systems)

However, an explicit

This is why the following program still outputs while the loop is running, and on most terminal emulators the same reason why it looks like it's 'jittery':

```c
#include <stdio.h>

int main(void) {
  for (int i = 10000000; i > 0; i--) {
    printf("This is a relatively not-short line of text. :) %d\n", i);
  }
  return 0;
}
```

Blocking, on the other hand, doesn't make much sense. What do you want to block? Are you suggesting we
I respectfully disagree. There is absolutely no point in blocking on stdout/stderr. It's very rarely done in the native world, and when it is, it's usually for concurrency problems (two

The above program without

and with

That's an 18.3% time increase on this microbenchmark alone. I'm in the 'time is relative and isn't a great indicator of actual performance' camp, so I included the involuntary context switch count (voluntary context switches would be the same between the two runs, obviously). The results show that

Further, the feature that was requested by @sindresorhus is part of how executables are formed: at least on UNIX systems, all file descriptors that are buffered are flushed prior to/during

I'm still lost as to what is trying to be fixed here... I wouldn't mind seeing a

Also, just to reiterate: please don't even start with a
See the issues I linked above.
No, of course I don't.
@ChALkeR are we actually seeing this happen?

```js
var str = new Array(8000).join('.');
while (true) {
  console.log(str);
}
```

works just fine. In what realistic scenario are we seeing the async queue fill up? The three linked issues are toy programs that are meant to break Node, and would break in most languages, in just about all situations, on all systems (resources are finite, obviously).

The way I personally see it, there are tradeoffs to having coroutine-like execution paradigms, and having slightly higher memory usage is a perfectly acceptable tradeoff for the asynchronicity Node provides. It doesn't handle extreme cases because those cases are well outside any normal or even above-normal scenario. The first case, which mentions a break in execution to flush the async buffer, is perfectly reasonable and expected.

Side note: thanks for clarifying :) that makes more sense. I was hoping it wasn't seriously proposed we
@Qix- Are you saying that memory usage eventually stabilizes when you run this? How much memory does it consume for you?

@ChALkeR ah no, memory usage goes off the charts. Again, though, is this happening in practical environments outside of stress tests?

@Qix- Yes, it does. Users have reported it several times already, see links above.
```js
for (var i = 0; i < 1000000000; i++) {
  value = nextPrime(value);
  console.log(value);
}
```

This is definitely a stress test. This isn't a practical or realistic use of Node.js, an async scripting wrapper for IO-heavy applications. For instance,

```js
function loggit() {
  console.log('hello');
  setImmediate(loggit);
}

loggit();
```

tops out at 15M memory consumption. I've tweaked #3171's code to reflect the changes here, which now hovers around 45M (instead of inflating to over a gig in about a minute).

Of course a stupidly huge while/for loop with a ton of non-blocking IO (not to mention a function that is solving an NP-hard problem) is going to cause problems. Don't do it. It's a simple misunderstanding of how Node works, and if it were up to me I'd brush it off as an error on the developer's part. The second we start baby-sitting bad design like all three of the examples you mentioned and start ad-hoc adding methods like
Either blocking tty writes or a graceful drain of stdout/stderr upon

I think both APIs should be part of node core. Devs should have the option to have stdout/stderr block if they should so choose, or have stdout/stderr gracefully flushed if they find that more appropriate.

Logging to the console 10000000 times in a tight loop is not real-life behavior. In reality, performance-oriented servers rarely log to the console, and when they do it's important that the information is output. But if you did want to log to the console 1e7 times (and run out of memory due to another known node issue) then you would not enable blocking on stdout, so it's a red herring.
stdout/stderr should flush on program exit anyway. It's the fact node uses the

My problem with
Comment just posted in other thread: #6297 (comment)
@kzc yes, that is also a problem.
The idea that this is "expected" for non-blocking streams is fatally flawed: nobody expects logging or stdio to be non-blocking, and if you've changed the file descriptor flags, that's an implementation detail that should be hidden from the outside world. Moreover, buffering constitutes an implicit promise that it will eventually be written, so at least at exit all buffers should be fully written out. Likewise, OOM should trigger memory recovery, including writing out buffers. Delays are less important than broken promises.
Just posting a workaround for this issue for future reference:

```js
for (const stream of [process.stdout, process.stderr]) {
  stream?._handle?.setBlocking?.(true);
}
```

Before:

```console
$ node -p 'process.stdout.write("x".repeat(5e6)); process.exit()' | wc -c
65536
```

After:

```console
$ node -p 'for (const stream of [process.stdout, process.stderr]) stream?._handle?.setBlocking?.(true); process.stdout.write("x".repeat(5e6)); process.exit()' | wc -c
5000000
```
It seems that the phrase "non blocking" has been used in this discussion to mean something different from the POSIX definition, and that may account for some of the confusion. In POSIX terms, buffering is what happens within a process, while non blocking and asynchronous are two different ways of interacting with the kernel.

A non blocking (

In contrast, an asynchronous kernel call will neither succeed nor fail, but rather returns to the running process immediately and subsequently uses a signal to notify the process of the success or failure of the requested operation. That's also not what is being discussed here, but it could be useful as an implementation detail for buffering in userspace.

What is wanted here is the equivalent of the C language's

I would strongly urge avoiding the term "blocking" when defining a new public method to control this, as it unnecessarily muddles the meaning of related terms, cf
@kurahaupo - Irrespective of whether the usage of the term
If I understand this issue correctly, this is the place to post my feedback. I wrote a Node.js Native Messaging host that streams output from

I observed that RSS constantly increases during usage. Perhaps someone in the Node.js realm can explain why this occurs and how to fix this memory leak: https://thankful-butter-handball.glitch.me/.
This issue has been open for nearly 7 years. In the meantime, mitigating work has been done and lots of discussion took place in other issues. I'm going to put this one out to pasture.
For people that are looking for workarounds, many are given here: https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe. The one I chose was https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe/25378#25378, which is to wrap the command you want to execute with the script command, which has a way to unbuffer stdout. All the other solutions in that linked thread required installing a new program, but the
I tried to discuss this some time ago on IRC, but postponed it for quite a long time. Also, I started the discussion of this in #1741, but I would like to extract the more specific discussion into a separate issue.

I may be missing some details, but I will try to give a quick overview here.
Several issues here:
- console.log (e.g. calling it in a loop) could chew up all the memory and die — Why does node/io.js hang when printing to the console inside a loop? #1741, Silly program or memory leak? #2970, Strange memory leaks #3171, memory leaks of console.log #18013.
- console.log has different behavior while printing to a terminal and being redirected to a file. — Why does node/io.js hang when printing to the console inside a loop? #1741 (comment).

As I understand it — the output has an implicit write buffer (as it's non-blocking) of unlimited size.
One approach to fixing this would be to:
For almost all cases, except for the ones that are currently broken, this would behave as a non-blocking buffer (because writes to the buffer are considerably faster than writes from the buffer to file/terminal).
For cases when the data is being piped to the output too quickly and when the output file/terminal does not manage to output it at the same rate — the write would turn into a blocking operation. It would also be blocking at the exit until all the data is written.
Another approach would be to monitor (and limit) the size of data that is contained in the implicit buffer coming from the async queue, and make the operations block when that limit is reached.
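That second approach could be sketched roughly as follows. This is illustrative only, not a proposed implementation: `MAX_BUFFERED` and `writeBounded` are made-up names, and a real version inside core would have to preserve write ordering across deferred chunks.

```js
// Illustrative sketch of a capped implicit buffer: writes below the cap
// stay non-blocking; once the cap is hit, the producer must wait for
// 'drain'. MAX_BUFFERED and writeBounded are hypothetical names.
const MAX_BUFFERED = 16 * 1024; // assumed cap, e.g. 16 KiB

function writeBounded(stream, chunk, callback) {
  if (stream.writableLength < MAX_BUFFERED) {
    // Fast path: behaves exactly like today's non-blocking write.
    return stream.write(chunk, callback);
  }
  // Cap reached: defer this write until the stream drains, which is the
  // moral equivalent of blocking the producer.
  stream.once('drain', () => stream.write(chunk, callback));
  return false;
}
```

With the cap set to 0, every write would take the slow path, matching the fully blocking mode mentioned earlier in the thread.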