-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stream: mutable highwatermark #33346
stream: mutable highwatermark #33346
Conversation
As far as the original issue goes, perhaps it would be enough to instead pass in |
yep, I was intending to do that. But just seemed like a hack than a solution. And it was suggested that I just make them mutable. |
I see it as the opposite. Being able to dynamically change the watermark after stream creation doesn't make much sense and I've never seen a real use case for it. Specifying the |
I am not sure why the checks failed. I don't see any error message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making it official for the reasons previously stated. additionally because i think we should strive to not "pollute" streams objects as much as possible, even if the chance of a property clashing is unlikely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Soo … I agree that it seems odd to be able to change the objectMode
flag on a stream after creation – conceptually, whether a stream is in object mode or not is part of its type and should always be known by the time its constructor is called.
However, for the highWaterMark
, I see no reason not to make it adjustable, and just because @mscdex has “never seen a real use case for it” doesn’t mean that it isn’t a sensible thing to add. The HWM is mostly relevant for the producer/consumer of a stream, which is often different from the creator of the stream and thus not necessarily in control over constructor arguments.
I don’t think property clashing is a realistic argument here.
@nodejs/streams @nodejs/tsc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm -1 in changing objectMode
. I'm ok in adding a way to change the highWaterMark
.
However, instead of adding a new method, we should just add a set
accessor to
Lines 1087 to 1092 in 8a6fab0
readableHighWaterMark: { | |
enumerable: false, | |
get: function() { | |
return this._readableState.highWaterMark; | |
} | |
}, |
@mcollina why are you against objectMode? because I can see use case of wanting to use objectMode over the network. |
Can you please articulate why you'd want to change objectMode? Changing it means that downstream or upstream should be aware of the change. Morever, it means that it will possible to change this for streams that are binary-only by default. It also changes how encoding works etc. |
wouldn't the client receive chunks of object when requesting (http.request) if objectMode is not set to true? |
var http = require('http');
const { Readable, Writable } = require('stream')
let i = 0;
const myReadable = new Readable({
objectMode: true,
read(size) {
if(i < 3){
this.push(Buffer.alloc(65536));
} else {
this.push(null);
}
i++;
}
});
const myWritable = new Writable({
objectMode: true,
write(chunk, encoding, callback) {
console.log("write length: ", chunk.length);
callback();
}
});
var server = http.createServer(function (req, res) {
myReadable.pipe(res);
});
server.listen(8000, '127.0.0.1');
http.get('http://localhost:8000', res => {
res.pipe(myWritable);
}); Output:
The idea behind making objectMode mutable was to "fix" this issue. Where each objects are broken down into chunks even tho both |
@0807Jpatel Sorry, I'm confused here. What is it we're trying to solve? The changing of |
In your example @0807Jpatel, the code is not really broken. In I agree with the others, making |
@jasnell @ronag, Sorry. My fault it seems like I understood the objectMode wrong. |
Correct, most streams operate in byte-mode ( |
@jasnell could you please explain why we don't support objectMode for outgoing message, because to me it makes sense that user would want to send objects through SSE. |
Not true. A trick I use all the time it to emit strings or buffer in object mode, and then pipe that into a Writable in binary. |
Yeah, we do have tests that check that piping between object-mode/non-object-mode streams works so long as there’s only Buffers/strings involved :) |
@0807Jpatel I think you might be mixing up a few things here:
In order to send an object over HTTP - you must first specify how Node.js should translate it into bytes. This is done by serializing it into a data exchange format like protocol buffers or JSON. Now your use case isn't unreasonable. Some frameworks (asp.net WebAPI comes to mind) perform "content negotiation" - based on the incoming message and serialize the stream in an output format they think the client understands (still in bytes). Streams solve a very real case for Node and most Node programs build on them (file streams, network streams etc). They serve a fundamental and crucial role in the platform. Now - what about object mode? If I understand correctly it was a happy accident (@Raynos asked for them?). Object mode is certainly useful but it doesn't serve the same "building block" IMO and doesn't belong in the same layer. An object mode "stream" for HTTP would need to perform said "content negotiation" with the client and switching in the middle of a request would be confusing. Mostly since the stream would need to know how to serialize its entities in order to do so correctly (and not all objects are serializable and certainly not in every representation). As for this PR, I agree with @addaleax's comment #33346 (review) and I echo her use case of streams whose creation users don't control. There are plenty of examples, here is one related to a use case I was discussing with Matteo:
I could pass a stream in instead of a string (that's how I worked around it), but I can definitely see the use case here and the solution I ended up owith was very unergonomic. |
doc/api/stream.md
Outdated
@@ -568,6 +568,24 @@ added: v12.3.0 | |||
|
|||
Getter for the property `objectMode` of a given `Writable` stream. | |||
|
|||
##### `writable.updateWritableHighwaterMark(highwaterMark)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can actually see a case for this part and updating the watermarks. I even needed to use the lowWatermark at some point for a specific use case at my old job and was sad to find out we don't have one anymore :]
I think this sort of change would need to come with a concrete use case of why I'd update the watermarks.
The original use case for This is a Writable stream of bytes ( or utf8 strings ) and a Readable stream of javascript objects I don't believe node core has a use case for I believe I can't imagine any use case for changing it after the creation of a stream. If you want to know another fun story about why |
For changing |
@ronag I think the major issue is that none of are are completely sure about this :D I have an example use case here at the bottom #33346 (comment) To be fair at my last job (building stuff for p2p CDNs) this sort of "games" with watermarks were extremely common and I ran into a lot of issues with watermarks I worked around - so my point of view is probably very biased towards one use case where really accurate control of the amount of buffering is important. For example - I ran into the lack of low watermarks since Node.js 0.9 and maintained only to be the second person to ask for it if someone ever did - but no one did while I was working on the p2p stuff so I never asked for it :] |
You can already raise the Which makes me wonder why we need https://github.com/nodejs/node/pull/33346/files#diff-ba6a0df0f5212f5cba5ca5179e209a17R1096 (which is my primary concern). Likewise, I'm uncertain why exactly we need the corresponding condition in
Note that sometimes these kind of things are a chain of streams and you are just receiving the head ot the chain. Thinking that changing the watermark will do what you expected is going to be unfortunate and confusing. e.g. function myApi(...) {
return pipeline(readFromFile(...), someTransform())
}
const src = myApi()
src.highWaterMark = 1e9 // this will only change the last in the pipeline chain... if it doesn't kill it with the current PR :/ |
Technically that would work but that's not the same thing as just setting the high watermark. In my use case I don't actually want to read anything yet and I don't want to pull the data - I just want to set up the amount of buffering. Asking users to One can argue these use cases are so unique they don't justify complicating our code or API - I will concede to that if you feel strongly it doesn't justify the added complexity.
Yes, that's actually the point David raised (of double buffering and watermarks) in the deno streams discussion I linked to above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dismissing my review in the meantime so this doesn't land by accident until I am convinced again this is a good idea and discussions have exhausted.
To be clear - the actual code and changes LGTM and I am still +1 on this changes, but there are ongoing discussions and I want to make sure this doesn't get merged (with the 2 LGTMs) while they are happening.
I think configuration of watermarks should be done during creation to make it consistent and work without edge cases. So with @benjamingr fork example #33346 (review). I think the watermark is an option you pass to |
The problem I see is even if configuration of watermarks happens during creation that doesn't ensure no double buffering occurs or make any guarantees mutable watermarks don't have. Maybe @0807Jpatel can share what use case they had that these changes address? I want more voices and use cases to be heard other than mine here :] |
I'm -0 with the |
I agree on @ronag that previous solution made more sense, but it would definitely lead to some bad practice. AFAICT there should not be any reason to change highWaterMark while reading/writing. Our use case is actually very simple. we need to reduce the highwatermark value of response. var server = http.createServer(function (request, response) {
response.writableHighWaterMark = 5 * 1024;
myReadable.pipe(response);
}); |
Neither do I. But |
The whole idea behind this is to change highWaterMark before you start doing read/write that also includes pipe. |
|
Why do you want to reduce it? Reduce memory usage? Can we think of some other way to solve this? e.g. a watermark option to Another consideration would be to add it as an option to |
@ronag we have gone full circle here. This thread started with that exact suggestion. When I asked for the feature request, it was to add IMO we should make |
@0807Jpatel hey, just wanted to apologize for the (probably) frustrating process of being asked to do multiple contradicting things :] I hope it's clear that this is genuinely because we are honestly not sure what the right thing here (as a project) is. I think there was a misunderstanding (in our user base and in core) regarding what Here are some things we may want to do to move this forward:
I'll happily help facilitate either of these 4 options (or an alternative I haven't considered yet). You are also welcome to send me an email (all our emails are in the README). Intuitively I recommend the first one (so more people have a chance to attend). |
@0807Jpatel any particular preference to either of the options? |
@benjamingr I think making highwaterMark is what I would prefer. Someone has done a patch work for http.request to take in highwatermark as options. The issue with that route is we will eventually see a use case that can only be solve by making highwater mark mutable, and all these patch works will still need to be part of the code. |
@nodejs/tsc can you please provide guidance about how to proceed here? This isn't "stalled" and there are no fundamental disagreements, we need some way to:
Right now we have a first-time-contributor who has opened several PRs attempting to resolve this and we have given them confusing/conflicting comments in several places. I'm a bit frustrated that this has been their experience here. (It's a tricky change that's fair) |
I'm more than happy to further discuss this. Though I think what OP is actually trying to solve here has gotten a bit lost (see refs). I believe there are a few different things here: More backpressure on server incoming message (#30107). Which lead to (#30107):
Which lead to this PR:
@benjamingr FYI the OP's referenced original issue #32731 has been resolved/closed #32731 (comment) |
@ronag sorry that close was premature, transform did not solve the issue we are facing. We are just trying to reduce highWaterMark of
So I think making highwaterMark mutable is better, as we have seen couple of use cases already and no significant draw back. I think most of us in this thread agree that it is okay to add this, we are just trying to figure out whats the best way to implement this. |
This comment has been minimized.
This comment has been minimized.
What is the issue you are facing and why did transform not resolve it? From what I understood you just wanted to create more backpressure?
I'm not blocking this but I'm skeptical of its usefulness and worried about complexity and user confusion. Since the highwatermark is kind of a hint... writing code that actually depends on the observable behavior of the highwatermark is something I would discourage (unless in objectMode).
I'm not sure I followed the reason for not going down this track? |
I would actually recommend this (independently of this patch). Thinking about this a bit more, It would be really hard to support this for |
8ae28ff
to
2935f72
Compare
not up to sync on this and don't want to block by accident
I'm guessing that this PR, being quite old, should be closed. |
added methods to update highWaterMark and objectMode values of Readable and Writable streams.
Refs: #32731
Can also be used for following: #30107
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes