Support for chunked transfer encoding #1911

Open
carlos-verdes opened this issue Oct 20, 2023 · 38 comments

@carlos-verdes

Hi, I have a server that responds with an HTTP chunked response so it can send data as soon as it's available (this also reduces memory consumption on the server):
https://en.wikipedia.org/wiki/Chunked_transfer_encoding

When I make a direct call with my browser, I can see the results pop up as they are flushed from the server (as expected). However, if I use hx-get, I don't see anything on the screen until the full response has been sent (not the expected behavior).

I see there is support for SSE and WebSockets; is there a plan to support this feature as well?

@nickchomey
Contributor

nickchomey commented Oct 20, 2023

This doesn't answer your question, but have you considered using the newer Streams API instead of chunked transfer encoding?
https://developer.mozilla.org/en-US/docs/Web/API/Streams_API

You might also consider using a service worker to fix this problem sooner than htmx might get around to it. A service worker intercepts all network requests, so you could handle the streaming yourself as needed (as well as do LOTS more).
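
For reference, here's a minimal sketch of reading a streamed response with the Streams API (untested; the /results endpoint and #output element are placeholders):

// Minimal sketch: read a streamed response incrementally with the Streams API.
async function streamResults() {
  const response = await fetch('/results'); // placeholder endpoint
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const target = document.getElementById('output'); // placeholder target element
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // each chunk arrives as soon as the server flushes it
    target.innerHTML += decoder.decode(value, { stream: true });
  }
}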

@carlos-verdes
Author

I use that API to read from the server when doing SPA development; that's why I'm asking whether htmx will support this feature or not.

The good thing about chunked transfer is that you don't need to change the protocol, and most of htmx actually just works; the only problem is that the behavior is not as expected (htmx waits for the full response to be sent before rendering anything back to the user).

@joetifa2003

I really need this feature (HTTP streaming).

@sjc5
Contributor

sjc5 commented Nov 14, 2023

It would be lovely to see this supported

@jtoppine

+1

Example use case: streaming search results. Instead of complex "infinite scroll" or other convoluted result-delivery schemes, we could just start pushing results to the client as soon as we get the first hits from the database; the client could start loading and rendering images etc. for the first entries in the listing right away. Ohh, it would be so straightforward and beautiful, so old school in the bestest of ways.

@carlos-verdes
Author

I send all collections from my backend using streams, partly to avoid memory pressure on big collections, so for me it's a natural thing to do.

@alexpetros
Collaborator

Had to close the above PR, but I think the existing extension mechanism should be more than sufficient to implement this. Would love it if someone wanted to take that on.
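
For anyone exploring this, here is a rough sketch of how such an extension might hook in, using the documented htmx.defineExtension API (the progress-swap logic is illustrative and untested, not the actual extension):

// Illustrative sketch: re-swap the accumulated partial response on every
// XHR progress event, so chunks appear as the server flushes them.
htmx.defineExtension('chunked-transfer', {
  onEvent: function (name, evt) {
    if (name === 'htmx:beforeRequest') {
      const xhr = evt.detail.xhr;
      xhr.onprogress = function () {
        // responseText accumulates the chunks received so far
        evt.detail.target.innerHTML = xhr.responseText;
      };
    }
  }
});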

@douglasduteil

@alexpetros I'm on it ;)

@carlos-verdes
Author

Are you working on an extension for this, @douglasduteil?
Can you share a link to the PR when it's ready?

@douglasduteil

🎉 https://github.com/douglasduteil/htmx.ext...chunked-transfer 🎉

To @carlos-verdes: it's Christmas time again 🎄

Install

$ npm install htmx.ext...chunked-transfer
<script src="https://unpkg.com/htmx.ext...chunked-transfer/dist/index.js"></script>

Usage

<body hx-ext="chunked-transfer">
  ...
</body>

⚠️ It's a very early version that I'm not using myself

@mattbrandman

I don't know whether there is a plan to add this into the base of htmx or to rely on extensions, but I thought this is an example of the functionality folks are looking for: https://livewire.laravel.com/docs/wire-stream. The ability to append vs. replace is a nice addition as well.

I would bet a good deal of this is around streaming back AI-based content. While using SSE or WebSockets is an option, it adds complexity depending on the surrounding infrastructure. Chunked transfer encoding feels clean because once all the data is sent the connection is closed, whereas an SSE connection has no real "polite close", and WebSockets are a handful in general.

The other simplicity comes from not having to deal with per-client channels for SSE or WebSockets on the server; when you want to send back to only the sender, the WebSocket/SSE solutions feel heavyweight.

@JEBailey

I have an ES5 version of an extension that supports chunked encoding, which I've been using internally at the company I work for.

https://github.com/JEBailey/htmx/blob/master/src/ext/chunked.js

@alexpetros
Collaborator

So I originally closed #2101 because 2.0 was coming up and we weren't going to make that happen in time. I'm seeing some compelling use-cases and @douglasduteil's extension looks like it's been working. Are people using it? What's the case for including this in core?

@fabge

fabge commented Aug 9, 2024

one use case would be a fairly simple chatbot app which supports streaming.
no need for websockets, no need for server-sent events, no need for keeping a connection to the server.
you could simply make a request to the server, the server sends a Transfer-Encoding: chunked response, which will then incrementally be swapped/appended into the respective chat bubble.
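
a minimal sketch of that server side in Node.js (the route, token source, and timing are placeholders):

// minimal sketch: stream a chatbot reply with chunked transfer encoding.
// Node uses Transfer-Encoding: chunked automatically when no
// Content-Length is set; each res.write() flushes a chunk.
const http = require('http');

const fakeTokens = ['Hello', ', ', 'world', '!']; // placeholder token source

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/html' });
  let i = 0;
  const timer = setInterval(() => {
    if (i < fakeTokens.length) {
      res.write(fakeTokens[i++]); // one chunk per token
    } else {
      clearInterval(timer);
      res.end(); // the connection closes when the response ends
    }
  }, 200);
}).listen(3000);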

@runeksvendsen

> So I originally closed #2101 because 2.0 was coming up and we weren't going to make that happen in time. I'm seeing some compelling use-cases and @douglasduteil's extension looks like it's been working. Are people using it? What's the case for including this in core?

Hi @alexpetros,

Yes, I'm using @douglasduteil's extension in https://github.com/runeksvendsen/haskell-function-graph and it's working for me. Thank you @douglasduteil!

The case for including it in core is that it solves a very generic problem: you don't want the user to have to wait for the very last part of your page to be received before the first part of the page is shown. The larger the time difference between the backend having the first and the last result available, the worse this problem is.
In my case, I have a page that includes, in the following order: (1) a list of results, where the first result is usually available to the backend very quickly (within a few milliseconds) and the last result can take an additional ~second to become available; followed by (2) an SVG graph that's slow to generate because it calls out to a CLI executable. Without this extension, the user has to wait around two seconds to see the first results, even though they're available to the backend (and sent to the client) almost immediately.

@jph00

jph00 commented Aug 13, 2024

@alexpetros a lot of our users are asking for chunked transfer support, to make it easier to stream responses from language models and similar tasks.

@alexpetros
Collaborator

@jph00 have you tried the extension? I would like to get some feedback on how the extension is working for people

@nickchomey
Contributor

Could someone please explain how this would be useful as opposed to using the existing SSE extension? I'm surely missing something fundamental about how they work

@jph00

jph00 commented Aug 13, 2024

> Could someone please explain how this would be useful as opposed to using the existing SSE extension? I'm surely missing something fundamental about how they work

I think basically the benefit is that it's simpler.

@schungx

schungx commented Aug 14, 2024

Incidentally, I have a couple of pages where the user can (potentially) load a few thousand rows' worth of data on-screen.

How can I set up the extension such that it, sorta, streams like 50 rows at a time?
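
Something like this hypothetical Node.js handler is what I have in mind (untested sketch; fetchRows and renderRow are placeholders for my own data access and row rendering):

// hypothetical sketch: flush rows in batches of 50, one chunk per batch
async function handleRows(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/html' });
  let batch = [];
  for await (const row of fetchRows()) { // placeholder async row source
    batch.push(renderRow(row)); // placeholder row-to-HTML renderer
    if (batch.length === 50) {
      res.write(batch.join('')); // flush 50 rows as one chunk
      batch = [];
    }
  }
  if (batch.length > 0) res.write(batch.join('')); // flush the remainder
  res.end();
}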

@mattbrandman

I'm using the extension and it works well for my use case. There are a few points that break: for example, using templates with declarative shadow DOM doesn't seem to render the initial components (https://templ.guide/server-side-rendering/streaming). That may have to do with how swapping works; I read somewhere that it doesn't work with innerHTML calls, but I may be wrong.

Having something official would just mean one less extension, potentially more hooks, and less chance of breaking with future updates. I'm also using it with LLM streaming like others!

@runeksvendsen

> Could someone please explain how this would be useful as opposed to using the existing SSE extension? I'm surely missing something fundamental about how they work

> I think basically the benefit is that it's simpler.

Also:

@nickchomey
Contributor

> I think basically the benefit is that it's simpler.

Thanks! Can you do all the same things with chunked transfer encoding as you can with SSE? Specifically,

  • keep a long-lived connection open and periodically send messages to the browser?
  • allow for JavaScript (e.g. htmx or any other script, service worker, etc.) to initiate and receive/process the messages?

@mattbrandman

mattbrandman commented Aug 14, 2024

> Thanks! Can you do all the same things with chunked transfer encoding as you can with SSE? Specifically,
>
>   • keep a long-lived connection open and periodically send messages to the browser?
>   • allow for JavaScript (e.g. htmx or any other script, service worker, etc.) to initiate and receive/process the messages?

I don't believe that you can keep it open for a long time (or at least it's not the norm).

Connections are initiated with a standard PUT/POST/GET etc., which is really nice vs. SSE's GET-only setup. For LLMs specifically it's almost always a non-GET request.

@nickchomey
Contributor

> I don't believe that you can keep it open for a long time (or at least it's not the norm).
>
> Connections are initiated with a standard PUT/POST/GET etc., which is really nice vs. SSE's GET-only setup. For LLMs specifically it's almost always a non-GET request.

You can initiate SSE with POST etc. Here are two libraries that make it easy:

https://github.com/rexxars/eventsource-client
https://github.com/Azure/fetch-event-source

@runeksvendsen

> Thanks! Can you do all the same things with chunked transfer encoding as you can with SSE?

No, you definitely cannot. Chunked transfer encoding is just a way to transfer data from a server to a client (and not the other way around).

That's why it doesn't require JavaScript and supports ancient browsers.
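
For reference, a chunked response is plain HTTP/1.1: each chunk is prefixed with its size in hex, and a zero-size chunk terminates the body. For example:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

5
Hello
6
 world
0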

@fabge

fabge commented Aug 15, 2024

> That's why it doesn't require JavaScript and supports ancient browsers.

which is exactly why it should be supported by default 🙏

@nickchomey
Contributor

nickchomey commented Aug 16, 2024

But htmx IS JavaScript... And given that some have confirmed that SSE is a more robust and functional protocol than chunked transfers, what's the point of this request?

The only thing I can perhaps think of is that chunked transfers can return binary streams whereas SSE is only text. But no one has brought that up as a use case.

And it's not like SSE is complicated to set up - it, too, is part of the HTTP protocol, so you literally just use a different header.
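
For comparison, an SSE response is also just a streamed HTTP response with a different content type and line-based framing:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache

data: first message

data: second message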

I'm just very perplexed, but quite open to being educated on what I'm missing about why this is useful, let alone needed.

@fabge

fabge commented Aug 18, 2024

sse - just like websockets - establishes a long-lived connection to the server.

let's take the example of a simple chatbot with a backend deployed on a serverless platform (e.g. aws lambda), where the backend runs just a couple of milliseconds at a time (up to a couple of minutes at most).

if we want to have a streaming response, the simplest approach would be to use chunked transfer of the content. this would mean not needing any persistent connection (websockets or sse) between the client and the server.

e.g. it would be a simple POST request with the message from the client side and a chunked transfer of the response from the server, as in the sketch below.
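
a sketch of that client side with the chunked-transfer extension (untested; /chat and #reply are placeholders, and i'm assuming the extension swaps each chunk into the target as it arrives):

<body hx-ext="chunked-transfer">
  <form hx-post="/chat" hx-target="#reply" hx-swap="innerHTML">
    <input type="text" name="message">
    <button type="submit">send</button>
  </form>
  <div id="reply"><!-- the streamed reply is swapped in here chunk by chunk --></div>
</body>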

@nickchomey
Contributor

nickchomey commented Aug 18, 2024

Thanks - what I was missing was that the use case is not long-lived server push. It's just looking for a way to send a streamed response to a user-initiated request, and then close the connection when done. The streaming part of this is what had me focusing on the SSE/WS stuff, but it's more a matter of push vs. pull.

@fabge

fabge commented Aug 18, 2024

yes! and also reducing complexity when the only thing you want is a streaming response. sse and websockets can be quite intimidating at first, i think.

@runeksvendsen

> But htmx IS JavaScript... And given that some have confirmed that SSE is a more robust and functional protocol than chunked transfers, what's the point of this request?
>
> [...]
>
> I'm just very perplexed, but quite open to being educated on what I'm missing about why this is useful, let alone needed.

The main reason I think it should be included in htmx core is that not including it breaks progressive enhancement for chunked transfer encoding.

As I explain in my duplicate issue (#2789), the degree of brokenness varies depending on how long it takes the server to close the connection, but in the worst case it's the difference between a boosted link never displaying any content when htmx is enabled, versus content being displayed immediately when it's disabled.

@carlos-verdes
Author

I found another reason. Someone said you can achieve the same feature with SSE; however, an SSE connection is meant to stay alive for as long as the component is rendered.

In a classic search example, where you want to send results to the browser as they are found (for user experience and to avoid memory pressure in the backend), using SSE causes the browser to reconnect once the last result is sent (since the channel is closed on the server), making it look like "infinite results" keep coming, when in reality it's just the same query being executed again and again.

@wjkoh

wjkoh commented Oct 29, 2024

I found another htmx extension that does something similar to chunked transfer encoding: https://github.com/alarbada/htmx-stream. Thanks to @alarbada, it's very simple to display progressively generated content over a single GET or POST request. I would really love it if this HTTP/1.1 functionality were supported by htmx by default.

@alarbada

alarbada commented Oct 29, 2024

Hello @wjkoh. My very-much-alpha project works somewhat similarly to SSE, but I did run into the difficulties that @carlos-verdes described. In a way it is much easier to use, as you don't need to keep any SSE connection open.

You could think of it as React server components: new HTML is streamed from the server. I did find that replacing the whole thing, instead of appending chunks, is easier to reason about from the server side.

I did not continue with that project as I'm not using htmx anymore; if anybody is interested, feel free to fork.

@carlos-verdes
Author

@alarbada can you mention which framework you are using now?
I really like htmx, I'm just having issues with this specific point.

@carlos-verdes
Author

@alarbada you honestly saved my life. I copy-pasted your code and just changed the behavior to append instead of doing a full rewrite; now I can just send chunked responses and it works as expected!

@alarbada

@carlos-verdes man, I changed my stack back and forth this year; the current thing I've settled on is Go + my own thing that compiles handlers into a TS SDK + Astro with htmx and SolidJS.

I know it looks like a pile of random tools, but in my pet project it works like a charm :D

htmx is my SPA router, SolidJS is for client-side interactivity, Astro is the backbone of it all, and Go is there because I hate backend JS with a passion.

If I were to build this streaming thing with this stack, I would just stream JSON from Go and manually parse it in SolidJS; and because htmx doesn't reload the page, my SolidJS state is still there, even if the SolidJS HTML is replaced by htmx.
