Llama stream multiple models supported #3505
Conversation
This is fantastic, but I think it's causing some merge conflicts with some of your earlier changes. Can you rebase off `main`?
Or you can provide multiple messages. Nb. The default is for messages to be submitted in Llama2 format, if you are using a different backend model with `node-llama-cpp` then it is possible to specify the model format to use using the `streamingModel` option. The supported formats are:
- `llama2` - A Llama2 model, this is the default.
- `chatML` - A ChatML model
- `falcon` - A Falcon model
Why are the indents different for each?
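To make the documented option concrete, here is a minimal usage sketch based on the description above; the import paths, model path, and message classes are assumptions about the LangChain.js version this PR targets, not code from the PR itself:

```typescript
import { LlamaCpp } from "langchain/llms/llama_cpp";
import { SystemMessage, HumanMessage } from "langchain/schema";

// Assumed local model path; adjust to your environment.
const model = new LlamaCpp({ modelPath: "/path/to/model.gguf" });

// `streamingModel` is a call option, so it is passed per invocation.
// It defaults to "llama2" when omitted.
const stream = await model.stream(
  [
    new SystemMessage("You are a pirate."),
    new HumanMessage("Where is the treasure buried?"),
  ],
  { streamingModel: "chatML" }
);

for await (const chunk of stream) {
  process.stdout.write(chunk);
}
```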
```
@@ -29,8 +30,14 @@ export interface LlamaCppInputs
export interface LlamaCppCallOptions extends BaseLanguageModelCallOptions {
  /** The maximum number of tokens the response should contain. */
  maxTokens?: number;
  /** A function called when matching the provided token array */
  onToken?: (tokens: number[]) => void;
```
This is not going to be backwards compatible. Instead, we should mark it as deprecated.
The `onToken` callback was included in error and was not actually applied, as the eventual implementation used a raw call. I can mark it as deprecated and unused if preferred.
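If the deprecation route is taken, a minimal sketch of how that could look; the doc-comment wording and the import path are assumptions, not the PR's final code:

```typescript
import { BaseLanguageModelCallOptions } from "langchain/base_language";

export interface LlamaCppCallOptions extends BaseLanguageModelCallOptions {
  /** The maximum number of tokens the response should contain. */
  maxTokens?: number;
  /**
   * @deprecated Included in error and never applied; the streaming
   * implementation uses a raw call instead. Kept only for backwards
   * compatibility.
   */
  onToken?: (tokens: number[]) => void;
}
```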
```
   * 'falcon' - A Falcon model
   * 'genral' - Any other model, uses "### Human\n", "### Assistant\n" format
   */
  streamingModel?: string;
```
Use a union type or a check in the constructor that verifies the input is one of these?
Sounds like a good idea, I'll move this up to the constructor.
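For illustration, a minimal sketch of the two suggestions combined; the type alias, constant, and class names here are made up for the example and are not the PR's actual identifiers:

```typescript
// A union type narrows the option at compile time...
export type LlamaStreamingModel = "llama2" | "chatML" | "falcon" | "general";

// ...and a constructor check rejects bad values at runtime.
const SUPPORTED_STREAMING_MODELS: readonly string[] = [
  "llama2",
  "chatML",
  "falcon",
  "general",
];

class LlamaCppSketch {
  streamingModel: LlamaStreamingModel;

  constructor(inputs: { streamingModel?: string }) {
    const format = inputs.streamingModel ?? "llama2";
    if (!SUPPORTED_STREAMING_MODELS.includes(format)) {
      throw new Error(
        `Unsupported streamingModel "${format}"; expected one of: ${SUPPORTED_STREAMING_MODELS.join(", ")}.`
      );
    }
    this.streamingModel = format as LlamaStreamingModel;
  }
}
```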
```
  expect(chunks.length).toBeGreaterThan(1);
});

test.skip("test multi-mesage streaming call", async () => {
```
FYI, in case you were unaware, we don't run int tests in CI, so no need to add `skip`.
Thx.
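For context, a rough sketch of what such a multi-message streaming test might look like in full; the model path, message construction, and import paths are assumptions rather than the PR's actual test body:

```typescript
import { test, expect } from "@jest/globals";
import { LlamaCpp } from "langchain/llms/llama_cpp";
import { SystemMessage, HumanMessage } from "langchain/schema";

// Assumed: points at a local GGUF model file on the test machine.
const llamaPath = "/path/to/model.gguf";

test("test multi-message streaming call", async () => {
  const model = new LlamaCpp({ modelPath: llamaPath, temperature: 0.7 });

  const stream = await model.stream(
    [
      new SystemMessage("You are a helpful assistant."),
      new HumanMessage("Tell me a short story about a happy llama."),
    ],
    { streamingModel: "llama2" }
  );

  const chunks: string[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }

  // Streaming should produce more than one chunk.
  expect(chunks.length).toBeGreaterThan(1);
});
```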
Overall looks good! I have a few comments, please re-request my review once resolved!
Resubmitted based off main under #3588.
This adds the `streamingModel` option to the streaming methods' properties; it can be set to `llama2`, `chatML`, `falcon`, or `general`. This then turns messages passed to the streaming model into the appropriate format for that model. If no option is specified, it defaults to `llama2`.
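To illustrate what "turns messages into the appropriate format" means in practice, here is a minimal sketch of per-format prompt wrapping; the helper name and the exact template strings are assumptions and are not taken from the PR's implementation:

```typescript
type StreamingModelFormat = "llama2" | "chatML" | "falcon" | "general";

interface ChatTurn {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: wraps one chat turn in the prompt template
// commonly associated with each backend model family.
function formatTurn(turn: ChatTurn, format: StreamingModelFormat): string {
  switch (format) {
    case "llama2":
      // Llama2-style instruction wrapping for user turns.
      return turn.role === "user" ? `[INST] ${turn.content} [/INST]` : turn.content;
    case "chatML":
      // ChatML-style delimited turns.
      return `<|im_start|>${turn.role}\n${turn.content}<|im_end|>\n`;
    case "falcon":
      // Falcon-style "User:" / "Assistant:" turns.
      return `${turn.role === "user" ? "User" : "Assistant"}: ${turn.content}\n`;
    case "general":
      // Generic fallback using "### Human" / "### Assistant" markers.
      return `### ${turn.role === "user" ? "Human" : "Assistant"}\n${turn.content}\n`;
  }
}

// Example: format a two-turn conversation for a ChatML model.
const prompt = [
  { role: "system" as const, content: "You are a helpful assistant." },
  { role: "user" as const, content: "Hello!" },
]
  .map((turn) => formatTurn(turn, "chatML"))
  .join("");
```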