Llama stream multiple models supported #3505

Closed

Conversation

nigel-daniels
Contributor

This adds a `streamingModel` option to the streaming method's call options; it can be set to `llama2`, `chatML`, `falcon`, or `general`. Messages passed to the streaming model are then converted into the appropriate prompt format for that model. If no option is specified, it defaults to `llama2`.
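To illustrate the intent, usage would look roughly like this (a minimal sketch only; the class name, import paths, and exact call shape are assumptions for illustration, not taken verbatim from this PR):

```typescript
// Illustrative sketch: class name, import paths, and option placement are
// assumptions, not the merged implementation.
import { ChatLlamaCpp } from "langchain/chat_models/llama_cpp";
import { SystemMessage, HumanMessage } from "langchain/schema";

const model = new ChatLlamaCpp({ modelPath: "/path/to/your/model.gguf" });

// streamingModel selects how the messages are serialized for the backing
// model; when omitted it falls back to "llama2".
const stream = await model.stream(
  [
    new SystemMessage("You are a helpful assistant."),
    new HumanMessage("Tell me a short story about a happy llama."),
  ],
  { streamingModel: "chatML" }
);

for await (const chunk of stream) {
  process.stdout.write(String(chunk.content));
}
```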

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Dec 3, 2023

@jacoblee93
Collaborator

This is fantastic, but I think it's causing some merge conflicts with some of your earlier changes. Can you rebase off main?

@jacoblee93 jacoblee93 self-assigned this Dec 4, 2023
@jacoblee93 jacoblee93 added the close PRs that need one or two touch-ups to be ready label Dec 4, 2023
Comment on lines +64 to +67
Or you can provide multiple messages. N.B. The default is for messages to be submitted in Llama2 format; if you are using a different backend model with `node-llama-cpp`, you can specify the model format to use with the `streamingModel` option. The supported formats are:
- `llama2` - A Llama2 model, this is the default.
- `chatML` - A ChatML model
- `falcon` - A Falcon model
Member

Why are the indents different for each?

@@ -29,8 +30,14 @@ export interface LlamaCppInputs
export interface LlamaCppCallOptions extends BaseLanguageModelCallOptions {
  /** The maximum number of tokens the response should contain. */
  maxTokens?: number;
  /** A function called when matching the provided token array */
  onToken?: (tokens: number[]) => void;
Member

This is not going to be backwards compatible. Instead we should mark it as deprecated.

Contributor Author

The `onToken` option was included in error and was never actually applied, as the eventual implementation used a raw call. I can mark it as deprecated and unused if preferred.
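For example, the deprecation could be annotated in the call options interface roughly like this (a sketch of the suggestion, with an assumed import path; not the merged code):

```typescript
import { BaseLanguageModelCallOptions } from "langchain/base_language";

export interface LlamaCppCallOptions extends BaseLanguageModelCallOptions {
  /** The maximum number of tokens the response should contain. */
  maxTokens?: number;
  /**
   * A function called when matching the provided token array.
   * @deprecated No longer used; the streaming implementation calls the model directly.
   */
  onToken?: (tokens: number[]) => void;
}
```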

   * 'falcon' - A Falcon model
   * 'general' - Any other model, uses "### Human\n", "### Assistant\n" format
   */
  streamingModel?: string;
Member

Use a union type or a check in the constructor that verifies the input is one of these?

Contributor Author

Sounds like a good idea; I'll move this up to the constructor.
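A minimal sketch of what that might look like (the type name and guard function are illustrative assumptions, not the merged implementation):

```typescript
// The accepted formats as a union type, so invalid values fail at compile time.
export type StreamingModelFormat = "llama2" | "chatML" | "falcon" | "general";

const SUPPORTED_FORMATS: StreamingModelFormat[] = [
  "llama2",
  "chatML",
  "falcon",
  "general",
];

// A runtime guard the constructor could call before storing the value.
export function assertStreamingModel(
  value: string
): asserts value is StreamingModelFormat {
  if (!SUPPORTED_FORMATS.includes(value as StreamingModelFormat)) {
    throw new Error(
      `Unsupported streamingModel "${value}"; expected one of: ${SUPPORTED_FORMATS.join(", ")}`
    );
  }
}
```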

  expect(chunks.length).toBeGreaterThan(1);
});

test.skip("test multi-message streaming call", async () => {
Member

FYI, in case you were unaware, we don't run integration tests in CI, so there's no need to add `skip`.

Contributor Author

Thx.

Member

@bracesproul bracesproul left a comment

Overall this looks good! I have a few comments; please re-request my review once they're resolved.

@nigel-daniels
Contributor Author

Resubmitted based off main under #3588.

Labels
- `auto:improvement` - Medium size change to existing code to handle new use-cases
- `close` - PRs that need one or two touch-ups to be ready
- `size:L` - This PR changes 100-499 lines, ignoring generated files