Separate fast and smart llm providers #813
Conversation
Hey @kesamet, sorry it took me so long to reply here. First of all, this is super valuable!! We've been getting a lot of requests to use different LLMs for different actions. A bit of feedback:
This approach is probably in the right direction and would be optimal. Lmk if you're up for it, or I can take it from here!
Hey @assafelovic, thanks for the feedback and sorry for the late reply. Taking inspiration from AWS Bedrock, where the model name is something like "anthropic.claude-3-sonnet-20240229-v1:0", I think it is best to combine the LLM_PROVIDER and LLM_MODEL env vars into a single name of the form "<llm_provider>:<llm_model>", joined with a colon. I call them "FAST_LLM_NAME" and "SMART_LLM_NAME", e.g.
What do you think?
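A minimal sketch of how such a combined name might be split back into its provider and model parts. The env values and the `parse_llm_name` helper are illustrative assumptions, not code from the PR itself:

```python
import os

# Hypothetical example values for the proposed combined env vars;
# the provider/model names here are placeholders, not from the PR.
os.environ.setdefault("FAST_LLM_NAME", "openai:gpt-3.5-turbo")
os.environ.setdefault("SMART_LLM_NAME", "anthropic:claude-3-sonnet-20240229-v1:0")

def parse_llm_name(env_var: str) -> tuple[str, str]:
    """Split a combined '<llm_provider>:<llm_model>' name into its parts.

    Only the first colon separates provider from model, so model IDs that
    themselves contain colons (e.g. Bedrock-style version suffixes like
    '...-v1:0') survive intact.
    """
    name = os.environ[env_var]
    provider, _, model = name.partition(":")
    return provider, model

print(parse_llm_name("FAST_LLM_NAME"))   # ('openai', 'gpt-3.5-turbo')
print(parse_llm_name("SMART_LLM_NAME"))  # ('anthropic', 'claude-3-sonnet-20240229-v1:0')
```

Splitting on the first colon only is the key design choice: it keeps the provider prefix unambiguous while leaving the model identifier free to contain colons of its own.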
This actually sounds great! Will require some refactoring, but love it
Please help review the PR.
Embeddings would be amazing @kesamet as well. Thank you for your contributions, I will dive into this PR in the coming days |
Hey @kesamet is this ready for review? |
@assafelovic yes |
But I noticed that
@kesamet it was used to summarize retrieved articles but since we're using embedding retrieval it's in no use right now. Still worth having the configs in case we find other use cases for it |
@kesamet everything looks great, may I ask for one last revision and remove the
@assafelovic Done!
Thank you @kesamet, this is huge and will open many opportunities! I'll send an update on this to the community in a few days.
Different LLM sources for "SMART" and "FAST".
For issues #702, #598