bug: model parameter `threads` doesn't work #114
Comments
@pafik13 Thanks for the detailed issue, it helped me a lot to investigate this problem :)
@giladgd can you point me to where you found the problem? If I can fix it locally, I can submit a PR.
🎉 This issue has been resolved in version 3.0.0-beta.2 🎉 The release is available on:
Your semantic-release bot 📦🚀
🎉 This issue has been resolved in version 3.0.0-beta.4 🎉 The release is available on:
Your semantic-release bot 📦🚀
🎉 This PR is included in version 3.0.0 🎉 The release is available on:
Your semantic-release bot 📦🚀
Issue description

It seems to me that the parameter `threads` doesn't work as expected.

Expected Behavior

If I have 24 CPUs and pass `threads: 24`, then all CPUs should be utilized. I tried calling the original `llama.cpp` with the argument `-t 24` and it works normally, as expected.

Actual Behavior
I pass the parameter `threads: 24` (or `1`) to the constructor and nothing changes: it always utilizes 4 CPUs above 80%, and sometimes uses 1-2 additional CPUs at 25-50% utilization.

Steps to reproduce

Pass different `threads` values to the model constructor and observe CPU utilization (for example, with `htop`).
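
A minimal sketch of the reproduction, assuming the pre-3.0 node-llama-cpp API where `threads` is an option of the model constructor (as described in this issue); the model path and prompt are taken from the command in "Additional Context" below and are placeholders for whatever model is being tested:

```javascript
// Sketch only: assumes the pre-3.0 node-llama-cpp API where `threads`
// is passed to the model constructor, as described in this issue.
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const model = new LlamaModel({
    modelPath: "./catai/models/phind-codellama-34b-q3_k_s", // placeholder model path
    threads: 24 // try different values (e.g. 1, 4, 24) and watch CPU usage in htop
});
const context = new LlamaContext({model});
const session = new LlamaChatSession({context});

// Keep the CPUs busy long enough to observe utilization.
const answer = await session.prompt(
    "Please, write JavaScript function to sort array"
);
console.log(answer);
```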
My Environment

`node-llama-cpp` version

Additional Context
./llama.cpp/main -m ./catai/models/phind-codellama-34b-q3_k_s -p "Please, write JavaScript function to sort array" -ins -t 24
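
As an additional data point when comparing against the `llama.cpp` run above, the number of threads the Node process actually spawns can be sampled while a prompt is being evaluated. This is a Linux-only sketch (assuming the same environment in which `htop` was used):

```javascript
// Linux-only sketch: sample the thread count of the current Node process,
// e.g. while session.prompt() from the snippet above is running.
import {readFileSync} from "fs";

function currentThreadCount() {
    const status = readFileSync("/proc/self/status", "utf8");
    const match = status.match(/^Threads:\s*(\d+)/m);
    return match ? Number(match[1]) : undefined;
}

const timer = setInterval(() => {
    console.log(`threads in use: ${currentThreadCount()}`);
}, 1000);

// Call clearInterval(timer) once the prompt has finished.
```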
Relevant Features Used
Are you willing to resolve this issue by submitting a Pull Request?
Yes, I have the time, but I don't know how to start. I would need guidance.