Models can not be scaled up to 2 copies #151
Comments
Hi @haiminh2001, here is the commit with more details about the motivation. I'm not sure whether you have already seen it; if not, it may help explain the behaviour.
Hi @spolti, I have read the commit before. In short, my question is: why does using a model continuously not trigger the second copy, while using the model once, waiting 7 minutes without any usage, and then using it again does? That does not make sense to me.
And the commit says:
Context
I am learning how the auto scaling of the model mesh works. I found this piece of docs:
It is vague enough that I had to dive into the source code, where I found this code:
As far as I understand the logic, if a model was recently used and a prior use of it falls within the interval from 40 minutes to 7 minutes before the time of the check, the model should be scaled to 2 copies.
Current behaviour
If a model is used consistently, earlierUseIteration and lastUsedIteration are updated continuously to the last check time. That logic is in these lines of code:
i1inRange and i2inRange will never be true, because i1 and i2 are always updated to the most recent point in time and consequently exceed the upper bound of the range.
Therefore a model has to be used once and then go 7 minutes without receiving any requests before it is scaled to 2 copies (I have tested this behaviour).
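To make the behaviour above concrete, here is a minimal Python sketch of the bookkeeping as I understand it from this issue. It is a toy model, not the ModelMesh source: the class `UsageTracker`, its fields, and the minute-based clock are all assumptions made for illustration. Only the 7 and 40 minute bounds come from the description above.

```python
from typing import List, Optional

SECOND_COPY_MIN_AGE = 7   # minutes; lower bound quoted in this issue
SECOND_COPY_MAX_AGE = 40  # minutes; upper bound quoted in this issue


class UsageTracker:
    """Hypothetical stand-in for the per-model usage fields; not ModelMesh code."""

    def __init__(self) -> None:
        self.last_used: Optional[int] = None
        self.earlier_use: Optional[int] = None

    def record_use(self, now: int) -> None:
        # Every use shifts the previous timestamp into earlier_use,
        # so under steady traffic both fields stay close to "now".
        self.earlier_use = self.last_used
        self.last_used = now

    def wants_second_copy(self, now: int) -> bool:
        # Scale only if the earlier use is between 7 and 40 minutes old.
        if self.earlier_use is None:
            return False
        age = now - self.earlier_use
        return SECOND_COPY_MIN_AGE <= age <= SECOND_COPY_MAX_AGE


def simulate(use_minutes: List[int]) -> bool:
    """Return True if any use in the sequence would trigger a second copy."""
    tracker = UsageTracker()
    for minute in use_minutes:
        tracker.record_use(minute)
        if tracker.wants_second_copy(minute):
            return True
    return False


continuous = simulate(list(range(0, 30)))  # one request every minute for 30 minutes
with_gap = simulate([0, 8])                # one request, 8 idle minutes, one more
print(continuous, with_gap)
```

In this simplified model the continuous workload never triggers a second copy, because the earlier use is always only one minute old, while the workload with an idle gap does.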
Having to wait for 7 minutes
Expectation
If a model is being used consistently for over 7 minutes, that model should be scaled to 2 copies.
Suggestion
Perhaps the time of the oldest request that is no more than 40 minutes old should be recorded instead of earlierUseIteration. The remaining logic stays the same.
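A rough Python sketch of this suggestion, under my own assumptions: `OldestUseTracker` and `first_trigger` are hypothetical names, time is counted in whole minutes, and the check is evaluated at the moment of a request so the "recently used" condition holds trivially. Keeping every timestamp in a deque is only for clarity; a real change would more likely keep a single field.

```python
from collections import deque
from typing import Deque, List, Optional

MIN_AGE = 7   # minutes; lower bound quoted in this issue
MAX_AGE = 40  # minutes; upper bound quoted in this issue


class OldestUseTracker:
    """Hypothetical tracker that remembers the oldest use within MAX_AGE."""

    def __init__(self) -> None:
        self.uses: Deque[int] = deque()

    def _prune(self, now: int) -> None:
        # Forget uses that are more than 40 minutes old.
        while self.uses and now - self.uses[0] > MAX_AGE:
            self.uses.popleft()

    def record_use(self, now: int) -> None:
        self.uses.append(now)
        self._prune(now)

    def wants_second_copy(self, now: int) -> bool:
        # Scale once the oldest remembered use is at least 7 minutes old.
        self._prune(now)
        return bool(self.uses) and now - self.uses[0] >= MIN_AGE


def first_trigger(use_minutes: List[int]) -> Optional[int]:
    """Return the first minute at which a second copy would be requested."""
    tracker = OldestUseTracker()
    for minute in use_minutes:
        tracker.record_use(minute)
        if tracker.wants_second_copy(minute):
            return minute
    return None


steady = first_trigger(list(range(0, 30)))  # one request every minute
gapped = first_trigger([0, 8])              # one request, idle gap, one more
stale = first_trigger([0, 50])              # second request after more than 40 minutes
print(steady, gapped, stale)
```

With this bookkeeping a model under steady traffic would qualify after 7 minutes of use, the idle-gap case still qualifies, and two uses more than 40 minutes apart do not.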