Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise idleTime mechanism #10008

Closed
vladsud opened this issue Apr 22, 2022 · 1 comment
Closed

Revise idleTime mechanism #10008

vladsud opened this issue Apr 22, 2022 · 1 comment
Assignees
Milestone

Comments

@vladsud
Copy link
Contributor

vladsud commented Apr 22, 2022

@NicholasCouri asked in #10007

I'm not sure if changing idleTime right now, together with MaxOps is the right thing to do - should we wait for more data ? Currently, idleTime is 15 secs.

I believe it makes total sense to reduce idleTime to 10s, maybe even 5s. And data suggests that 72% of last summaries (files at rest, looking at last summary produced for such files) are done with reason = "idle" and P95 = 138 ops being summarized, so reducing this timer will reduce number of trailing ops for sure. It's possible that the need for that change will be reduced due to MaxOps change in that PR.

But the main reason why I think we should take it a bit slower is different - it's wasteful to summarize 2-3 ops frequently - it does not give us much but increases COGS and pressure on service. So I think we should think how to have much tighter idleTime when there are many unsummarized ops, and lax idleTime when we have few. I.e. it seems the best outcome is

  • Remove maxOps trigger (but keep the value in policy)
  • Defined idleTime = log(unsummarized ops / maxOps), or something very similar - i.e. having close to infinite value (well, not infinite, but high, let's say 30 seconds) when we have 1 unsummarized op, and zero when we reach maxOps unsummarized ops.

That way it becomes less function of idleness of the product, and more a function of rate of changes.

Useful query to look at last summary stats:

let lastSummary = union Office_Fluid_FluidRuntime_*
| where Data_eventName == "fluid:telemetry:Summarizer:Running:Summarize_generate"
| summarize max(Event_Time) by Data_docId;
let end = union Office_Fluid_FluidRuntime_*
| where Data_eventName == "fluid:telemetry:Summarizer:Running:Summarize_generate"
| where Event_Time < ago(3d)
| project Event_Time, Data_docId, reason = Data_summarizeReason, Data_opsSinceLastSummary;
lastSummary | join (end) on Data_docId, $left.max_Event_Time == $right.Event_Time
| summarize count(), avg(Data_opsSinceLastSummary), percentile(Data_opsSinceLastSummary, 95) by reason

@kian-thompson
Copy link
Contributor

kian-thompson commented May 25, 2022

Closed to track in AB#128

kian-thompson added a commit that referenced this issue Jun 16, 2022
AB#128
* Initial implementation

* Use const variables so expression is easier to read

* Add comments

* Add to config in unit tests

* Add comments

* API changes

* Fix unit tests

* Unit test new idle timer functionality

* Fix minIdleTime comment

* Attempt to fix SharedTree tests

* Using new min max idleTime in unit tests

* Fix lint

* Update BREAKING.md

* Calculate idleTime based on op weights

* Change default minIdleTime to 0

* Fix lint

* Improve doc comments

* Fix test coming from main integration

* Changes to API md
ghost pushed a commit that referenced this issue Aug 24, 2022
# [AB#762](https://dev.azure.com/fluidframework/internal/_workitems/edit/762)

For more information about how to contribute to this repo, visit [this page](https://github.com/microsoft/FluidFramework/blob/main/CONTRIBUTING.md).

## Description

Deprecated by [AB#128](https://dev.azure.com/fluidframework/internal/_workitems/edit/128) and [#10008](#10008).

## Does this introduce a breaking change?

Please move all usage to the new `minIdleTime` and `maxIdleTime` properties in `ISummaryConfigurationHeuristics`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

When branches are created from issues, their pull requests are automatically linked.

2 participants