Revise idleTime mechanism #10008

vladsud · 2022-04-22T15:38:25Z

I'm not sure if changing idleTime right now, together with MaxOps is the right thing to do - should we wait for more data ? Currently, idleTime is 15 secs.

I believe it makes total sense to reduce idleTime to 10s, maybe even 5s. And data suggests that 72% of last summaries (files at rest, looking at last summary produced for such files) are done with reason = "idle" and P95 = 138 ops being summarized, so reducing this timer will reduce number of trailing ops for sure. It's possible that the need for that change will be reduced due to MaxOps change in that PR.

But the main reason why I think we should take it a bit slower is different - it's wasteful to summarize 2-3 ops frequently - it does not give us much but increases COGS and pressure on service. So I think we should think how to have much tighter idleTime when there are many unsummarized ops, and lax idleTime when we have few. I.e. it seems the best outcome is

Remove maxOps trigger (but keep the value in policy)
Defined idleTime = log(unsummarized ops / maxOps), or something very similar - i.e. having close to infinite value (well, not infinite, but high, let's say 30 seconds) when we have 1 unsummarized op, and zero when we reach maxOps unsummarized ops.

That way it becomes less function of idleness of the product, and more a function of rate of changes.

Useful query to look at last summary stats:

let lastSummary = union Office_Fluid_FluidRuntime_*
| where Data_eventName == "fluid:telemetry:Summarizer:Running:Summarize_generate"
| summarize max(Event_Time) by Data_docId;
let end = union Office_Fluid_FluidRuntime_*
| where Data_eventName == "fluid:telemetry:Summarizer:Running:Summarize_generate"
| where Event_Time < ago(3d)
| project Event_Time, Data_docId, reason = Data_summarizeReason, Data_opsSinceLastSummary;
lastSummary | join (end) on Data_docId, $left.max_Event_Time == $right.Event_Time
| summarize count(), avg(Data_opsSinceLastSummary), percentile(Data_opsSinceLastSummary, 95) by reason

kian-thompson · 2022-05-25T00:04:49Z

Closed to track in AB#128

AB#128 * Initial implementation * Use const variables so expression is easier to read * Add comments * Add to config in unit tests * Add comments * API changes * Fix unit tests * Unit test new idle timer functionality * Fix minIdleTime comment * Attempt to fix SharedTree tests * Using new min max idleTime in unit tests * Fix lint * Update BREAKING.md * Calculate idleTime based on op weights * Change default minIdleTime to 0 * Fix lint * Improve doc comments * Fix test coming from main integration * Changes to API md

# [AB#762](https://dev.azure.com/fluidframework/internal/_workitems/edit/762) For more information about how to contribute to this repo, visit [this page](https://github.com/microsoft/FluidFramework/blob/main/CONTRIBUTING.md). ## Description Deprecated by [AB#128](https://dev.azure.com/fluidframework/internal/_workitems/edit/128) and [#10008](#10008). ## Does this introduce a breaking change? Please move all usage to the new `minIdleTime` and `maxIdleTime` properties in `ISummaryConfigurationHeuristics`.

vladsud added this to the May 2022 milestone Apr 22, 2022

vladsud assigned kian-thompson Apr 22, 2022

vladsud mentioned this issue Apr 22, 2022

There are summaries that summarize too many ops, indicating we are not summarizing often enough #10007

Merged

kian-thompson mentioned this issue May 5, 2022

#10008 Revise idleTime mechanism #10179

Merged

kian-thompson closed this as completed May 25, 2022

kian-thompson mentioned this issue Aug 24, 2022

Remove ISummaryConfigurationHeuristics.idleTime #11644

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revise idleTime mechanism #10008

Revise idleTime mechanism #10008

vladsud commented Apr 22, 2022

kian-thompson commented May 25, 2022 •

edited by azure-boards bot

Loading

Revise idleTime mechanism #10008

Revise idleTime mechanism #10008

Comments

vladsud commented Apr 22, 2022

kian-thompson commented May 25, 2022 • edited by azure-boards bot Loading

kian-thompson commented May 25, 2022 •

edited by azure-boards bot

Loading