-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix: native histogram: exemplars index out of range #1608
bugfix: native histogram: exemplars index out of range #1608
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but @fatsheep9146 might have an opinion here, too.
Also, leaving it to the maintainers @ArthurSens @kakkoyun @bwplotka to make the final call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Does it needs a minor 1.20 release with this? Or we are ok with 1.21 in a few months or so? |
I'd love to get a minor release so we can do clean dependency update in Mimir. |
I'll keep this open for another day or so just in case Bartek or Kemal want to give another review. I can create another release, but then we need to point this PR to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this bug!
This fix is LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some readability suggestion, some question likely due to my lack of context for the details. Plus please rebase on release-1.20
for this to be in minor patch. Otherwise LGTM!
prometheus/histogram.go
Outdated
@@ -1764,23 +1776,22 @@ func (n *nativeExemplars) addExemplar(e *dto.Exemplar) { | |||
if nIdx > 0 { | |||
diff := math.Abs(elog - math.Log(n.exemplars[nIdx-1].GetValue())) | |||
if diff < md { | |||
// The closest exemplar pair is this: |e.Value - n.exemplars[nIdx-1].Value| is minimal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is off here. Do you mean...
// The closest exemplar pair is this: |e.Value - n.exemplars[nIdx-1].Value| is minimal. | |
// The smaller exemplar is closer (n.exemplars[nIdx-1].Value), replace that one. |
prometheus/histogram.go
Outdated
if n.exemplars[nIdx].Timestamp.AsTime().Before(e.Timestamp.AsTime()) { | ||
mdIdx = nIdx | ||
} | ||
// The closest exemplar pair is this: |n.exemplars[nIdx].Value - e.Value| is minimal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
if n.exemplars[nIdx-1].Timestamp.AsTime().Before(e.Timestamp.AsTime()) { | ||
mdIdx = nIdx - 1 | ||
} | ||
rIdx = nIdx - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we not forgetting to choose larger exemplar rIdx = nIdx
for diff > md
case? Why we have to recalculate another diff below?
@@ -1764,23 +1776,22 @@ func (n *nativeExemplars) addExemplar(e *dto.Exemplar) { | |||
if nIdx > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did is odd, why we need this, we just set it to len if it's -1, plus we know len is > 0 because we are in the logic that has to remove something. Is it because nIdx
can be zero which means we have to put exemplar upfront?
Then let's fix comment on line 1766, it's wrong:
// Here, we have the following relationships:
// n.exemplars[nIdx-1].Value < e.Value <= n.exemplars[nIdx].Value
if n.exemplars[nIdx-1].Timestamp.AsTime().Before(e.Timestamp.AsTime()) { | ||
mdIdx = nIdx - 1 | ||
} | ||
rIdx = nIdx - 1 | ||
} | ||
} | ||
if nIdx < len(n.exemplars) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is then guarding the case when nIdx is the last element? Then again, comment 1766 is kind of wrong, or misleading.
@bwplotka I've updated the comments in the code with more explanations and my understanding of the code :) |
I forced this PR to be against |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! (just rebase)
Signed-off-by: György Krajcsovits <[email protected]>
Signed-off-by: György Krajcsovits <[email protected]>
mdIdx was redundant when len(exemplars)>1, so got rid of it, rIdx is enough. Don't compare timestamp of incoming exemplar to timestamp of minimal distance exemplar. Most of the time the incoming exemplar will be newer. And if not, the previous code just replaced an exemplar one index after the minimal distance exemplar. Which had an index out of range bug, plus is essentially random. Signed-off-by: György Krajcsovits <[email protected]>
Signed-off-by: György Krajcsovits <[email protected]>
Signed-off-by: György Krajcsovits <[email protected]>
d82721b
to
d6b8c89
Compare
Rebased and PR re-targetted |
Fixes: #1607
Fixes: #1605
fix: native histogram: Simplify and fix addExemplar
mdIdx was redundant when len(exemplars)>1, so got rid of it, rIdx
is enough.
Don't compare timestamp of incoming exemplar to timestamp of
minimal distance exemplar. Most of the time the incoming exemplar
will be newer. And if not, the previous code just replaced an
exemplar one index after the minimal distance exemplar. Which had
an index out of range bug, plus is essentially random.
Contains an unoptimized fix for #1605 , see discussion / fix in #1609