ddl notifier: use pagination for SELECT to reduce memory usage #58376
Conversation
Signed-off-by: lance6716 <[email protected]>
Hi @lance6716. Thanks for your PR. PRs from untrusted users cannot be marked as trusted. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/check-issue-triage-complete
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (1)
pkg/ddl/notifier/subscribe.go:171
- The variable name ProcessEventsBatchSize is clear and consistent with the changes made throughout the code.
var ProcessEventsBatchSize = 1024
Signed-off-by: lance6716 <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #58376 +/- ##
================================================
+ Coverage 73.1849% 75.2997% +2.1148%
================================================
Files 1681 1727 +46
Lines 463027 480149 +17122
================================================
+ Hits 338866 361551 +22685
+ Misses 103358 96359 -6999
- Partials 20803 22239 +1436
Flags with carried forward coverage won't be shown.
/retest
@lance6716: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.
In response to this: /retest
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
changes := make([]*SchemaChange, ProcessEventsBatchSize)
for {
	count, err2 := result.Read(changes)
Why not let Read return []*SchemaChange directly? Returning count and using changes[:count] is just an unnecessary detour.
It's like the io.Reader calling convention, where the caller passes in a slice so it can reuse the buffer across calls. Because we are in the 1M-tables project, I'm afraid that returning 1M []*SchemaChange elements (each with internal pointers such as TableInfo) will add pressure on the GC.
Though it's less common, I think this package / function is the only caller of ListResult.Read. Other developers will not modify this package and will not be confused by it.
However, I notice that I call json.Unmarshal on a non-empty object, and I'm not sure whether the leftover fields will be overwritten or cause problems. I'll check it tomorrow.
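For reference, encoding/json leaves untouched any field that is absent from the new JSON document, so stale values from a previous decode can survive when the target object is reused. A minimal, self-contained illustration (the event type below is a hypothetical stand-in, not the real jsonSchemaChangeEvent):

package main

import (
	"encoding/json"
	"fmt"
)

// event stands in for a reused decode target such as jsonSchemaChangeEvent.
type event struct {
	Type  string `json:"type"`
	Table string `json:"table,omitempty"`
}

func main() {
	var e event
	_ = json.Unmarshal([]byte(`{"type":"create","table":"t1"}`), &e)
	// The second document omits "table", so the stale "t1" value survives.
	_ = json.Unmarshal([]byte(`{"type":"drop"}`), &e)
	fmt.Printf("%+v\n", e) // {Type:drop Table:t1}
	// Resetting the target before each decode avoids the leftover.
	e = event{}
	_ = json.Unmarshal([]byte(`{"type":"drop"}`), &e)
	fmt.Printf("%+v\n", e) // {Type:drop Table:}
}

So if buffer entries are reused across Read calls, the decode target likely needs to be reset (or every field guaranteed to be overwritten) before each json.Unmarshal.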
What was your test result?
pass-in slices so caller can reuse the previous results
Yes, I mean the same thing; it's just that the code can be written as:
func f(changes []*SchemaChange) []*SchemaChange {
ret := changes[:0]
ret = append(ret, xxx)
return ret
}
Anyway, this is just some personal taste, so you can choose to accept it or not.
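A minimal, self-contained sketch of the changes[:0] reuse idea behind this suggestion, with a stand-in element type and an in-memory source instead of the real store (all names here are hypothetical):

package main

import "fmt"

// change is a stand-in for SchemaChange.
type change struct{ id int }

// refill decodes the next batch into the backing array of buf and returns
// the batch directly; the caller passes the previous return value back in.
func refill(buf []*change, src []int, offset *int) []*change {
	ret := buf[:0]
	for len(ret) < cap(buf) && *offset < len(src) {
		ret = append(ret, &change{id: src[*offset]})
		*offset++
	}
	return ret
}

func main() {
	src := []int{1, 2, 3, 4, 5}
	offset := 0
	batch := make([]*change, 0, 2) // the capacity acts as the batch size
	for {
		batch = refill(batch, src, &offset)
		if len(batch) == 0 {
			break
		}
		fmt.Println("batch of", len(batch))
	}
}

Note that with this shape the batch size is carried implicitly by the capacity of the passed-in slice, which is the ambiguity raised in the reply below.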
What was your test result?
I didn't test it yet 😂 I hope other colleagues can help check it. Theoretically there's no big query now.
func f(changes []*SchemaChange) []*SchemaChange {
In this new signature, the batch size of one Read is not obvious when the input is nil, and it looks like an append where the slice keeps growing and the batch size is not limited. Your suggestion is like EncodeBytes, but here we want to control the batch size as well as reuse the buffer.
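To make the contrast concrete, here is a rough, self-contained sketch of the io.Reader-style shape being defended: the length of the caller-supplied slice bounds each batch, and both the slice and the pointed-to structs are reused across calls (all types below are simplified stand-ins, not the real ListResult or SchemaChange):

package main

import "fmt"

// schemaChange is a simplified stand-in for SchemaChange.
type schemaChange struct{ id int }

// listResult is a hypothetical stand-in for the store's paginated result.
type listResult struct {
	total int // pretend the SELECT matches this many rows
	pos   int
}

// Read fills up to len(dst) entries and reports how many it wrote, so the
// batch size is explicit and the caller's buffer is reused, not reallocated.
func (r *listResult) Read(dst []*schemaChange) (int, error) {
	n := 0
	for n < len(dst) && r.pos < r.total {
		if dst[n] == nil {
			dst[n] = new(schemaChange) // allocate only on first use
		}
		dst[n].id = r.pos
		r.pos++
		n++
	}
	return n, nil
}

func main() {
	r := &listResult{total: 5}
	buf := make([]*schemaChange, 2) // batch size 2, reused every iteration
	for {
		n, err := r.Read(buf)
		if err != nil || n == 0 {
			break
		}
		fmt.Println("processed", n, "changes")
	}
}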
Signed-off-by: lance6716 <[email protected]>
Signed-off-by: lance6716 <[email protected]>
Signed-off-by: lance6716 <[email protected]>
Signed-off-by: lance6716 <[email protected]>
Better to verify that this can fix the memory consumption issue before merging it.
Thanks!
/hold
if changes[i] == nil {
	changes[i] = new(SchemaChange)
}
if changes[i].event == nil {
	changes[i].event = new(SchemaChangeEvent)
}
if changes[i].event.inner == nil {
	changes[i].event.inner = new(jsonSchemaChangeEvent)
}
How about:
// Add constructor function
func NewSchemaChange() *SchemaChange {
return &SchemaChange{
event: &SchemaChangeEvent{
inner: &jsonSchemaChangeEvent{},
},
}
}
if changes[i] == nil {
changes[i] = NewSchemaChange()
}
Or do you mean there is a case where changes[i] is not nil, but the event or the inner is nil?
Just to improve robustness; I don't want to assume how the caller constructs the SchemaChange values.
I'll test the OOM problem soon.
[LGTM Timeline notifier] Timeline:
@lance6716: The following test failed; say /retest to rerun all failed tests.
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: D3Hunter, Rustin170506, tiancaiamao. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/unhold
Due to test resource limits, I can't create a test cluster before the coming version release. So I plan to test the effect later, and we shouldn't cherry-pick to v8.5.1 before we get the test result.
/retest
@lance6716: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.
In response to this: /retest
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
In response to a cherrypick label: new pull request created to the target branch.
What problem does this PR solve?
Issue Number: close #58368
Problem Summary:
What changed and how does it work?
As the title says: refine Store.List to use pagination for the SELECT, so results are read in batches instead of all at once.
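For readers unfamiliar with the approach, here is a generic sketch of the pagination idea only; the table name, column names, and the handle callback are illustrative assumptions, not the actual Store.List implementation:

import (
	"context"
	"database/sql"
)

// listInBatches pages through the result set with keyset pagination so that
// only one batch of rows is resident in memory at a time, instead of a
// single SELECT that materializes every row.
// NOTE: "ddl_notifier" and its columns are placeholders for illustration.
func listInBatches(ctx context.Context, db *sql.DB, batch int,
	handle func(id int64, rawEvent []byte) error) error {
	lastID := int64(0)
	for {
		rows, err := db.QueryContext(ctx,
			"SELECT id, event FROM ddl_notifier WHERE id > ? ORDER BY id LIMIT ?",
			lastID, batch)
		if err != nil {
			return err
		}
		n := 0
		for rows.Next() {
			var id int64
			var raw []byte
			if err := rows.Scan(&id, &raw); err != nil {
				rows.Close()
				return err
			}
			if err := handle(id, raw); err != nil {
				rows.Close()
				return err
			}
			lastID = id
			n++
		}
		if err := rows.Err(); err != nil {
			rows.Close()
			return err
		}
		rows.Close()
		if n < batch {
			return nil // last page reached
		}
	}
}

The point of the pattern is that peak memory stays proportional to one batch rather than to the full result set.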
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.