Debug dump of MemoryManager #6934
Comments
I think this is a straightforward, well-specified coding exercise, so I am marking it as a good first issue.
Hello, I noticed this issue and have submitted a pull request #6934 that I believe addresses it. Please let me know if there is anything else I can do to help. Thank you!
Thank you @tanmay-veer -- I'll take a look today
Is the goal of this ticket to improve the OOM messages, or to provide an API to get the top memory consumers (in the absence of an error)? We have updated the OOM message to include the top 5 consumers (see PRs related to this issue). The code was implemented in such a way that it could also be used for data dumps through a public API. Is any other action item needed to close this ticket?
Is your feature request related to a problem or challenge?
@JayjeetAtGithub and I are investigating improving memory performance for certain queries
When we hit the memory limit, we see different error messages. For example sometimes we see
And sometimes we see
We would like to know what operators are consuming the memory
I theorize that we see different operators in the logs because, as we near the memory limit, any operator might be the "canary": the one that happened to ask for memory next, rather than the one that was actually using it all.
Describe the solution you'd like
I would like some way to know how much memory each operator is consuming in a particular pool and, prior to returning an allocation error, to emit a debug log with this information.
The memory pool code is here: https://github.com/apache/arrow-datafusion/blob/main/datafusion/execution/src/memory_pool/pool.rs
I was imagining implementing Display for the pools, which would produce a report with the reservations sorted in descending order.
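As a rough illustration of the idea (not the actual DataFusion code), here is a minimal sketch of a Display implementation that lists reservations largest first. TrackedPool, its grow method, the report layout, and the operator names used with it are all hypothetical and stand in for whatever the real pool types track.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical pool that records bytes reserved per named consumer.
/// This is NOT DataFusion's MemoryPool; it only illustrates the report.
#[derive(Debug, Default)]
pub struct TrackedPool {
    /// Bytes currently reserved, keyed by consumer (operator) name.
    reservations: HashMap<String, usize>,
}

impl TrackedPool {
    /// Record `bytes` more for `consumer`.
    pub fn grow(&mut self, consumer: &str, bytes: usize) {
        *self.reservations.entry(consumer.to_string()).or_insert(0) += bytes;
    }

    /// Total bytes reserved across all consumers.
    pub fn reserved(&self) -> usize {
        self.reservations.values().sum()
    }
}

impl fmt::Display for TrackedPool {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut entries: Vec<(&String, &usize)> = self.reservations.iter().collect();
        // Largest reservation first; ties are broken by name so output is stable.
        entries.sort_by(|a, b| b.1.cmp(a.1).then_with(|| a.0.cmp(b.0)));
        writeln!(f, "TrackedPool: {} bytes reserved", self.reserved())?;
        for (name, bytes) in entries {
            writeln!(f, "  {}: {}", name, bytes)?;
        }
        Ok(())
    }
}
```

With this sketch, growing a made-up "Sort" consumer by 100 bytes and a made-up "HashJoin" consumer by 15 bytes and then formatting the pool with `{}` lists Sort before HashJoin, regardless of the order in which they reserved memory.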
Then prior to returning a resources exhausted error:
https://github.com/apache/arrow-datafusion/blob/50135e8c039b82d32b57db12ca06d789e9cbea4c/datafusion/execution/src/memory_pool/pool.rs#L87
We would add a log message like
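To make the "dump before erroring" step concrete, the following is a minimal sketch under stated assumptions: LimitedPool, try_grow, top_consumers, and the error text are hypothetical and do not mirror the real pool.rs, and eprintln! stands in for a debug-level log call. It also illustrates the "canary" effect: the consumer named in the error is simply the one that asked last.

```rust
use std::collections::HashMap;

/// Hypothetical fixed-size pool; names and error text are illustrative only.
pub struct LimitedPool {
    limit: usize,
    used: HashMap<String, usize>,
}

impl LimitedPool {
    pub fn new(limit: usize) -> Self {
        Self { limit, used: HashMap::new() }
    }

    /// Consumers sorted by reserved bytes, largest first (ties by name).
    pub fn top_consumers(&self) -> Vec<(String, usize)> {
        let mut v: Vec<(String, usize)> =
            self.used.iter().map(|(k, b)| (k.clone(), *b)).collect();
        v.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        v
    }

    /// Try to reserve `bytes` for `consumer`; on failure, dump the pool first.
    pub fn try_grow(&mut self, consumer: &str, bytes: usize) -> Result<(), String> {
        let total: usize = self.used.values().sum();
        if total.saturating_add(bytes) > self.limit {
            // Stand-in for a debug-level log call emitted before the error.
            eprintln!("pool state before error: {:?}", self.top_consumers());
            return Err(format!(
                "Resources exhausted: {} requested {} bytes, {} of {} already reserved",
                consumer, bytes, total, self.limit
            ));
        }
        *self.used.entry(consumer.to_string()).or_insert(0) += bytes;
        Ok(())
    }
}
```

If a made-up "Sort" consumer holds 90 of 100 bytes and a made-up "HashJoin" consumer then requests 20, the error mentions HashJoin, while the dump shows that Sort is the one holding the memory.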
Describe alternatives you've considered
No response
Additional context
No response