Surface arrangement memory size/capacity #18766
Conversation
This should be ready for review. For some arrangements, it reports an under-approximation of allocated memory because we do not treat the diff field specially, which misses allocations it might reference. This is the case for some of the reductions.
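To illustrate the under-approximation mentioned above, here is a minimal, hypothetical sketch (the function name and types are illustrative, not the PR's actual code): size accounting that sums only the key and value while ignoring whatever the diff references will under-count whenever the diff itself owns heap allocations.

```rust
// Hypothetical sketch: accounting that ignores the diff field.
// Real accounting would walk heap capacities; byte lengths stand in here.
fn reported_size(key: &str, val: &str, _diff: &[u8]) -> usize {
    // Only key and value are counted; the diff's buffer (and anything
    // it might reference) is missed, yielding an under-approximation.
    key.len() + val.len()
}
```

For a reduction whose diff carries, say, an 8-byte buffer, those 8 bytes simply never show up in the reported total.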
This PR is approaching a state where we can discuss remaining work. I think at least the following should be included in here:
This change looks complex enough that QA involvement might be warranted. I triggered nightly, which looks "ok" (same as on main currently): https://buildkite.com/materialize/nightlies/builds/2437
I marked inline the interesting spots from the coverage run; maybe some of them can be tested: https://buildkite.com/materialize/coverage/builds/116
    Arranged<G, TraceAgent<Tr>>: ArrangementSize,
{
    if must_consolidate {
        self.mz_consolidate::<Tr>(name)
This line, the actual consolidation, appears not to be covered by any test.
Interesting, I'll defer to @vmarcos who might have more context on why this is! It might be that it only triggers within ad-hoc selects with monotonic optimizations enabled, and this might not be the case within tests.
Yes, the line should only be executed if the feature flag enable_monotonic_oneshot_selects is on and we have a one-shot select where the optimization applies, e.g., a query with MAX/MIN or a top-k pattern. AFAIK, the tests are currently run with the feature flag turned off, so that we get to test our regular optimization path. The design doc #19250 has some suggestions on how to address this issue, but no solution is implemented yet. Perhaps it would make sense to run coverage with the feature flag on as a validation?
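For a coverage run with the flag on, something along these lines should exercise the monotonic one-shot path (a sketch, assuming the flag is settable via ALTER SYSTEM and that a table `t` with a numeric column exists):

```sql
-- Enable the feature flag so one-shot selects take the monotonic path.
ALTER SYSTEM SET enable_monotonic_oneshot_selects = true;

-- A one-shot MAX query of the kind the optimization could apply to.
SELECT MAX(a) FROM t;
```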
        self.arrange_named(name).log_arrangement_size()
    }
}

fn mz_arrange_core<P, Tr>(&self, pact: P, name: &str) -> Arranged<G, TraceAgent<Tr>>
This function also appears to be uncovered.
Interesting. At the moment it's only used by the introspection code; could it be that introspection is turned off for coverage runs?
How do we turn introspection on/off? Coverage is run the same way as normal tests, so this would probably also mean we are not using it in CI tests.
It should be switched on in CI. There are both testdrive and sqllogictests that query introspection sources.
Adapter changes LGTM (I did not review the .jsx files).
Docs LGTM.
Based on the linked issue, it seems these changes will enable users to understand their most expensive or memory-intensive indexes and materialized views. Is that right? If so, would it make sense to revise or complement this troubleshooting FAQ as part of this PR? Or should this rather be a follow-up docs issue?
I'd prefer to do this in a follow-up issue. I filed MaterializeInc/database-issues#5793 and MaterializeInc/database-issues#5794 to not forget about it!
This looks mostly good, but there are two things I'd like to ensure:
- That we have user docs for all new relation fields.
- That the retraction of arrangement size logging works correctly.
My other comments are nits and/or questions.
Also, I don't quite grok the changes to reduce.rs. Maybe someone more familiar with this code (@vmarcos?) could review those?
This is very exciting, and will definitely surface some very valuable information! I have a few comments regarding documentation, some general questions, and one point for discussion in the change to basic aggregates in reduce.rs.
I think I addressed all comments, so if you have time, please take another look. I also kicked off a nightly run and coverage. I had to change the logic that uses arrangements with reductions to determine when an arrangement trace is no longer shared, because it fails to work when the arrangement is created and dropped within a single log window. In that case, the arrangement does not reveal any information about itself because the events cancel each other out. We might want to think about whether there's a way we could still achieve the same behavior. For the time being, I switched to a map-based implementation.
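The map-based sharing tracking described above could look roughly like the following sketch (names and types are illustrative, not the PR's actual code): each arrangement operator maps to a share count, and when the count drops to zero the trace is known to be unshared and the entry is removed, which works even if the arrangement is created and dropped within a single log window.

```rust
use std::collections::HashMap;

/// Hypothetical sketch of map-based trace-sharing tracking.
struct SharingTracker {
    /// Operator id -> number of live shares of its arrangement trace.
    sharing: HashMap<usize, usize>,
}

impl SharingTracker {
    fn new() -> Self {
        Self { sharing: HashMap::new() }
    }

    /// Record one more share of the operator's trace.
    fn import(&mut self, op: usize) {
        *self.sharing.entry(op).or_insert(0) += 1;
    }

    /// Drop one share; returns true when the last share was dropped,
    /// at which point the entry is removed.
    fn drop_share(&mut self, op: usize) -> bool {
        if let Some(count) = self.sharing.get_mut(&op) {
            *count -= 1;
            if *count == 0 {
                self.sharing.remove(&op);
                return true;
            }
        }
        false
    }
}
```

Unlike an event-differencing scheme, the map retains the count between events, so a create followed by a drop in the same window still observes the transition to zero.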
The SQLsmith failures are unrelated. The feature benchmark regression is probably expected with this change?
LGTM, thanks!
> SELECT records, batches, size, capacity, allocations FROM mz_internal.mz_dataflow_arrangement_sizes WHERE name='ii_empty'
0 0 0 0 0

# Tests that arrangement sizes are approximate
Is this comment missing some words?
        self.state.sharing.remove(&op);
        logger.log(ComputeEvent::ArrangementHeapSizeOperatorDrop { operator: op });
    }
}
This works because compute_logger will always be Some in practice, but if we ever change the code to make it temporarily None, the sharing tracking might become inconsistent. How about we always update sharing and only gate the actual log call behind if let Some(logger)?
Yep, we only track the sharing information if the logger is Some. That's currently the only case where the information is needed. Do you think we should always track this information?
It would be more future-proof that way, but I'm fine with leaving it as is.
Out of scope for this PR, but I wonder if we could get rid of the if let Some(logger) pattern throughout the compute code and make the logger non-optional instead. I think we always have a compute logger, except perhaps during initialization.
Signed-off-by: Moritz Hoffmann <[email protected]>
Otherwise, we'll accumulate all historical state.
Add the columnation requirement to R.
* Actually handle delta values as such.
* Explain why `CloneRegion` is acceptable for `DataflowError`
* Remove `IntoKeyCollection` and replace it by `From`
* Some cleanup
Using it for all consolidates introduces a performance regression.
Verified locally that the …
This reverts commit 2760464.
Measure the size of arrangements in memory.
This PR adds support to measure the size of arrangements.
Motivation
This PR adds a known-desirable feature: MaterializeInc/database-issues#5740
Tips for reviewer
Checklist
$T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.