
Surface arrangement memory size/capacity #18766

Merged: 17 commits, May 31, 2023

Conversation

@antiguru (Member) commented Apr 13, 2023:

Measure the size of arrangements in memory.

This PR adds support to measure the size of arrangements:

  • total allocated capacity,
  • used capacity,
  • number of allocations.
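For illustration, the new relation can be queried like any other introspection source. The columns below are the ones exercised by the testdrive check later in this PR; the ordering and limit are arbitrary:

-- Sketch: show the arrangements with the largest used size.
SELECT name, records, batches, size, capacity, allocations
FROM mz_internal.mz_dataflow_arrangement_sizes
ORDER BY size DESC
LIMIT 5;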

Motivation

This PR adds a known-desirable feature: MaterializeInc/database-issues#5740

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • This PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@antiguru antiguru requested a review from umanwizard April 13, 2023 19:51
@antiguru antiguru marked this pull request as ready for review April 14, 2023 20:45
@antiguru antiguru requested review from a team April 14, 2023 20:45
@antiguru antiguru requested a review from a team as a code owner April 14, 2023 20:45
@antiguru antiguru requested a review from teskje April 14, 2023 20:46
@antiguru (Member Author):

This should be ready for review. For some arrangements, it reports an under-approximation of the allocated memory because we do not treat the diff field specially, which misses allocations it might reference. This is the case for some of the reductions.

@antiguru (Member Author) commented May 20, 2023:

This PR is approaching a state where we can discuss the remaining work. I think at least the following should be included here:

  • Currently, only a single _raw collection is exposed. We'd need to add convenience views on top of this to aggregate across workers and dataflows (a sketch of such a view follows this list).
  • The introspection source we expose doesn't have the count in the diff field, but materializes it automatically in the actual row. This is bad because it means we have to do more work than if we could simply maintain the count. However, this is not possible at the moment, because we're counting used size, raw capacity, and number of allocations at the same time. To move the values to the diff field, we'd need three separate sources.
  • We're printing warnings when the logging operator cannot be activated from dataflow logging. This is helpful for determining which arrangements still need to be logged, but doesn't make much sense for ad-hoc queries because the activator will most likely not be available yet/anymore when we process differential logs. I don't know of a smart approach that would let us keep the warning without producing bogus warnings for short-lived dataflows.
  • We're moving several operators and abstractions into mz-compute. This feels restrictive, and we might want to consider splitting the logging infrastructure out of this, together with the abstraction introduced here. This will become more relevant once we actually work on cluster unification (#17413).
  • We're undercounting the size of DataflowError because we only consider the immediate allocation, but not the transitive closure of reachable allocations. Maybe we could implement Columnation for it and get accurate accounting? It seems hard because it's a recursive data structure...
  • I have not looked into how many resources this feature requires. It adds a significant number of operators, activates these operators regularly, and maintains the introspection data. Each can be expensive.
    • We could use a rate-limiting activator if activations turn out to be problematic.
  • The name mz_arrangement_size_raw isn't great, because there is also mz_arrangement_sizes.... We need to come up with something better.
  • We should adjust the memory visualizer to print both number of records as well as memory requirements for operators.
  • There's a recurring test failure in cloudtest where a clusterd OOMs. See Buildkite for an example.
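A minimal sketch of the convenience view mentioned in the first bullet, assuming the raw relation exposes one row per operator and worker with size, capacity, and allocations columns (the column names here are assumptions, not the actual schema):

-- Hypothetical view aggregating the per-worker raw measurements per operator.
CREATE VIEW arrangement_sizes_per_operator AS
SELECT operator_id,
       sum(size) AS size,
       sum(capacity) AS capacity,
       sum(allocations) AS allocations
FROM mz_internal.mz_arrangement_size_raw
GROUP BY operator_id;

Aggregating per dataflow would additionally need the operator-to-dataflow mapping from the existing introspection relations.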

@def- (Contributor) left a comment:

This change looks complex enough that QA involvement might be warranted. I triggered nightly, which looks "ok" (same as on main currently): https://buildkite.com/materialize/nightlies/builds/2437

I marked the interesting spots from the coverage run inline; maybe some of them can be tested: https://buildkite.com/materialize/coverage/builds/116

Arranged<G, TraceAgent<Tr>>: ArrangementSize,
{
if must_consolidate {
self.mz_consolidate::<Tr>(name)
Contributor:

This line, the actual consolidation, appears not to be covered by any test.

Member Author:

Interesting, I'll defer to @vmarcos who might have more context on why this is! It might be that it only triggers within ad-hoc selects with monotonic optimizations enabled, and this might not be the case within tests.

Contributor:

Yes, the line should only be executed if the feature flag enable_monotonic_oneshot_selects is on and we have a one-shot select where the optimization applies, e.g., a query with MAX/MIN or a top-k pattern. AFAIK, the tests currently run with the feature flag turned off, so that we test our regular optimization path. The design doc #19250 has some suggestions on how to address this issue, but no solution is implemented yet. Perhaps it would make sense to run coverage with the feature flag on as a validation?
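For concreteness, a query shape that should hit this path once the flag is on might look like the following; the table and column names are placeholders, and enabling the flag via ALTER SYSTEM is an assumption about the mechanism:

-- Assumed mechanism for enabling the feature flag (privileged session):
ALTER SYSTEM SET enable_monotonic_oneshot_selects = true;

-- A one-shot MIN/MAX aggregation over a hypothetical table:
SELECT max(price) FROM orders;

-- A top-k pattern (top 3 values per key), also over a hypothetical table:
SELECT grp.key, lat.value
FROM (SELECT DISTINCT key FROM items) grp,
     LATERAL (SELECT value FROM items
              WHERE items.key = grp.key
              ORDER BY value DESC
              LIMIT 3) lat;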

self.arrange_named(name).log_arrangement_size()
}

fn mz_arrange_core<P, Tr>(&self, pact: P, name: &str) -> Arranged<G, TraceAgent<Tr>>
Contributor:

This function also appears uncovered

Member Author:

Interesting. At the moment it's only used by the introspection code; could it be that introspection is turned off for coverage runs?

Contributor:

How do we turn introspection on/off? Coverage is run the same way as the normal tests, so this would probably also mean we are not using it in CI tests.

Contributor:

It should be switched on in CI. There are both testdrive and sqllogictests that query introspection sources.
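For reference, such a check is just a query against one of the mz_internal relations together with its expected output, in the same style as the testdrive snippet quoted further down. A minimal sketch, assuming the environment has at least one logged arrangement:

> SELECT count(*) > 0 FROM mz_internal.mz_arrangement_sizes
true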

src/compute/src/logging/compute.rs (review thread, outdated)
src/compute/src/render/threshold.rs (review thread, outdated)
@antiguru antiguru force-pushed the arrangement_heap_size branch from 2cd2bc1 to a43c136 Compare May 22, 2023 20:46
@antiguru antiguru requested a review from a team as a code owner May 22, 2023 20:46
@antiguru antiguru force-pushed the arrangement_heap_size branch from 69872a0 to 0769c0e Compare May 23, 2023 13:25
@antiguru antiguru force-pushed the arrangement_heap_size branch from a4ce2fe to dddefbf Compare May 24, 2023 13:36
@antiguru antiguru changed the title Exploration of arrangement memory size/capacity Surface arrangement memory size/capacity May 24, 2023
@antiguru antiguru force-pushed the arrangement_heap_size branch 3 times, most recently from 2d26688 to bdc925b Compare May 25, 2023 00:42
@jkosh44 (Contributor) left a comment:

Adapter changes LGTM (I did not review the .jsx files).

@jseldess (Contributor) left a comment:

Docs LGTM.

Based on the linked issue, it seems these changes will enable users to understand their most expensive or memory-intensive indexes and materialized views. Is that right? If so, would it make sense to revise or complement this troubleshooting FAQ as part of this PR? Or should this rather be a follow-up docs issue?

@antiguru (Member Author):

> Based on the linked issue, it seems these changes will enable users to understand their most expensive or memory-intensive indexes and materialized views. Is that right? If so, would it make sense to revise or complement this troubleshooting FAQ as part of this PR? Or should this rather be a follow-up docs issue?

I'd prefer to do this in a follow-up issue. I filed MaterializeInc/database-issues#5793 and MaterializeInc/database-issues#5794 to not forget about it!

@antiguru antiguru force-pushed the arrangement_heap_size branch 2 times, most recently from 2926a14 to 0976a85 Compare May 26, 2023 01:56
@teskje (Contributor) left a comment:

This looks mostly good, but there are two things I'd like to ensure:

  1. That we have user docs for all new relation fields.
  2. That the retraction of arrangement size logging works correctly.

My other comments are nits and/or questions.

Also, I don't quite grok the changes to reduce.rs. Maybe someone more familiar with this code (@vmarcos?) could review those?

src/adapter/src/catalog/builtin.rs (review thread)
src/compute/src/extensions/mod.rs (review thread)
src/compute/src/logging/compute.rs (review thread)
src/compute/src/logging/compute.rs (review thread, outdated)
src/compute/src/logging/compute.rs (review thread, outdated)
src/compute/src/logging/compute.rs (review thread)
src/compute/src/extensions/operator.rs (review thread, outdated)
src/compute/src/extensions/operator.rs (review thread, outdated)
src/storage-client/src/types/errors.rs (review thread)
test/testdrive/introspection-sources.td (review thread)
@antiguru antiguru force-pushed the arrangement_heap_size branch 2 times, most recently from 619e44c to c6b9cc5 Compare May 29, 2023 03:03
@vmarcos (Contributor) left a comment:

This is very exciting, and will definitely surface some very valuable information! I have a few comments regarding documentation, some general questions, and one point for discussion in the change to basic aggregates in reduce.rs.

src/adapter/src/catalog/builtin.rs (review thread)
src/adapter/src/catalog/builtin.rs (review thread)
src/compute/src/extensions/arrange.rs (review thread)
src/compute/src/extensions/reduce.rs (review thread, outdated)
src/compute/src/logging/differential.rs (review thread, outdated)
src/compute/src/render/reduce.rs (review thread, outdated)
@antiguru (Member Author):

I think I addressed all comments, so if you have time, please take another look. I also kicked off a nightly run and coverage.

I had to change the logic that used arrangements with reductions to determine when an arrangement trace is no longer shared, because it fails when the arrangement is created and dropped within a single log window: in that case the log reveals no information about the arrangement because the events cancel each other out. We might want to think about whether there's a way to still achieve the same behavior. For the time being, I switched to a map-based implementation.

@def- (Contributor) commented May 31, 2023:

> I also kicked off a nightly run and coverage.

The SQLsmith failures are unrelated. The feature benchmark regression is probably expected with this change?

@teskje (Contributor) left a comment:

LGTM, thanks!

> SELECT records, batches, size, capacity, allocations FROM mz_internal.mz_dataflow_arrangement_sizes WHERE name='ii_empty'
0 0 0 0 0

# Tests that arrangement sizes are approximate
Contributor:

Is this comment missing some words?

self.state.sharing.remove(&op);
logger.log(ComputeEvent::ArrangementHeapSizeOperatorDrop { operator: op });
}
}
Contributor:

This works because compute_logger will always be Some in practice, but if we ever change the code to make it temporarily None the sharing tracking might become inconsistent. How about we always update sharing and only gate the actual log call behind if let Some(logger)?

Member Author:

Yep, we only track the sharing information if the logger is Some. That's currently the only case when the information is needed. Do you think we should always track this information?

Contributor:

It would be more future-proof that way, but I'm fine with leaving it as is.

Out of scope for this PR, but I wonder if we could get rid of the if let Some(logger) pattern throughout the compute code and make the logger non-optional instead. I think we always have a compute logger, except perhaps during initialization.

antiguru added 17 commits May 31, 2023 09:55, each signed off by Moritz Hoffmann <[email protected]>. The commit message bodies include:

  • Otherwise, we'll accumulate all historical state.
  • Add the columnation requirement to R
  • Actually handle delta values as such; explain why `CloneRegion` is acceptable for `DataflowError`; remove `IntoKeyCollection` and replace it by `From`; some cleanup
  • Using it for all consolidates introduces a performance regression.
@antiguru antiguru force-pushed the arrangement_heap_size branch from 3a661f9 to 599425b Compare May 31, 2023 13:56
@antiguru (Member Author):

Verified locally that the GroupBy regression goes away with the last commit.

@antiguru antiguru merged commit 2760464 into MaterializeInc:main May 31, 2023
@antiguru antiguru deleted the arrangement_heap_size branch May 31, 2023 14:54
teskje added a commit to teskje/materialize that referenced this pull request Jun 23, 2023
umanwizard pushed a commit to umanwizard/materialize-1 that referenced this pull request Jun 23, 2023
umanwizard pushed a commit to umanwizard/materialize-1 that referenced this pull request Jul 12, 2023