
Surface arrangement memory size/capacity #18766

Merged: 17 commits, May 31, 2023

Conversation

@antiguru (Member) commented Apr 13, 2023:

Measure the size of arrangements in memory.

This PR adds support to measure the size of arrangements:

  • total allocated capacity,
  • used capacity,
  • number of allocations.
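For illustration, the new relation can be queried like any other introspection source. The columns below are the ones exercised by the testdrive check later in this PR; the ordering and limit are arbitrary:

-- Sketch: show the arrangements with the largest used size.
SELECT name, records, batches, size, capacity, allocations
FROM mz_internal.mz_dataflow_arrangement_sizes
ORDER BY size DESC
LIMIT 5;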

Motivation

This PR adds a known-desirable feature: MaterializeInc/database-issues#5740

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • This PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@antiguru antiguru requested a review from umanwizard April 13, 2023 19:51
@antiguru antiguru marked this pull request as ready for review April 14, 2023 20:45
@antiguru antiguru requested review from a team April 14, 2023 20:45
@antiguru antiguru requested a review from a team as a code owner April 14, 2023 20:45
@antiguru antiguru requested a review from teskje April 14, 2023 20:46
@antiguru (Member Author):

This should be ready for review. For some arrangements, it reports an under-approximation of the allocated memory because we do not treat the diff field specially, which misses allocations it might reference. This is the case for some of the reductions.

@antiguru (Member Author) commented May 20, 2023:

This PR is approaching a state where we can discuss the remaining work. I think at least the following should be included here:

  • Currently, only a single _raw collection is exposed. We'd need to add convenience views on top of this to aggregate across workers and dataflows (a sketch of such a view follows this list).
  • The introspection source we expose doesn't have the count in the diff field, but materializes it automatically in the actual row. This is bad because it means we have to do more work than if we could simply maintain the count. However, this is not possible at the moment, because we're counting used size, raw capacity, and number of allocations at the same time. To move the values to the diff field, we'd need three separate sources.
  • We're printing warnings when the logging operator cannot be activated from dataflow logging. This is helpful for determining which arrangements still need to be logged, but doesn't make much sense for ad-hoc queries because the activator will most likely not be available yet/anymore when we process differential logs. I don't know of a smart approach that would let us keep the warning without producing bogus warnings for short-lived dataflows.
  • We're moving several operators and abstractions into mz-compute. This feels restrictive, and we might want to consider splitting the logging infrastructure out of this, together with the abstraction introduced here. This will become more relevant once we actually work on cluster unification (#17413).
  • We're undercounting the size of DataflowError because we only consider the immediate allocation, but not the transitive closure of reachable allocations. Maybe we could implement Columnation for it and get accurate accounting? It seems hard because it's a recursive data structure...
  • I have not looked into how many resources this feature requires. It adds a significant number of operators, activates these operators regularly, and maintains the introspection data. Each can be expensive.
    • We could use a rate-limiting activator if activations turn out to be problematic.
  • The name mz_arrangement_size_raw isn't great, because there is also mz_arrangement_sizes.... We need to come up with something better.
  • We should adjust the memory visualizer to print both number of records as well as memory requirements for operators.
  • There's a recurring test failure in cloudtest where a clusterd OOMs. See Buildkite for an example.
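A minimal sketch of the convenience view mentioned in the first bullet, assuming the raw relation exposes one row per operator and worker with size, capacity, and allocations columns (the column names here are assumptions, not the actual schema):

-- Hypothetical view aggregating the per-worker raw measurements per operator.
CREATE VIEW arrangement_sizes_per_operator AS
SELECT operator_id,
       sum(size) AS size,
       sum(capacity) AS capacity,
       sum(allocations) AS allocations
FROM mz_internal.mz_arrangement_size_raw
GROUP BY operator_id;

Aggregating per dataflow would additionally need the operator-to-dataflow mapping from the existing introspection relations.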

@def- (Contributor) left a comment:

This change looks complex enough that QA involvement might be warranted. I triggered nightly, which looks "ok" (same as on main currently): https://buildkite.com/materialize/nightlies/builds/2437

I marked the interesting spots from the coverage run inline; maybe some of them can be tested: https://buildkite.com/materialize/coverage/builds/116

Arranged<G, TraceAgent<Tr>>: ArrangementSize,
{
if must_consolidate {
self.mz_consolidate::<Tr>(name)
Contributor:

This line, the actual consolidation, appears not to be covered by any test.

Member Author:

Interesting, I'll defer to @vmarcos who might have more context on why this is! It might be that it only triggers within ad-hoc selects with monotonic optimizations enabled, and this might not be the case within tests.

Contributor:

Yes, the line should only be executed if the feature flag enable_monotonic_oneshot_selects is on and we have a one-shot select where the optimization applies, e.g., a query with MAX/MIN or a top-k pattern. AFAIK, the tests currently run with the feature flag turned off, so that we test our regular optimization path. The design doc #19250 has some suggestions on how to address this issue, but no solution is implemented yet. Perhaps it would make sense to run coverage with the feature flag on as a validation?
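For concreteness, a query shape that should hit this path once the flag is on might look like the following; the table and column names are placeholders, and enabling the flag via ALTER SYSTEM is an assumption about the mechanism:

-- Assumed mechanism for enabling the feature flag (privileged session):
ALTER SYSTEM SET enable_monotonic_oneshot_selects = true;

-- A one-shot MIN/MAX aggregation over a hypothetical table:
SELECT max(price) FROM orders;

-- A top-k pattern (top 3 values per key), also over a hypothetical table:
SELECT grp.key, lat.value
FROM (SELECT DISTINCT key FROM items) grp,
     LATERAL (SELECT value FROM items
              WHERE items.key = grp.key
              ORDER BY value DESC
              LIMIT 3) lat;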

self.arrange_named(name).log_arrangement_size()
}

fn mz_arrange_core<P, Tr>(&self, pact: P, name: &str) -> Arranged<G, TraceAgent<Tr>>
Contributor:

This function also appears uncovered

Member Author:

Interesting. At the moment it's only used by the introspection code; could it be that introspection is turned off for coverage runs?

Contributor:

How do we turn introspection on/off? Coverage is run the same way as the normal tests, so this would probably also mean we are not using it in CI tests.

Contributor:

It should be switched on in CI. There are both testdrive and sqllogictests that query introspection sources.
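For reference, such a check is just a query against one of the mz_internal relations together with its expected output, in the same style as the testdrive snippet quoted further down. A minimal sketch, assuming the environment has at least one logged arrangement:

> SELECT count(*) > 0 FROM mz_internal.mz_arrangement_sizes
true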

src/compute/src/logging/compute.rs (review thread, outdated)
src/compute/src/render/threshold.rs (review thread, outdated)
@antiguru antiguru force-pushed the arrangement_heap_size branch from 2cd2bc1 to a43c136 Compare May 22, 2023 20:46
@antiguru antiguru requested a review from a team as a code owner May 22, 2023 20:46
@antiguru antiguru force-pushed the arrangement_heap_size branch from 69872a0 to 0769c0e Compare May 23, 2023 13:25
@antiguru antiguru force-pushed the arrangement_heap_size branch from a4ce2fe to dddefbf Compare May 24, 2023 13:36
@antiguru antiguru changed the title Exploration of arrangement memory size/capacity Surface arrangement memory size/capacity May 24, 2023
@antiguru antiguru force-pushed the arrangement_heap_size branch 3 times, most recently from 2d26688 to bdc925b Compare May 25, 2023 00:42
@jkosh44 (Contributor) left a comment:

Adapter changes LGTM (I did not review the .jsx files).

@jseldess (Contributor) left a comment:

Docs LGTM.

Based on the linked issue, it seems these changes will enable users to understand their most expensive or memory-intensive indexes and materialized views. Is that right? If so, would it make sense to revise or complement this troubleshooting FAQ as part of this PR? Or should this rather be a follow-up docs issue?

@antiguru (Member Author):

> Based on the linked issue, it seems these changes will enable users to understand their most expensive or memory-intensive indexes and materialized views. Is that right? If so, would it make sense to revise or complement this troubleshooting FAQ as part of this PR? Or should this rather be a follow-up docs issue?

I'd prefer to do this in a follow-up issue. I filed MaterializeInc/database-issues#5793 and MaterializeInc/database-issues#5794 to not forget about it!

@antiguru antiguru force-pushed the arrangement_heap_size branch 2 times, most recently from 2926a14 to 0976a85 Compare May 26, 2023 01:56
@teskje (Contributor) left a comment:

This looks mostly good, but there are two things I'd like to ensure:

  1. That we have user docs for all new relation fields.
  2. That the retraction of arrangement size logging works correctly.

My other comments are nits and/or questions.

Also, I don't quite grok the changes to reduce.rs. Maybe someone more familiar with this code (@vmarcos?) could review those?

src/adapter/src/catalog/builtin.rs (review thread)
src/compute/src/extensions/mod.rs (review thread)
src/compute/src/logging/compute.rs (review thread)
src/compute/src/logging/compute.rs (review thread, outdated)
src/compute/src/logging/compute.rs (review thread, outdated)
src/compute/src/logging/compute.rs (review thread)
src/compute/src/extensions/operator.rs (review thread, outdated)
src/compute/src/extensions/operator.rs (review thread, outdated)
src/storage-client/src/types/errors.rs (review thread)
test/testdrive/introspection-sources.td (review thread)
@antiguru antiguru force-pushed the arrangement_heap_size branch 2 times, most recently from 619e44c to c6b9cc5 Compare May 29, 2023 03:03
@vmarcos (Contributor) left a comment:

This is very exciting, and will definitely surface some very valuable information! I have a few comments regarding documentation, some general questions, and one point for discussion in the change to basic aggregates in reduce.rs.

src/adapter/src/catalog/builtin.rs (review thread)
src/adapter/src/catalog/builtin.rs (review thread)
src/compute/src/extensions/arrange.rs (review thread)
src/compute/src/extensions/reduce.rs (review thread, outdated)
src/compute/src/logging/differential.rs (review thread, outdated)
src/compute/src/render/reduce.rs (review thread, outdated)
@antiguru (Member Author):

I think I addressed all comments, so if you have time, please take another look. I also kicked off a nightly run and coverage.

I had to change the logic that used arrangements with reductions to determine when an arrangement trace is no longer shared, because it fails when the arrangement is created and dropped within a single log window: in that case the log reveals no information about the arrangement because the events cancel each other out. We might want to think about whether there's a way to still achieve the same behavior. For the time being, I switched to a map-based implementation.

@def- (Contributor) commented May 31, 2023:

> I also kicked off a nightly run and coverage.

The SQLsmith failures are unrelated. The feature benchmark regression is probably expected with this change?

@teskje (Contributor) left a comment:

LGTM, thanks!

> SELECT records, batches, size, capacity, allocations FROM mz_internal.mz_dataflow_arrangement_sizes WHERE name='ii_empty'
0 0 0 0 0

# Tests that arrangement sizes are approximate
Contributor:

Is this comment missing some words?

self.state.sharing.remove(&op);
logger.log(ComputeEvent::ArrangementHeapSizeOperatorDrop { operator: op });
}
}
Contributor:

This works because compute_logger will always be Some in practice, but if we ever change the code to make it temporarily None the sharing tracking might become inconsistent. How about we always update sharing and only gate the actual log call behind if let Some(logger)?

Member Author:

Yep, we only track the sharing information if the logger is Some. That's currently the only case when the information is needed. Do you think we should always track this information?

Contributor:

It would be more future-proof that way, but I'm fine with leaving it as is.

Out of scope for this PR, but I wonder if we could get rid of the if let Some(logger) pattern throughout the compute code and make the logger non-optional instead. I think we always have a compute logger, except perhaps during initialization.

antiguru added 17 commits May 31, 2023 09:55, each signed off by Moritz Hoffmann <[email protected]>. The commit message bodies include:

  • Otherwise, we'll accumulate all historical state.
  • Add the columnation requirement to R
  • Actually handle delta values as such; explain why `CloneRegion` is acceptable for `DataflowError`; remove `IntoKeyCollection` and replace it by `From`; some cleanup
  • Using it for all consolidates introduces a performance regression.
@antiguru antiguru force-pushed the arrangement_heap_size branch from 3a661f9 to 599425b Compare May 31, 2023 13:56
@antiguru (Member Author):

Verified locally that the GroupBy regression goes away with the last commit.

@antiguru antiguru merged commit 2760464 into MaterializeInc:main May 31, 2023
@antiguru antiguru deleted the arrangement_heap_size branch May 31, 2023 14:54
teskje added a commit to teskje/materialize that referenced this pull request Jun 23, 2023
umanwizard pushed a commit to umanwizard/materialize-1 that referenced this pull request Jun 23, 2023
umanwizard pushed a commit to umanwizard/materialize-1 that referenced this pull request Jul 12, 2023