[rfc] Fixing MultiPartitionsDefinition subset -> range -> keys inconsistency #26652
+190
−0
Summary & Motivation
Issue #26307 reported a backfill of a multi-partitioned asset that never completes. I was able to replicate this issue and believe the cause is that we lose information when we convert a sequence of MultiPartition keys to a subset, then to a range, then back to a list of keys. I've written up a test case demonstrating the behavior.
It looks like this conversion worked until #19448, which was meant to improve the performance of the function (so that it doesn't have to iterate through all partition keys).
I'm not sure what the right way to resolve this is. We could go back to the old version for converting a range to a list of keys, but that loses the performance gains of #19448.
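To make the information loss concrete, here is a minimal, self-contained sketch of the keys -> range -> keys round trip. It is a toy model, not Dagster's implementation: the helper names (`keys_to_ranges`, `range_to_keys_by_iteration`, `range_to_keys_per_dimension`) are hypothetical, and resolving each dimension independently is only an assumption about what a non-iterating conversion might do, chosen because it reproduces the symptom seen in the backfill.

```python
from itertools import product

# Hypothetical two-dimensional partition space: a date dimension and a static dimension.
DATES = ["2023-01-01", "2023-01-02"]
LETTERS = ["a", "b", "c"]

# Every multi-partition key in sorted order, written as "date|letter".
ALL_KEYS = [f"{date}|{letter}" for date, letter in product(DATES, LETTERS)]


def keys_to_ranges(keys):
    """Collapse keys into (start, end) pairs covering runs that are contiguous in ALL_KEYS."""
    indices = sorted(ALL_KEYS.index(key) for key in keys)
    ranges = []
    start = prev = indices[0]
    for i in indices[1:]:
        if i != prev + 1:
            ranges.append((ALL_KEYS[start], ALL_KEYS[prev]))
            start = i
        prev = i
    ranges.append((ALL_KEYS[start], ALL_KEYS[prev]))
    return ranges


def range_to_keys_by_iteration(start, end):
    """Walk the full ordered key list: slow for large spaces, but lossless."""
    return ALL_KEYS[ALL_KEYS.index(start) : ALL_KEYS.index(end) + 1]


def range_to_keys_per_dimension(start, end):
    """Assumed fast path: resolve each dimension's range separately, then cross them."""
    start_date, start_letter = start.split("|")
    end_date, end_letter = end.split("|")
    dates = DATES[DATES.index(start_date) : DATES.index(end_date) + 1]
    letters = LETTERS[LETTERS.index(start_letter) : LETTERS.index(end_letter) + 1]
    return [f"{date}|{letter}" for date, letter in product(dates, letters)]


requested = ["2023-01-01|a", "2023-01-01|b", "2023-01-01|c", "2023-01-02|a"]
((start, end),) = keys_to_ranges(requested)  # a single contiguous range

lossless = range_to_keys_by_iteration(start, end)
lossy = range_to_keys_per_dimension(start, end)

print(lossless == requested)  # True
print(lossy)  # ['2023-01-01|a', '2023-01-02|a']
```

In this sketch the range itself is correct; the keys `2023-01-01|b` and `2023-01-01|c` only disappear when the range is expanded per dimension, because the static dimension's range collapses to `a..a`.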
(Less important to this PR, but documenting how this manifests as an issue for backfills so that we don't lose context over the holidays)
In backfills this manifests like this:
The backfill targets the partitions 2023-01-01|a, 2023-01-01|b, 2023-01-01|c, 2023-01-02|a.
get_partition_key_ranges returns the range [PartitionKeyRange(start='2023-01-01|a', end='2023-01-02|a')]. This is accurate since if you list the partition keys in order, the sequence goes 2023-01-01|a, 2023-01-01|b, 2023-01-01|c, 2023-01-02|a, 2023-01-02|b, ...
Materialization events are only emitted for 2023-01-01|a and 2023-01-02|a. I believe this is because we are calling get_partition_keys_in_range when determining what materialization events to emit for the ranged run (still tracking down where this happens in the code).
The backfill stays stuck on 2023-01-01|b, 2023-01-01|c because they are root assets for the backfill, and we could make some changes to see that they didn't get materialized. But the core bug is with these conversion methods.
How I Tested These Changes
Changelog