Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manage group values and states by blocks in aggregation #11931

Open
Rachelint opened this issue Aug 10, 2024 · 7 comments
Open

Manage group values and states by blocks in aggregation #11931

Rachelint opened this issue Aug 10, 2024 · 7 comments
Assignees
Labels
enhancement New feature or request

Comments

@Rachelint
Copy link
Contributor

Rachelint commented Aug 10, 2024

Is your feature request related to a problem or challenge?

Now we manage the group values and the aggregation states by a single big vector growing constantly.
This solution is simple to impl, but really leads to some extra cpu cost according to the cpu profile.
Maybe we should manage them by blocks like duckdb.

Describe the solution you'd like

It may be a big work, I want to finish it through following steps:

  • Sketch the total procedure.
  • Impl the block based group values management in GroupValuesRows.
  • Impl the block based group values management in other GroupValues impls.
  • Impl the block based states management in different GroupAccumulator impls.

The general design is similar as #7065 , but introduce it into GroupValues, not only GroupAccumulators.

Describe alternatives you've considered

No response

Additional context

The cpu cost flamegraph:
https://github.com/Rachelint/drawio-store/blob/main/cpucosts0811.png

@Rachelint
Copy link
Contributor Author

take

@2010YOUY01
Copy link
Contributor

Is the plan to manage group values and states in two different kinds of blocks, or a unified block?

There are so many good optimizations in the aggregation code now, they make the implementation a bit hard to understand, I was thinking managing group values + states in a single block could also be a good cleanup

@Rachelint
Copy link
Contributor Author

Rachelint commented Aug 11, 2024

Is the plan to manage group values and states in two different kinds of blocks, or a unified block?

There are so many good optimizations in the aggregation code now, they make the implementation a bit hard to understand, I was thinking managing group values + states in a single block could also be a good cleanup

Plan to make them two kinds of blocks, because group values is the inner struct in GroupValues, and states is the one in different GroupAccumulator, seems hard to manage them in a same place.

The design is similar as #7065 , but introduce it into GroupValues, not only GroupAccumulators.

The reason why doing it is according to the cpu flamegraph, the growing of the big single group values in GroupValues and states in respecitve GroupAccumulator seems really cost cpu(due to copy caused by growing).

@Rachelint
Copy link
Contributor Author

Rachelint commented Aug 12, 2024

The sketch's detailed design

1. When will the blocked method triggered?

  • It should not be streaming aggregation, because steaming depends on the excact Emit::First(n) mode, and it is too expansive to impl it in blocked method.
  • The blocked GroupValues will be triggered, if we found the usedGroupValues impl support it.
  • The blocked GroupAccumulator will be only triggered, when all the used GroupAccumulators support blocked, and the used GroupValues supports blocked too.

2. Introduce new emit modes used in blocked method

It can support emit multiple blocks in GroupValuess and GroupAccumulators now:

/// Describes how many rows should be emitted during grouping.
#[derive(Debug, Clone, Copy)]
pub enum EmitTo {
    /// Emit all groups
    All,
    /// Emit only the first `n` groups and shift all existing group
    /// indexes down by `n`.
    ///
    /// For example, if `n=10`, group_index `0, 1, ... 9` are emitted
    /// and group indexes '`10, 11, 12, ...` become `0, 1, 2, ...`.
    First(usize),
    /// Emit all groups managed by blocks
    AllBlocks,
    /// Emit only the first `n` group blocks,
    /// similar as `First`, but used in blocked `GroupValues` and `GroupAccumulator`.
    ///
    /// For example, `n=3`, `block size=4`, finally 12 groups will be returned.
    FirstBlocks(usize),
}

For incrementally development for blocked method for so many detailed GroupValues and GroupAccumulator impls. This sketch pr did a lot of compatibility works, and combinations are allowed:

  • Single GroupAccumulator + single GroupAccumulator
  • Blocked GroupValues + single GroupAccumulator
  • Blocked GroupValues + blocked GroupAccumulator

3. Introduce GroupIndices to do communication between GroupValues and GroupAccumulator

One of the problem is how to let GroupAccumulator know the if group indices is flat or blocked?
I introduce GroupIndices to make it, but it indeed leads to api change for GroupAccumulator.

pub enum GroupIndices<'a> {
    Flat(&'a [u64]),
    Blocked(&'a [u64]),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GroupIndicesType {
    Flat,
    Blocked,
}

impl GroupIndicesType {
    pub fn typed_group_indices<'a>(&self, indices: &'a [u64]) -> GroupIndices<'a> {
        match self {
            GroupIndicesType::Flat => GroupIndices::Flat(indices),
            GroupIndicesType::Blocked => GroupIndices::Blocked(indices),
        }
    }
}

@jayzhan211
Copy link
Contributor

/// For example, n= 10, block size=4, n will be aligned to 12,
/// and finally 3 blocks will be returned.
FirstBlocks(usize)

I think emitting with "n" blocks is much more straightforward. n = 3, block size = 4. emit 3 * 4 = 12 elements

@Rachelint
Copy link
Contributor Author

/// For example, n= 10, block size=4, n will be aligned to 12,
/// and finally 3 blocks will be returned.
FirstBlocks(usize)

I think emitting with "n" blocks is much more straightforward. n = 3, block size = 4. emit 3 * 4 = 12 elements

It seems indeed more clear! I have switched to this in codes.

@Rachelint
Copy link
Contributor Author

Rachelint commented Aug 12, 2024

@alamb
I have finished the draft framework for blocked aggregation intermediate management, and we can incrementally impl blocked method for different GroupAccumulators and GroupValues on it. Minding have a quick look?

The general design can see:
#11931 (comment)

And the pr is here:
#11943

cc @jayzhan211 @2010YOUY01 @JasonLi-cn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants