Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review TreeNode Refactor #1

Conversation

berkaysynnada
Copy link

Here is an explanation of the changes I made and their reasons:

  1. visit and rewrite were skipping the f_up (post_visit) of the node if f_down (pre_visit) results with "jump". This behavior is inconsistent with the post_visit "jump"s. In two passes traversals, "jump" actually does jump to the next f_down if the operation is f_up, or jump to the next f_up if the operation is f_down (bypassing the nodes children). It is important to establish the same behavior.
  2. pop_enter_mark method is adapted to updated behavior of jump.
  3. TODO in LogicalPlan's map_children is resolved.
  4. There is no need to transform method. Old behavior of transform has been changed while its name is preserved. Since it is a simplification and ease-of-use PR rather than adding new features, we should add it with a separate PR when it is really needed. I also think that having the assumption of first top-down and then bottom-up may also be misleading for a general purpose API.
  5. There is no need for handle_transform_recursion anymore.
  6. I have added more comments and done some renaming that I thought were more explanatory.

@peter-toth
Copy link
Owner

Thanks for the fix @berkaysynnada, the changes look good/acceptable to me but let me check it thoroughly tomorrow and come back to you.

self.expressions(),
new_children.into_iter().map(|child| child.data).collect(),
)
.map(Transformed::yes)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you want to propogate up the transformed flag based on the detection result (old_children.into_iter().zip(new_children.iter()).any(|(c1, c2)| c1 != &c2.data)) and not based on if the transformation closure (f) applied on any children reported (propagated up)transformed true state?

I mean I left similar TODOs in DynTreeNode and ConcreteTreeNode where the same conflic between the "detected" and "reported" transformed state can happen.

In my PR I always propagated up the trasnformed state from the closures despite the detection could be different. We can agree on to propagate the detected but then IMO we should change this in DynTreeNode and ConcreteTreeNode as well.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially thought there may exist some rules such that it transforms the tree in a few iterations, but at the final stage, the final data is the same as the initial data. Therefore, it would be better to check the data's itself rather than trusting the transformed flag. However, now I think that the equality check at each step could be expensive, and I think we need to rely on transformed flag. Otherwise it would have no meaning to carry.

I am converting it to propagate the transformed information up if any children is transformed.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, my take on this general, user provided transformed flag in the transformation closure results is that it is pointless. In the current code on the main branch it doesn't work at all and so it isn't used anywhere for anything. This tells me that in most of the cases noone misses it. (You know I wanted to remove it entirelly first: apache#8835.)

Now the above doesn't mean that I can't imagine a case where we want to check if changes were made to the tree during a transformation. But I see other options without using the transformed flag.

  • If it is a tree with Arc "links" and enum nodes like LogicalPlan we can compare the old cloned top level node with the new top level node we got back from transform to detect changes. This comparison is cheap (and similar to the old_children.into_iter().zip(new_children.iter()).any(|(c1, c2)| c1 != &c2.data) above).
  • If it is a tree with Arc<dyn trait> nodes (Arc<DynTreeNode> and ConcreteTreeNode trees fall into this category) then we can compare the old cloned top level node data ptr addres with the new top level node data prt address we got back from transform to detect changes. This is similar to how with_new_children_if_necessary works.
  • If we have an Expr tree (with Box links) then it would be costly to clone the tree before the transformation so the only cheap way to detect if changes were made is to ask the API user to provide the information if they changed a node. But instead of requesting the transformed flag in all transformation closures from the API users, I would rather offer an new "transform with payload" like API (Transform with payload apache/datafusion#8664) to let the user propagate arbitrary state during transformation. The transformed flag in the Transformed struct is nothing more than a special payload in that regard.

Copy link
Author

@berkaysynnada berkaysynnada Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The options you mentioned make sense, but instead of going through so much effort, we should use the already existing transform. I can't see any advantage of removing Transform either. I also use it a lot in the projection optimizer rule I'm currently working on. As far as I remember, there were also some ideas: apache#8835 (comment).

I have sent a new commit related with these. Could you please take a look? I think you are okey with propagating the transformed result according to the children flags.

Copy link
Owner

@peter-toth peter-toth Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm ok with keeping the flag, especially if you are going to use it in rules. But I have some concerns with that commit: #1 (comment)

/// ```
///
/// See [`TreeNodeRecursion`] for more details on how the traversal can be controlled.
///
/// If `f_down` or `f_up` returns [`Err`], recursion is stopped immediately.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to remove this? I really need this and I would rather keep this here than adding it back in the next PR.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't like that we reuse the name transform then we can rename it like transform_down_up.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of introducing a new API, could we consider using transform_down followed by transform_up? I'm not trying to make things difficult for you, but I'm just curious if we could achieve the same result by adding an extra line of code where needed.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's not the same. I will need a short form of rewrite() (top-down and bottom-up closures provided for one combined traversal). This is an example how I would like to use such an API:
apache#8891 (comment)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got your point. Not related with the topic, but I guess contains_pattern in your example deep dives on each node until the leaves, so adding such an f_down may get worse the performance actually.

However, in general there may exist such use cases where f_down can speed up the f_up phase. I'm reverting this change. Can I just ask from you to modify the transform adapting the same behavior with updated visit and rewrite? I have updated the test results, but cannot adapt the transform implementation yet.

Copy link
Owner

@peter-toth peter-toth Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think calculating the pattern bitset will be very cheap. (BTW, I want to do that lazyly.) Also, I think in many cases several subtrees remain unchanged during transformations and the patterns of those subtrees don't need to be recalculated, which means that multiple subsequent transformations can take advantage of some patterns calculated during the first transformation.

I'm reverting this change. Can I just ask from you to modify the transform adapting the same behavior with updated visit and rewrite?

Thanks. Sure, I will do that. (Most likely only tomorrow...)

Copy link
Author

@berkaysynnada berkaysynnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sending a commit for the mentioned parts. Thanks for your cooperation :)

datafusion/expr/src/tree_node/plan.rs Outdated Show resolved Hide resolved
self.expressions(),
new_children.into_iter().map(|child| child.data).collect(),
)
.map(Transformed::yes)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially thought there may exist some rules such that it transforms the tree in a few iterations, but at the final stage, the final data is the same as the initial data. Therefore, it would be better to check the data's itself rather than trusting the transformed flag. However, now I think that the equality check at each step could be expensive, and I think we need to rely on transformed flag. Otherwise it would have no meaning to carry.

I am converting it to propagate the transformed information up if any children is transformed.

datafusion/common/src/tree_node.rs Show resolved Hide resolved
/// ```
///
/// See [`TreeNodeRecursion`] for more details on how the traversal can be controlled.
///
/// If `f_down` or `f_up` returns [`Err`], recursion is stopped immediately.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of introducing a new API, could we consider using transform_down followed by transform_up? I'm not trying to make things difficult for you, but I'm just curious if we could achieve the same result by adding an extra line of code where needed.

@peter-toth
Copy link
Owner

I am sending a commit for the mentioned parts. Thanks for your cooperation :)

Sure, thanks for the fixes!

let new_children = children.into_iter().map_until_stop_and_collect(f)?;
// Propagate up `new_children.transformed` and `new_children.tnr`
// along with the node containing transformed children.
if new_children.transformed {
Copy link
Owner

@peter-toth peter-toth Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this very risky? You now are relying on if the transformation closures filled the transformed flag correctly. What if they don't? So far noone used that flag for anything and noone checked if the flag is correcty filled. If a closure changes a child node but reports transformed=false (incorrectly) then the node's child will not be changed.
And such incorrect closures can exists not only in Apache DataFusion code, but anything that depends on DataFusion...

I think it would be safer to always call let t2 = self.with_new_arc_children(arc_self, t.data)?;. What I'm undecided about is if we should return:

  • Ok(Transformed::new(t2.data, t.transformed, t.tnr))
  • or Ok(Transformed::new(t2.data, t2.transformed, t.tnr))
  • or Ok(Transformed::new(t2.data, t.transformed || t2.transformed, t.tnr))

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand your concern, but we are providing an API and I think we should assume it is used correctly. If we provide a good documentation, I don't think there will be any problems. It is clear and not hard to understand, so I don't think someone would return modified data as transformed::no. Carrying this information along would be a burden if we're not going to use it.

Maybe we can change the default value of the methods that create new objects to transformed::yes, or we can close the direct access to data, and force users to modify it through a method (set_data()), and when this method is run at least once, we can automatically set the transformed to yes.

Copy link
Owner

@peter-toth peter-toth Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, let's run the tests and see if the current closures use the flag correctly.

BTW, why is ConcreteTreeNode different? As far as I see you didn't apply the if new_children.transformed check there..

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tnr needs to come from new_children, returning it from self.with_new_arc_children(arc_self, new_children.data) will not work.

How about:

            if new_children.transformed {
                new_children.map_data(|data| {
                    let arc_self = Arc::clone(&self);
                    self.with_new_arc_children(arc_self, data).map(|t| t.data)
                })
            } else {
                Ok(Transformed::no(self))
            }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, why is ConcreteTreeNode different? As far as I see you didn't apply the if new_children.transformed check there..

You are right. I thought with_new_children already does that propagation from children, but it only checks plans, not the data. I am updating it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tnr needs to come from new_children, returning it from self.with_new_arc_children(arc_self, new_children.data) will not work.

if new_children.transformed is true, I believe there is no possibility of that self.with_new_arc_children(arc_self, new_children.data) returns transformed::no. Therefore I thought the existing implementation is safe.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tnr needs to come from new_children, returning it from self.with_new_arc_children(arc_self, new_children.data) will not work.

if new_children.transformed is true, I believe there is no possibility of that self.with_new_arc_children(arc_self, new_children.data) returns transformed::no. Therefore I thought the existing implementation is safe.

I'm not talking about .transformed but about the .tnr field of the result. self.with_new_arc_children(arc_self, new_children.data) always returns TreeNodeRecursion::Continue so we need to return .trn from new_children.

@berkaysynnada berkaysynnada force-pushed the refactor-treenode-minor branch from cb7f965 to dbe24db Compare February 29, 2024 08:32
@berkaysynnada
Copy link
Author

I believe this PR is ready after the transform behavior is updated according to tests. I will add more comments to transform with an example scenario of its usage. It seems easy to understand and use to us because we have been dealing with it for a while, but it may be a difficult API to understand and use for someone using TreeNode for the first time.

@peter-toth
Copy link
Owner

peter-toth commented Feb 29, 2024

Looks good to me now. Let me merge this PR and fix the transform behaviour on the main PR.

I see that you renamed transform to transform_down_up_with_control in dbe24db. Is it a better name for it? I'm ok with that, but I don't get the with_control part.

@berkaysynnada
Copy link
Author

Looks good to me now. Let me merge this PR and fix the transform behaviour on the main PR.

I see that you renamed transform to transform_down_up_with_control in dbe24db. Is it a better name for it? I'm ok with that, but I don't get the with_control part.

I tried to emphasize the difference of calling transform_down then transform_up with that. Renamed it transform_down_up only now. Maybe you can show up with better ones. It is an API that should be used with caution and I would like to emphasize that it is quite different from the old transform.

The only TODO is converting transform and related macros to apply f_up on the jumped node also.

@peter-toth
Copy link
Owner

Ok, thanks! I will merge this PR later today and fix the TODO.

@peter-toth peter-toth merged commit 271a0fd into peter-toth:refactor-treenode-rewrite Feb 29, 2024
1 check passed
peter-toth added a commit that referenced this pull request Mar 30, 2024
* refactor `TreeNode::rewrite()`

* use handle_tree_recursion in `Expr`

* use macro for transform recursions

* fix api

* minor fixes

* fix

* don't trust `t.transformed` coming from transformation closures, keep the old way of detecting if changes were made

* rephrase todo comment, always propagate up `t.transformed` from the transformation closure, fix projection pushdown closure

* Fix `TreeNodeRecursion` docs

* extend Skip (Prune) functionality to Jump as it is defined in https://synnada.notion.site/synnada/TreeNode-Design-Proposal-bceac27d18504a2085145550e267c4c1

* fix Jump and add tests

* jump test fixes

* fix clippy

* unify "transform" traversals using macros, fix "visit" traversal jumps, add visit jump tests, ensure consistent naming `f` instead of `op`, `f_down` instead of `pre_visit` and `f_up` instead of `post_visit`

* fix macro rewrite

* minor fixes

* minor fix

* refactor tests

* add transform tests

* add apply, transform_down and transform_up tests

* refactor tests

* test jump on both a and e nodes in both top-down and bottom-up traversals

* better transform/rewrite tests

* minor fix

* simplify tests

* add stop tests, reorganize tests

* fix previous merges and remove leftover file

* Review TreeNode Refactor (#1)

* Minor changes

* Jump doesn't ignore f_up

* update test

* Update rewriter

* LogicalPlan visit update and propagate from children flags

* Update tree_node.rs

* Update map_children's

---------

Co-authored-by: Mustafa Akur <[email protected]>

* fix

* minor fixes

* fix f_up call when f_down returns jump

* simplify code

* minor fix

* revert unnecessary changes

* fix `DynTreeNode` and `ConcreteTreeNode` `transformed` and `tnr` propagation

* introduce TransformedResult helper

* fix docs

* restore transform as alias to trassform_up

* restore transform as alias to trassform_up 2

* Simplifications and comment improvements (#2)

---------

Co-authored-by: Berkay Şahin <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants