Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack Overflow with Deeply Nested Filter Expressions #8900

Open
orlandohohmeier opened this issue Jan 18, 2024 · 4 comments
Open

Stack Overflow with Deeply Nested Filter Expressions #8900

orlandohohmeier opened this issue Jan 18, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@orlandohohmeier
Copy link

orlandohohmeier commented Jan 18, 2024

Describe the bug

When running a query with a deeply nested filter expression the query fails with stack overflow – the bug initially manifested as a EXC_BAD_ACCESS error on macOS in our application. The problem is that the filter expression is recursively normalized using transform_up which can cause stack overflows. This probably also happens in other scenarios where one would end up with a deeply nested tree.

Tested/Reproduced with:
version = "34.0.0"
macOS = 14.2.1

To Reproduce

Minimal Reproducible Example:

use datafusion::arrow::array::Int64Array;
use datafusion::arrow::datatypes::DataType;
use datafusion::arrow::datatypes::Field;
use datafusion::arrow::datatypes::Schema;
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::error::Result;
use datafusion::prelude::*;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    let batch = RecordBatch::try_new(
        Arc::new(Schema::new(vec![Field::new("a", DataType::Int64, false)])),
        vec![Arc::new(Int64Array::from(vec![
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
        ]))],
    )?;
    let df = ctx.read_batch(batch)?;

    let mut expr = col("a").eq(lit(1));
    for _ in 0..1000 {
        expr = expr.or(col("a").eq(lit(1)));
    }

    let df = df.filter(expr).unwrap();

    df.show().await
}

For it to run into an SO with an optimized release build the depth needs to be increased to 10000.

Expected behavior

The query should complete without errors, despite the complexity of the filter expression.

Additional context

No response

/cc @nfnt

@orlandohohmeier orlandohohmeier added the bug Something isn't working label Jan 18, 2024
@orlandohohmeier orlandohohmeier changed the title Stack Overflo with Deeply Nested Filter Expressions Stack Overflow with Deeply Nested Filter Expressions Jan 18, 2024
@orlandohohmeier
Copy link
Author

Current Quickfix: Increasing the stack size e.g. via the RUST_MIN_STACK environment variable (e.g., to 16MB) temporarily resolves the issue.

Long-term Solution: We should probably implement and use a stackless variant of transform_up, transform_down, and their mutable counterparts. I'd be happy to draft a PR, given their widespread use, the limited set of functions in TreeNode and need for post-order traversal, I am unsure how to best fix this bug though.

@alamb
Copy link
Contributor

alamb commented Jan 21, 2024

I'd be happy to draft a PR, given their widespread use, the limited set of functions in TreeNode and need for post-order traversal, I am unsure how to best fix this bug though.

Thanks @orlandohohmeier -- note there is some significant work currently being undertaken on the TreeNode APIs -- see #8913

Maybe once that is done, it will be easier to implement a stackless variant 🤔

We have some version of this pattern in the SQL planner already (SqlToRel)

@orlandohohmeier
Copy link
Author

@alamb, thanks for the pointers! I'll dive into #8913 to understand the ongoing changes in the TreeNode APIs better. Glancing over it and aiming at maintaining the interface things could be done in parallel. I'll also review the existing implementation of the SqlToRel pattern in the SQL planner. This might provide valuable insights or a potential pathway for implementing a stackless variant for transform_up and related functions. I will get back once I have an idea of how to do the refactoring.

@alamb
Copy link
Contributor

alamb commented Dec 14, 2024

This may have been fixed by

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants