Efficient filter-rewriting for st.text(...).filter(str.isidentifier) #3725

Merged (2 commits) on Sep 5, 2023

reaganjlee (Contributor) commented Aug 30, 2023

Implements #3480

Adds a more efficient strategy for st.text(...).filter(str.isidentifier). Timing for 100 iterations of st.text().filter(str.isidentifier):

|      | Before  | After   |
|------|---------|---------|
| Time | 0.6106s | 0.2300s |
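
To illustrate why rewriting the filter can be faster than applying it, here is a minimal stdlib-only sketch. It is not the code from this PR and does not use Hypothesis internals; the function names (`split_identifier_alphabet`, `rejection_sample`, `constructive_sample`) are hypothetical. It contrasts retrying until `str.isidentifier` passes with building a valid identifier directly from the allowed characters.

```python
import random
import string


def split_identifier_alphabet(alphabet):
    """Split an alphabet into chars valid at the start of an identifier
    and chars valid in any later position."""
    start = [c for c in alphabet if c.isidentifier()]
    # A char can continue an identifier if a valid start followed by it is still valid.
    cont = [c for c in alphabet if ("a" + c).isidentifier()]
    return start, cont


def rejection_sample(alphabet, rng, max_size=8):
    """Filter-style generation: draw arbitrary strings until one is an identifier.
    Returns the string and the number of attempts it took."""
    attempts = 0
    while True:
        attempts += 1
        size = rng.randint(0, max_size)
        s = "".join(rng.choice(alphabet) for _ in range(size))
        if s.isidentifier():
            return s, attempts


def constructive_sample(alphabet, rng, max_size=8):
    """Rewritten generation: every draw is an identifier by construction."""
    start, cont = split_identifier_alphabet(alphabet)
    if not start:
        raise ValueError("alphabet has no valid identifier start character")
    tail_len = rng.randint(0, max_size - 1)
    return rng.choice(start) + "".join(rng.choice(cont) for _ in range(tail_len))


if __name__ == "__main__":
    rng = random.Random(0)
    alphabet = string.ascii_letters + string.digits + "_ -"
    print(rejection_sample(alphabet, rng))
    print(constructive_sample(alphabet, rng))
```

The constructive version never discards a draw, which is the general reason a rewritten strategy can beat a plain filter; the actual speedup depends on the alphabet.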

@reaganjlee reaganjlee marked this pull request as draft August 31, 2023 00:10
Zac-HD (Member) left a comment

Thanks for taking on this issue Reagan! I've added some comments around the implementation (I thought of a more efficient way to do it 😅) and tests, but this is already a great start.

Zac-HD (Member) left a comment

Rebase on #3730 to fix merge conflicts and get some nice new helpers, then I think we're nearly there!

Comment on lines +355 to +356
# Should we deterministically check whether ascii or not or st.characters fine?
@pytest.mark.parametrize("al", [None, "cdef123", "cd12¥¦§©"])
Member

Fine to ignore the non-OneCharStringStrategy case (fyi sampled_from() is automatically converted in text()!); definitely good to test both ascii and non-ascii.
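
As a rough illustration of testing both ASCII and non-ASCII alphabets, the sketch below exhaustively compares "filter every string" against "construct only valid strings" for the two explicit alphabets from the parametrize list. It is a stdlib-only stand-in, not the PR's test; the helper names are hypothetical, and the `None` (full Unicode) case is omitted because it cannot be enumerated this way.

```python
from itertools import product


def all_strings(alphabet, max_size):
    """Yield every string over `alphabet` with length 0..max_size."""
    for n in range(max_size + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def filtered(alphabet, max_size):
    """Reference behaviour: generate everything, keep what passes the filter."""
    return {s for s in all_strings(alphabet, max_size) if s.isidentifier()}


def constructed(alphabet, max_size):
    """Rewritten behaviour: one valid start char followed by valid continue chars."""
    start = [c for c in alphabet if c.isidentifier()]
    cont = [c for c in alphabet if ("a" + c).isidentifier()]
    out = set()
    for first in start:
        for n in range(max_size):  # tail lengths 0..max_size-1
            for tail in product(cont, repeat=n):
                out.add(first + "".join(tail))
    return out


# Both approaches should describe exactly the same set of strings.
for alphabet in ["cdef123", "cd12¥¦§©"]:
    assert filtered(alphabet, 3) == constructed(alphabet, 3)
```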

@Zac-HD Zac-HD changed the title from "Text filter strat" to "Efficient filter-rewriting for st.text(...).filter(str.isidentifier)" Sep 5, 2023
@Zac-HD Zac-HD force-pushed the text_filter_strat branch 3 times, most recently from a2ca289 to 4d146f5 September 5, 2023 09:29
@Zac-HD Zac-HD marked this pull request as ready for review September 5, 2023 16:31
Zac-HD (Member) commented Sep 5, 2023

Thanks Reagan! This was surprisingly tricky to get working at a low level, but your logic and faster intersection calculation were really useful - and will have a big impact in Zac-HD/hypothesmith#32 very soon 😁

(comment instead of review because github is currently degraded!)
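
For readers curious what an intersection calculation over character sets might look like, here is a minimal sketch that assumes alphabets are stored as sorted, disjoint, inclusive codepoint ranges. That representation and the function name `intersect_intervals` are assumptions made for illustration; this is not the implementation merged in this PR.

```python
def intersect_intervals(a, b):
    """Intersect two sorted lists of disjoint inclusive (lo, hi) codepoint ranges
    with a single two-pointer pass."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever range ends first; the other may overlap later ranges.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


ascii_letters = [(ord("A"), ord("Z")), (ord("a"), ord("z"))]
alphabet = [(ord("0"), ord("9")), (ord("X"), ord("c"))]
print(intersect_intervals(ascii_letters, alphabet))  # [(88, 90), (97, 99)]
```

The pass is linear in the number of ranges, whereas intersecting the expanded sets of individual characters would scale with the number of codepoints.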

@Zac-HD Zac-HD merged commit 2a93b1e into HypothesisWorks:master Sep 5, 2023
46 checks passed
@reaganjlee reaganjlee deleted the text_filter_strat branch September 5, 2023 17:32