Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve parsing performance #24

Merged
merged 3 commits into from
Jun 24, 2024
Merged

Improve parsing performance #24

merged 3 commits into from
Jun 24, 2024

Conversation

unlessgames
Copy link
Collaborator

Fixes #23

I was too lazy to benchmark and measure precisely and the pain points were fairly obvious after some quick testing. While the parser got a bit more ugly, but it's also way faster.

Added a stress test line here, this was hanging for a long-time before, now it doesn't.

I'm a bit bummed by the fact that utilizing the pest grammar to do the heavy-lifting leads to this much performance penalty. Maybe in the future I'll rewrite the entire parser using chumsky or combine...

@unlessgames
Copy link
Collaborator Author

I removed some is_err from bindings/cycle.rs that were checking empty cycles, an empty cycle will just lead to a single rest, I think this is fair as we already parse [] as a rest, which I think is better than crashing for that.

@unlessgames unlessgames requested a review from emuell June 23, 2024 22:06
@emuell
Copy link
Owner

emuell commented Jun 24, 2024

That's a huge improvement. Unfortunately the #[bench] feature still is unstable, so I've created some quick and dirty benches with https://github.com/bluss/bencher

master

running 2 tests
test nested_groups ... bench: 2,577,790 ns/iter (+/- 205,782)
test parser ... bench: 46,256,770 ns/iter (+/- 5,336,864)

perf/cycle-nest

running 2 tests
test nested_groups ... bench: 21,611 ns/iter (+/- 1,390)
test parser ... bench: 120,066 ns/iter (+/- 16,314)

See See https://github.com/emuell/afseq/tree/perf/benches

Would not hurt to have a few other benches in there, as all the stuff is very time critical.
Feel free to change or add some tests in that branch, if you want.

The benches can be run with cargo bench


I'm a bit bummed by the fact that utilizing the pest grammar to do the heavy-lifting leads to this much performance penalty. Maybe in the future I'll rewrite the entire parser using chumsky or combine...

It's also very surprising to me. As long as there are workarounds like this, this is manageable. The other Rust PEG parsers that I reviewed back then are definitely more difficult to use. I guess you can't have everything here...

@unlessgames
Copy link
Collaborator Author

This benchmarking looks neat, it's good to see how much speed we gained. It could be nice to have something like it in a github action that would run a comparison benchmark for master and the branch to be merged and log it along with test results.

The other Rust PEG parsers that I reviewed back then are definitely more difficult to use.

Chumsky seems alright (getting to v1.0 now). It's not as clean as the grammar file, but the parser could skip the entire intermediate "rule pairs" form and just generate the Step struct right away, this reduces complexity in a way (at the cost of readability) and probably gains even more speed. But for now, this is clearly good enough.

@emuell
Copy link
Owner

emuell commented Jun 24, 2024

But for now, this is clearly good enough.

Yup.

@unlessgames unlessgames merged commit b3295c3 into master Jun 24, 2024
1 check passed
@unlessgames unlessgames deleted the perf/cycle-nest branch June 24, 2024 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parsing nested groups
2 participants