-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: adjust inefficient regular expression #336
Conversation
Fix a potential performance cliff in pathological case. Matching large numbers of repetitions of `[` can take a minute or more in the current code. This change gets it back down into the milliseconds as expected.
For what is worth I suggest adding https://lgtm.com/projects/g/mrmlnc/fast-glob?mode=list or CodeQL (this is the successor of LGTM). |
I use LGTM and my plan was to fix those other two after this one lands. I figured changing one regular expression at a time is easier for people to review and feel comfortable landing. Changing a whole bunch at once can be challenging to review/test sometimes. Aside from the tools you mention, @meekdenzo is working (when he has time) on creating a GitHub Action that uses https://github.com/davisjam/vuln-regex-detector to detect these things in pull requests. Adding that as part of a GitHub Action test suite might be more effective than CodeQL. I have CodeQL on many repositories and it has never flagged anything (or if it does flag stuff, it does it in a way I don't notice). |
CodeQL works for me so far, but I will fork this repo to confirm later.
And I agree with landing one change at a time. I just wanted to mention a
couple of tools that can catch these issues for the future.
…On Wed, Dec 29, 2021, 19:01 Rich Trott ***@***.***> wrote:
For what is worth I suggest adding
https://lgtm.com/projects/g/mrmlnc/fast-glob?mode=list or CodeQL (this is
the successor of LGTM).
I use LGTM and my plan was to fix those other two after this one lands. I
figured changing one regular expression at a time is easier for people to
review and feel comfortable landing. Changing a whole bunch at once can be
challenging to review/test sometimes.
Aside from the tools you mention, @meekdenzo
<https://github.com/meekdenzo> is working (when he has time) on creating
a GitHub Action that uses https://github.com/davisjam/vuln-regex-detector
to detect these things in pull requests. Adding that as part of a GitHub
Action test suite might be more effective than CodeQL. I have CodeQL on
many repositories and it has never flagged anything (or if it does flag
stuff, it does it in a way I don't notice).
—
Reply to this email directly, view it on GitHub
<#336 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACVLNPUHZH5LR2L3MOSRZLUTM5HLANCNFSM5KJEZF5Q>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
All right, I just tested this quickly and it correctly flags the same stuff LGTM does (which was expected): I'd say it's definitely worth adding CodeQL later to the repo. EDIT: I opened #338 |
@@ -9,7 +9,7 @@ const GLOBSTAR = '**'; | |||
const ESCAPE_SYMBOL = '\\'; | |||
|
|||
const COMMON_GLOB_SYMBOLS_RE = /[*?]|^!/; | |||
const REGEX_CHARACTER_CLASS_SYMBOLS_RE = /\[.*]/; | |||
const REGEX_CHARACTER_CLASS_SYMBOLS_RE = /\[[^[]*]/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to test is to make sure this handles [
inside of brackets as expected. If provided '[[]'
, the current regex matches '[[]'
(the whole string), but the regex proposed here only matches the last two characters ('[]'
). I don't know if one or the other would be considered a bug, or if '[[]'
is malformed and the behavior there is undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main goal of this regex is to say that the pattern contains a character classes ([1-5]
, [:alnum:]
, …). Based on this statement, I think it is enough here to check that there is something enclosed in square brackets in the pattern. We don't need to check that the pattern is correct (syntax and nesting checking). Pattern validation is a separate layer delegated to the micromatch
package. In this case, your proposed solution is correct.
Looks good for me. Thanks for the contributing 🎉 |
It probably makes sense to try to fix the other issues too and
|
I don't think that this is a real security issue and that it needs to be backported to the previous major version. |
What is the purpose of this pull request?
Fix a potential performance cliff in pathological case.
What changes did you make? (Give an overview)
Matching large numbers of repetitions of
[
can take a minute or more in thecurrent code. This change gets it back down into the milliseconds as
expected.
Instead of matching "open bracket followed by whatever and then a close bracket", this change has the regexp ignore repeated open brackets to avoid polynomial backtracking.