Skip to content

Commit

Permalink
Fix potential ReDoS (#37)
Browse files Browse the repository at this point in the history
  • Loading branch information
yetingli authored Sep 10, 2021
1 parent c1b5e45 commit 8d1d7cd
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion index.js
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
export default function ansiRegex({onlyFirst = false} = {}) {
const pattern = [
'[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:[a-zA-Z\\d]*(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\\u0007)',
'[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*|[a-zA-Z\\d]+(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\\u0007)',
'(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PR-TZcf-ntqry=><~]))'
].join('|');

Expand Down

13 comments on commit 8d1d7cd

@BeigeBox
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which versions of ansi-regex are impacted by this vulnerability? Several scanners are flagging all versions of this library other than 5.0.1 and 6.0.1. I suspect some of the older versions aren't impacted, because the pattern variable is different (just spot checked I think v2.1.1 and v3).

@Qix-
Copy link
Member

@Qix- Qix- commented on 8d1d7cd Sep 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure because the security researchers are not doing their due diligence and are instead farming bounty money on e.g. huntr.dev, targeting the most used repositories for clout.

I'm really not sure what to do about this and it's eating a lot of time away. The vulnerability itself is not major - unless you're allowing long AND unsanitized user input to hit the API directly, the vulnerability doesn't affect you.

I have no idea which versions it actually affects because they didn't test it correctly.

@thoger
Copy link

@thoger thoger commented on 8d1d7cd Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed 4.1.0 and 3.0.0 as affected testing using the provided reproducer. 2.1.1 does not reproduce the issue.

3.0.0 is the first affected, as that's the first version that includes 69bebf6 that the problematic part of the regex.

@Qix-
Copy link
Member

@Qix- Qix- commented on 8d1d7cd Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @thoger. 🙂

@arungpro
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thoger Does 5.0.0 or above, are not affected this? Can you confirm on the same?

@Qix-
Copy link
Member

@Qix- Qix- commented on 8d1d7cd Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arungpro 5.0.0 is affected, 5.0.1 is not. Likewise, 6.0.0 is affected, 6.0.1 is not.

@OnlineCop
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would someone be able to help me understand the exact formatting going on here?

I assume that the \ backslashes are escaped because this is within a string, meaning \u001B must be represented as \\u001B ?

I copied the "proposed" escaped original into regex101 and removed the escape characters to produce this:

[\u001B\u009B][[\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007)

I then plugged that unescaped version back in to determine whether there were any problems with the pattern itself, and see that there is an incomplete/unclosed (?: which should probably be removed:

[\u001B\u009B][[\]()#;?]*(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007

The next outer (?: ... ) group appears to have no alternations or quantifiers, so can probably also be removed:

(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007)

The next (?: ... | ... )? group does contain an alternation, and is qualified by its ? quantifier. Immediately after this optional group, it looks for \u0007 at the end.

Within both alternations, the character groups [-a-zA-Z\d\/#&.:=?%@~_] can be shortened: [a-zA-Z0-9_] or [a-zA-Z\d_] can be replaced with \w, so: [-\w\/#&.:=?%@~].

That would look like this:

[\u001B\u009B][[\]()#;?]*(?:(?:;[-\w\/#&.:=?%@~]+)*|[a-zA-Z\d]+(?:;[-\w\/#&.:=?%@~]*)*)?\u0007

So, within the first alternation, it matches a semicolon ; followed by one or more of those character-group characters. Within the second alternation, it first looks for letters or numbers, followed by any number of semicolon ; followed by zero or more of those character-group characters. I'm not sure whether that would want to be a + one-or-more, since that would allow "abc;;;;;;;;;" to match there instead of requiring characters between those semicolons.

If you look at this example, I've changed the outer non-capturing (?: ... )? to a capturing group just for illustrative purposes so you can see how the semicolon is matched in one case but not in the other due to this zero or more quantifiers. You can see this when the semicolon is added to the text in this next example.

If both ;[-\w\/#&.:=?%@~] character groups are quantified with zero-or-more *, then subsequent semicolons ;; will match, while if they are quantified with one-or-more +, the match fails due to subsequent semicolons ;;.

@lorand-horvath
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ansi-regex v3 and v4 should now include the vulnerability fixes:
3.0.1 in 419250f and c57d4c2
4.1.1 in 75a657d

@romans-ovo
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ansi-regex v3 and v4 should now include the vulnerability fixes: 3.0.1 in 419250f and c57d4c2 4.1.1 in 75a657d

Neither 3.0.1, nor 4.1.1 are listed on https://www.npmjs.com/package/ansi-regex (under Versions), btw.

@Qix-
Copy link
Member

@Qix- Qix- commented on 8d1d7cd Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

You sure?

@papb
Copy link

@papb papb commented on 8d1d7cd Aug 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would someone be able to help me understand the exact formatting going on here?

@OnlineCop can you first help me understand what exactly is your question here? Are you trying to understand what the code is doing? Are you proposing a readability improvement? Or performance improvement? Sorry, you said a lot of things but I couldn't even understand what is your goal, roughly...

I assume that the \ backslashes are escaped because this is within a string, meaning \u001B must be represented as \\u001B ?

Yes, exactly.

@OnlineCop
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the original, lines 2-5 were looking for:

  1. either \u001B (ESC) or \u009B (¢ cent sign) followed by 0 or more of these characters: []()#;?

After that, it was looking for either:
2. any number of letters, numbers, semi-colons, or special characters that came before \u0007 (BEL)
- There were a lot of 0-or-more groups, all of which was also within its own (?:...)? 0-or-1 (optional) group, before that BEL.
- It was definitely something that could cause ReDoS.
3. or an optional group of digits and semi-colons, followed by a single character in the range [\dA-PR-TZcf-ntqry=><~].
- This optional group also had potential for a ReDoS.

In the revised version, lines 2-5 were looking for the same characters as 1. above, but increased the variety of characters allowed before \u0007 and still has a lot of potential for ReDoS (possibly more than before).

I don't understand what exactly is supposed to be matched here. Both BEFORE and AFTER versions have a high probability to cause catastrophic backtracking. Are there any tests that should match either of the BEFORE or AFTER regexes?

@Qix-
Copy link
Member

@Qix- Qix- commented on 8d1d7cd Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to create a reproduction case that exploits exponential time complexity and file an issue. Otherwise I'm not sure what you're trying to say.

Please sign in to comment.