-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix parsing url(…)
with special characters such as ;
or {}
#14879
Conversation
c37dd4e
to
0018d99
Compare
@@ -42,6 +42,7 @@ export function parse(input: string) { | |||
|
|||
let buffer = '' | |||
let closingBracketStack = '' | |||
let parenStack = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the closingBracketStack
for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this is a number but should be a string if for some reason we can't use closingBracketStack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I did originally, but you would run into issues relatively quickly. When you are parsing CSS like this:
.foo {
background: url(data:image/png;base64,abc==);
}
The moment you reach the ;
character, then the closingBracketStack
looks like this: {(
where {
is coming from the selector .foo {
, and (
is coming from the url(
function. Meaning that if the last character is (
we know that we are not top-level anymore, and we can safely ignore the ;
.
However, if we are parsing CSS like this:
.foo {
background: url(https://example-image-search.org?q={query;limit=5}&ids=[1,2,3]);
}
If we now reach the ;
, then the closingBracketStack
looks like this: {({
, where the first {
, comes from the selector .foo {
, the (
comes from the url(
function and the last {
comes from the query string.
In this case we can't just look at the last character in the closingBracketStack
, but have to check if any (
exists to know that we can safely ignore the ;
. This means that our O(1) lookup turns into an O(n) lookup. Sure, it's a small n, but still not ideal.
A second issue is that we validate that the closingBracketStack
is empty at the end of the parsing. Which ensures that every opening bracket has a closing bracket and thus properly balanced.
However, brackets inside the url, don't need to be balanced. This is valid CSS:
.foo {
background: url(https://example-image-search.org?curlies=}});
}
As you can see, the request is actually made with the curlies in the query param.
If we use the closingBracketStack
, in this case it would look like this:
{
— from the selector.foo {
{(
— from the start of the declaration valueurl(
{(}
— fromcurlies=}
, this would also error because improperly balanced. But let's assume it
didn't error.{(}}
— fromcurlies=}}
.{(}})
— from closing theurl)
, but the closing bracket is not balanced and in a strange
state.{(}})}
— from closing the.foo {
block}
.
Because we don't care about balancing curlies, and because we don't want to use O(n) instead of O(1) lookups, I moved it to a separate variable.
@thecrypticace, I picked a number because this value is only concerned with parentheses, and not with any other type of brackets. Since we don't care about the other types of brackets sandwiched between the parentheses (e.g.: {(})
), a number is a bit nicer to work with because we can just increment/decrement the value.
Eventually we do want the url(…)
to be closed though, so properly closing the parentheses is still important, but this basically means that the number should be 0
at the end, which we already account for.
If we don't like this solution, and still prefer to use the closingBracketStack
, we could also apply this rule instead:
If we see (
, ignore every other special character like ;
or }
until we see an )
, because that is essentially exactly what we are trying to achieve. If we then encounter an {
or }
, we don't even touch the closingBracketStack
at all.
It does mean that we have to work with strings instead of numbers a bit more, but that's fine I think? It does reduce the amount of concepts in the parser which is a benefit.
Diff would then look like this:
diff --git a/packages/tailwindcss/src/css-parser.ts b/packages/tailwindcss/src/css-parser.ts
index b91011e0c..3b6e283e5 100644
--- a/packages/tailwindcss/src/css-parser.ts
+++ b/packages/tailwindcss/src/css-parser.ts
@@ -42,7 +42,6 @@ export function parse(input: string) {
let buffer = ''
let closingBracketStack = ''
- let parenStack = 0
let peekChar
@@ -332,7 +331,10 @@ export function parse(input: string) {
// }
// ```
//
- else if (currentChar === SEMICOLON && parenStack <= 0) {
+ else if (
+ currentChar === SEMICOLON &&
+ closingBracketStack[closingBracketStack.length - 1] !== ')'
+ ) {
let declaration = parseDeclaration(buffer)
if (parent) {
parent.nodes.push(declaration)
@@ -344,7 +346,10 @@ export function parse(input: string) {
}
// Start of a block.
- else if (currentChar === OPEN_CURLY && parenStack <= 0) {
+ else if (
+ currentChar === OPEN_CURLY &&
+ closingBracketStack[closingBracketStack.length - 1] !== ')'
+ ) {
closingBracketStack += '}'
// At this point `buffer` should resemble a selector or an at-rule.
@@ -369,7 +374,10 @@ export function parse(input: string) {
}
// End of a block.
- else if (currentChar === CLOSE_CURLY && parenStack <= 0) {
+ else if (
+ currentChar === CLOSE_CURLY &&
+ closingBracketStack[closingBracketStack.length - 1] !== ')'
+ ) {
if (closingBracketStack === '') {
throw new Error('Missing opening {')
}
@@ -459,17 +467,17 @@ export function parse(input: string) {
// `(`
else if (currentChar === OPEN_PAREN) {
- parenStack += 1
+ closingBracketStack += ')'
buffer += '('
}
// `)`
else if (currentChar === CLOSE_PAREN) {
- if (parenStack <= 0) {
+ if (closingBracketStack[closingBracketStack.length - 1] !== ')') {
throw new Error('Missing opening (')
}
- parenStack -= 1
+ closingBracketStack = closingBracketStack.slice(0, -1)
buffer += ')'
}
Note: I know the closingBracketStack
actually holds the inverse of the actual brackets, but it's a bit easier to reason about it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I SWEAR when I read parenStack += …
I saw a (
. My brain is fried. Carry on…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, no worries!
Implemented that last change here: 9177c68
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed info. I do think the diff you ended up and the concept of "("
will not cause any other stack to be created until it's closed again seems good and it feels like the diff is lighter this way. Awesome!
When parsing the `url(…)`, then quotes to create a string are not required. If you use special characters inside the `url(…)` such as `;` then we should not end the declaration. Likewise, when the `url(…)` contains `{` or `}`, then we don't want to create a new block (for nesting) either.
Some characters, such as `{`, `}`, and `;` are allowed inside CSS functions, e.g.: `url(https://example?q={query;limit=5})`. Notice that the example is not in quotes, and this is a valid URL. If we don't handle these characters when inside of `(` and `)`, it means that the `{` would start a nested block, `}` would end the block, and `;` would end the current declaration. Instead, if we are in parens, we do allow the values.
We still apply the same rules where we don't really care about special characters such as `{`, `}` or `;` when we are inside of `(` and `)`. We can safely look at the last character in the `closingBracketStack` because when we are in parentheses, we don't update the `closingBracketStack` until we see a closing `)`.
0018d99
to
9177c68
Compare
Co-authored-by: Philipp Spiess <[email protected]>
This PR fixes an issue where some special characters (with an actual meaning CSS) were used inside of the
url(…)
function, would result in incorrectly parsed CSS.For example, when we encounter a
{
, then we would start a new "block" for nesting purposes. If we encounter an}
, then the block would end. If we encounter a;
, then that would be the end of a declaration.All of that is true, unless we are in a
url(…)
function. In that case, we should ignore all of those characters and treat them as part of the URL.This is only an issue because:
url(…)
without quotes. With quotes, this would not be an issue.