single quotes within double quoted brace sub treated differently for the # ## % %% / operators #291
Comments
Here's a script that shows they behave similarly in bash:
|
OK that makes more sense. Now I see what you meant by Most of your test cases come down to a disagreement between OSH and bash about what single quotes mean within a double quoted var sub. I am having a hard time reconciling most of the bash behavior you observed with this behavior of bash:
Basically I treat single quotes as LITERALS inside double quotes for that reason. In OSH, this affects the meaning of:
Which are the cases you filed bugs about. Do you know of any intuitive explanation for the difference between the behaviors? In my mind if |
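For readers following along, here is a minimal sketch of the rule described in this comment. It is not the snippet from the original comment, and the variable names are made up for illustration. Inside double quotes, bash keeps single quotes as literal characters, both in a plain string and in the word after the `:-` operator.

```bash
#!/usr/bin/env bash
# Illustration only: `plain`, `unset_var`, and `defaulted` are hypothetical names.

# Single quotes inside a double-quoted string are ordinary characters.
plain="'a'"

# The same holds for the default word of :- inside a double-quoted brace sub.
unset unset_var
defaulted="${unset_var:-'a'}"

echo "$plain"       # prints 'a' (quotes included)
echo "$defaulted"   # prints 'a' (quotes included)
```

Both values are three characters long, which matches the "string of 3 characters" observation about the parsed word.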
FWIW running
You will get something like this. In the first case the arg_word has a string of 3 characters. In the second case the arg word has a SQ token of one character.
|
Whoops, I didn't delete my comment in time. I remembered that there is a difference with the quotes but forgot exactly what. |
Ah, I think that in my use case it was interpreting the |
OSH was giving the syntax error because it was passing However that's not what bash does AFAICT. Bash appears to evaluate the In my model of the world, bash is being inconsistent by treating single quotes in #1 differently than #2 and #3.
Two possibilities:
Certainly I could parse single quotes within double quotes differently depending on whether the operator is I would have thought that the quoting rules are the same regardless of what the operator is. I think maybe the underlying reason is that OSH processes the operator and the quotes in a single pass, "at the same time". But bash and mksh process the operator first, and then the inner quotes as a separate step. And the second step can use knowledge of the operator to perform an additional layer of dequoting (or not). This seems dumb to me but it's my best guess as to what's going on. (And I would expect that this is explained nowhere in the manual for any shell, or in the POSIX spec ...) |
Oh, I meant that Bash interpreted it as a range expression for my use-case when unquoted, which is why I quoted it.
The latter two are evaluated by pattern matching, not regex, which is what handles the quotes.
It is documented in both the Bash manual and the POSIX spec, and pattern matching is what is standardly used in shells; the only application of regex I know of in POSIX-compatible shells is Bash's |
Where is it documented that single quotes are literals in |
Hm I guess you removed this part from your message (because Github sent me a mail earlier), but it doesn't explain the difference between the two constructs:
This would only explain the behavior outside of double quotes anyway. Once you're inside double quotes, you can argue that single quotes ARE quoted. |
Anyway, my goal is not to have a debate about POSIX... my goal is to come up with some reasonable semantics for shell and then get rid of it :) In other words, we should be able to write a manual with a "straight face", while still running existing shell scripts. It sounds like you are not particularly attached to Ideally OSH would have a parse error, but the runtime error is reasonably informative. I think I should maintain a list of "unresolved cases", which would include this and the If there are trivial patches to a script that "improve" it, then I'm inclined to keep OSH simple. But someone could come along later with a widely used script that relies on the subtle behavior, and it might be worth changing OSH. What do you think? |
I think you're thinking about this the wrong way; don't think of quoting the parameter for it as actual quoting for its contents, but rather making it behave like a keyword, making See, globs and backslashes are also literal when quoted, as in keywords, but it evaluates it itself, which is what also handles the quotes. Or think about it in my original use:
|
Hm I'm having a very hard time following what you're saying. What do you even mean by "keyword"? Some examples would help. Another thing that would help is if you can address the question above:
|
Also, dash and busybox ash agree with OSH on the "${var#'a'}" case. They do NOT turn So are dash and busybox ash not POSIX compliant? No, POSIX simply doesn't say what the correct behavior is. I care more about what "real" shell scripts rely on than POSIX... but you claimed that this behavior was documented somewhere, which seems false. |
Shell keywords, like
Does my most recent quote not document that? If you're not satisfied with that, they're documented to use pattern matching, where quotes are special. |
Wait, what? I tried it on |
OK let me check in a test case so we can agree on the observations. I don't think you're understanding my question. Explain how the quote you gave addresses the difference between
and
That's not the issue I'm talking about. |
The quote only applies to
Single-quotes are quotes, too...
|
I understand that globs are not expanded for The issue is whether the single quotes are processed by the shell, or not processed and sent through to the string matching engine. |
From what I understand, they are not processed, and the string matching engine is what removes them. I brought up those two because I think the same thing happens there, and IIRC OSH does handle single-quotes properly in them. |
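One way to observe the behavior under discussion in bash, as a small sketch with made-up values: a glob character that is single-quoted inside the pattern is matched literally, even though the whole expansion sits inside double quotes. Whether the shell or the matching engine removes the quotes cannot be told apart from this output alone, so the example only shows the visible result.

```bash
#!/usr/bin/env bash
# Illustration only: the value '*abc' and the variable names are hypothetical.
s='*abc'

# Unquoted *: a glob whose shortest prefix match is the empty string,
# so nothing is removed.
glob_star="${s#*}"

# Single-quoted * inside the double-quoted brace sub: bash matches a literal
# star and strips it.
literal_star="${s#'*'}"

echo "$glob_star"      # prints *abc
echo "$literal_star"   # prints abc
```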
I checked in repros for the issues we're discussing here so we can be on the same page: http://www.oilshell.org/git-branch/dev/andy-11/a21a1b80/spec/bugs.html
Now I see that bash, mksh, and zsh all AGREE that single quotes are processed by the shell in this case. So that's very strong evidence, so we should probably change it. dash and ash disagree in both cases #4 and #5. They strip the string However I do think dash and ash are less "important". The versions I'm using are perhaps a bit old: http://www.oilshell.org/blob/spec-bin/ If newer versions of dash and ash adopt the bash/mksh/zsh behavior, then that would be very strong evidence to change it. Regarding POSIX, I see what you're saying -- single quotes are quotes -- but where does it document this behavior?
In the second case, the single quotes are NOT processed by the shell. My claim is that this behavior is inconsistent with the above behavior of The best rule I can come up with is that the Anyway, like I said, I care more about what shells do and what shell scripts rely on rather than POSIX, but that is why I'm not really satisfied by the quote. But we don't have to resolve this debate to move forward. |
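The contrast described here can be sketched side by side. This assumes bash (the thread reports that dash and ash behaved differently at the time), and the variable names and values are hypothetical rather than taken from the spec tests.

```bash
#!/usr/bin/env bash
# Illustration only: all names and values below are hypothetical.
s=abcabc

# Pattern operators: the single quotes act as quotes and are removed
# before matching.
prefix_stripped="${s#'a'}"   # bcabc
suffix_stripped="${s%'c'}"   # abcab
all_replaced="${s//'b'/}"    # acac

# Default-value operator: the single quotes stay as literal characters.
unset u
defaulted="${u:-'a'}"        # 'a' (three characters)

printf '%s\n' "$prefix_stripped" "$suffix_stripped" "$all_replaced" "$defaulted"
```

Nothing in the syntax distinguishes the two groups; only the operator does.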
I'll double-check
That's kind of it; the single-quotes are literal, as they should be since it's quoted, but
And as I mentioned earlier, POSIX mentions it again specifically for
|
The issue is that we were computing s[:-0] when the suffix is empty. Added an explicit check to fix it. Addresses issue #291. I also moved the spec tests around. This bug was reported by Crestwave as the "nested strip" case in spec/var-op-strip.
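For context, in Python `s[:-0]` is the same slice as `s[:0]`, which is the empty string, so stripping an empty suffix wiped out the whole value. The expected shell behavior is that stripping an empty suffix leaves the string unchanged. The sketch below shows that expectation. The "nested strip" shape used here (the first-character idiom) is an assumption about what such a case looks like and is not copied from spec/var-op-strip.

```bash
#!/usr/bin/env bash
# Illustration only: names and values are hypothetical.
s=abc
empty=

# Stripping an empty suffix must leave the value untouched.
unchanged="${s%$empty}"

# Nested strip: remove "everything after the first character" as a suffix.
first_of_s="${s%"${s#?}"}"

# With a one-character string the inner expansion is empty, which is exactly
# the empty-suffix situation.
one=a
first_of_one="${one%"${one#?}"}"

printf '%s\n' "$unchanged" "$first_of_s" "$first_of_one"   # abc, a, a
```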
I just fixed the bug you mentioned on the other thread with
But it actually just boiled down to As for the single quote issue, I think I'll end up changing it at some point in the future. Let me see if I can run your BF interpreter first :) It sounded like that change was not essential to run it. |
That just tries to remove two single-quotes because of this issue. Did you mean
The one written in |
There was a bug whenever the suffix to be stripped by Yes please file a bug about Thanks for the confirmation. I noticed that ash has been copying bash features, so they are probably also fixing corner cases to be more like bash. |
Isn't POSIX compliance also one of their goals? |
@wertercatt hit this in #161 with:
|
I'm not sure how that is related to this issue, though? Actually, that seems to be because they ran it as |
This is a weird exception for the # ## % %% and / operators. Fixes issue #291.
Finally fixed this! It wasn't that hard to fix, but it is a weird rule ... there is no syntactic indication of the difference between |
Forking from #290.
This is indeed a difference between OSH and bash/mksh. It has to do with whether the single quote and hyphen are SHELL metacharacters or REGEX metacharacters.
+ and - ?
+ through ' ? (which is a regex syntax error)
@Crestwave can you tell me why you have the single quotes? What are they supposed to do? How is it different than
This removes all + and - characters from the string in bash. I can't see what the extra single quotes do?
I'm not saying this is not a bug -- it's definitely a difference between bash and OSH. But I'm trying to understand the intention behind the code. Thanks.
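As a hedged illustration of the unquoted form mentioned above (the original snippet is not preserved here, so the string and variable names are made up): in bash, a bracket expression whose hyphen comes last treats the hyphen literally, so this removes every `+` and `-`.

```bash
#!/usr/bin/env bash
# Illustration only: the input string and names are hypothetical.
s='a+b-c'

# [+-] matches a plus or a hyphen; the trailing hyphen is literal, not a range.
cleaned="${s//[+-]/}"

echo "$cleaned"   # prints abc
```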